riscv-non-isa / riscv-iommu

RISC-V IOMMU Specification
https://jira.riscv.org/browse/RVG-55
Creative Commons Attribution 4.0 International

Question about ATS invalidation timeout generation #310

Closed viktoryou closed 6 months ago

viktoryou commented 7 months ago

From the spec,

The ATS.INVAL command completes when an “Invalidation Completion” response message is received from the device or a protocol-defined timeout occurs while waiting for a response.

How should the protocol-defined timeout be understood? Should the IOMMU set a timer and report a timeout when the device doesn't respond within a self-defined time limit? If so, what happens if the device eventually responds to the same invalidation after the timeout has been detected?

Or would a timeout message be sent to the IOMMU by some protocol? For example, the error bit in DTI_ATS_INV_COMP indicates that the invalidation could not be completed. However, I could not find any further description of a timeout error. I am wondering whether the device is responsible for maintaining a timer while the invalidation is in progress.

Which is the right way to generate the ATS invalidation timeout indication?

ved-rivos commented 7 months ago

How should the protocol-defined timeout be understood?

The timeout is specified by the PCIe ATS protocol.

Should the IOMMU set a timer and report a timeout when the device doesn't respond within a self-defined time limit? If so, what happens if the device eventually responds to the same invalidation after the timeout has been detected?

The handling of an invalidation completion for an ITag that has no outstanding Invalidation Request is implementation specific - e.g. to treat it as an Unexpected Completion (UC).

Or would a timeout message be sent to the IOMMU by some protocol? For example, the error bit in DTI_ATS_INV_COMP indicates that the invalidation could not be completed. However, I could not find any further description of a timeout error. I am wondering whether the device is responsible for maintaining a timer while the invalidation is in progress.

The device is not responsible for starting a timer. A timeout usually occurs because the device and/or the link to the device has become non-functional and a timeout may be indicative of the need to perform recovery actions such as re-establishing the link or resetting the device to restore operation.

Logically, the IOMMU starts a timer for the expected completion. Whether the timer is actually implemented in the IOMMU or in the Root Port (RP) is implementation specific. When a device uses multiple traffic classes, it may respond to a single invalidation request with multiple completions, one per traffic class. Logically, the IOMMU considers an invalidation request completed when all completions have been received. Whether these multiple completions are tracked in the RP or in the IOMMU to determine when all necessary completions have been received is also implementation specific.
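The logical behaviour described above can be sketched as follows. This is only an illustration and is not taken from the specification: the class and method names, the timeout value, and the way the expected number of completions is supplied are all hypothetical, and reporting a late completion as "unexpected" is just one possible implementation-specific choice.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class PendingInval:
    """One outstanding Invalidation Request, keyed by its ITag."""
    expected: int    # completions still owed (assumed: one per traffic class)
    deadline: float  # time at which the request is declared timed out


class InvalTracker:
    """Hypothetical tracker; it could live in the IOMMU or in the Root Port."""

    def __init__(self, timeout: float):
        self.timeout = timeout  # assumed value, stands in for the protocol-defined timeout
        self.pending: Dict[int, PendingInval] = {}

    def send_request(self, itag: int, num_completions: int, now: float) -> None:
        # Start the logical timer when the request is issued.
        self.pending[itag] = PendingInval(num_completions, now + self.timeout)

    def on_completion(self, itag: int) -> str:
        req = self.pending.get(itag)
        if req is None:
            # No outstanding request for this ITag, e.g. a completion that
            # arrives after the timeout; handling is implementation specific.
            return "unexpected"
        req.expected -= 1
        if req.expected == 0:
            # All completions received: the request is complete.
            del self.pending[itag]
            return "done"
        return "waiting"

    def expire(self, now: float) -> List[int]:
        # Retire every request whose deadline has passed and report its ITag.
        timed_out = [t for t, r in self.pending.items() if now >= r.deadline]
        for t in timed_out:
            del self.pending[t]
        return timed_out
```

With this sketch, a request expecting two completions reports "waiting" after the first and "done" after the second, while a completion that arrives after `expire` has retired its ITag is reported as "unexpected".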

viktoryou commented 6 months ago

That's clear. Thanks.

yanhe234 commented 1 month ago

Hello, what do the G and S fields of payld in ATS.INVAL mean, respectively? Global? Page size > 4 KB?

ved-rivos commented 1 month ago

Their definition is provided by the PCIe specification.

G: The Global Invalidate bit indicates that the Invalidation Request Message affects all PASID values - see section 10.3.8 of PCIe 6.0 specification for more details.

S: The S field is used to indicate if the range being invalidated is greater than 4096 bytes - see section 10.2.3.1 and 10.2.3.2 of the PCIe 6.0 specification for more details.
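As a rough illustration of how the S field can describe a range larger than 4096 bytes, the sketch below assumes the ATS range encoding in which, when S is 1, the lowest address bit at or above bit 12 that is 0 marks the size: bit 12 clear means 8 KiB, bit 12 set and bit 13 clear means 16 KiB, and so on. The function name and parameters are hypothetical, and the encoding should be checked against the PCIe sections cited above before relying on it.

```python
def inval_range_bytes(address: int, s_bit: int) -> int:
    """Size in bytes of the range named by an invalidation address and S bit.

    Assumed encoding: with S = 1, the position of the lowest 0 bit at or
    above bit 12 selects the size; with S = 0 the range is one 4096-byte page.
    """
    if not s_bit:
        return 4096
    bit = 12
    # Skip the run of 1 bits starting at bit 12; stop at bit 63 so the
    # largest result is the whole 64-bit address space.
    while bit < 63 and (address >> bit) & 1:
        bit += 1
    return 1 << (bit + 1)
```

For example, under this assumed encoding an address with bits 19:12 all set and bit 20 clear, combined with S = 1, names a 2 MiB range.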

yanhe234 commented 1 month ago

Very clear, thank you for your reply!