Multiple IMSICs with multiple VS interrupt files for a hart may bring huge resource overhead

xlxxxxxl commented 1 day ago

Background: This topic has been discussed in the smmtt community: https://lists.riscv.org/g/tech-smmtt/topic/multiple_imsic_for_a_hart/106345304 As Vedvyas Shanbhogue wrote in message #91:

Many implementation choices exist such as physically instantiating multiple IMSICs, shared IMSIC logic with separate interrupt files.

Currently, the smmtt spec describes only the first implementation, which physically instantiates multiple IMSICs. This implementation may cause a problem: the RISC-V Server SoC specification requires that "IMSIC MUST support at least 5 VS-mode interrupt files".

One IMSIC implements at least 5 VS interrupt files; 2 supervisor domains (2 IMSICs) need 10 VS interrupt files; and 4 supervisor domains (4 IMSICs) need 20. With more SDs, the number of VS files instantiated per hart grows further. A server SoC also has a large number of harts, so the total logic and area overhead of this approach would be excessive.
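The scaling argument above can be written out as a short calculation. This is only a sketch: the 5-file minimum comes from the Server SoC requirement quoted above, while the function names and the 64-hart SoC used in the usage note are hypothetical and chosen purely for illustration.

```python
# Minimum VS-mode interrupt files per IMSIC, from the Server SoC requirement
# quoted above.
MIN_VS_FILES_PER_IMSIC = 5


def vs_files_per_hart(num_sds, vs_per_imsic=MIN_VS_FILES_PER_IMSIC):
    """VS files instantiated per hart when every SD gets its own physical IMSIC."""
    return num_sds * vs_per_imsic


def vs_files_per_soc(num_harts, num_sds, vs_per_imsic=MIN_VS_FILES_PER_IMSIC):
    """Total VS files across the SoC (num_harts is a hypothetical parameter)."""
    return num_harts * vs_files_per_hart(num_sds, vs_per_imsic)


def utilization(active_vharts, num_sds, vs_per_imsic=MIN_VS_FILES_PER_IMSIC):
    """Fraction of a hart's VS files in use if only active_vharts run on it."""
    return active_vharts / vs_files_per_hart(num_sds, vs_per_imsic)
```

For instance, `vs_files_per_hart(4)` gives the 20 files mentioned above, and a hypothetical 64-hart SoC with 4 SDs would instantiate `vs_files_per_soc(64, 4)` files in total, of which only a quarter per hart would be busy if 5 virtual harts run there.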

In general, however, only 4 to 5 virtual harts (in VMs or TVMs) run on each physical hart, so the number of virtual harts that a physical hart can carry is very limited. In a server-class SoC with multiple SDs, each physical hart would have many VS interrupt files (one IMSIC instantiated per SD), but only a small number of them would actually be used, wasting a large amount of logic resources and area.

This may be a big problem in real implementations. The second scheme, sharing IMSIC logic with multiple sets of S/VS files for multiple SDs, therefore seems more feasible. In this scheme, the RDSM can configure which VS files are exposed to the active SD and which to inactive SDs. Some VS files can then be shared among multiple SDs, which avoids the excessive resource and area overhead of the first scheme.

Would it be possible to describe the second implementation option in the smmtt spec to guide actual development? It seems that new register functions would need to be introduced.

Please refer to messages 85, 91, 92, 93, 94, and 95 of the discussion linked above.

Please also help correct me if my understanding is wrong in any way, thanks!

SiFiveHolland commented 1 day ago

If your hardware has multiple physical IMSIC instances, there is no requirement that they have the same features, as long as each instance is described accurately in the DT/ACPI. For example, you could have one IMSIC for the primary SD with 5 VS-mode interrupt files, and 3 IMSICs for the other SDs each with 0 or 1 VS-mode interrupt file. Each SD sees its own IMSIC in the DT/ACPI, and does not know anything about the IMSICs of other SDs.

If your hardware can move VS-mode interrupt files between S-mode IMSIC groups, then a static allocation of the VS-mode interrupt files would look just like what I described above. You would use some platform-specific method (e.g. custom CSR) to configure the IMSICs at boot to match what is described in each SD's DT/ACPI, and the RDSM would be otherwise unaware of this configurability. So this could be implemented without any changes to the spec, though there may be some benefit to standardizing this IMSIC configuration interface.

If you want to dynamically allocate VS-mode interrupt files to SDs, then the software/hardware is more complicated. You would pretend that each IMSIC instance has the maximum number of VS-mode interrupt files in the DT/ACPI. Then, you would need to trap writes to hstatus.VGEIN to do the actual allocation, possibly context switching some other SD's existing VS-mode IMSIC context to/from RAM. This would require runtime support from the RDSM, so if we want to support this use case, it would be good to standardize the register interface.
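A toy model may help make the dynamic case concrete. The sketch below is not a proposed interface: the class, its method names, and the least-recently-used eviction policy are all assumptions made for illustration. It only models the bookkeeping an RDSM might do when a trapped write to hstatus.VGEIN needs a physical VS file, including spilling another context to RAM when the pool is exhausted.

```python
class VsFilePool:
    """Toy model: lazily bind (SD, VGEIN) pairs to a small pool of physical VS files."""

    def __init__(self, num_physical):
        self.free = list(range(1, num_physical + 1))  # unbound physical file numbers
        self.bound = {}     # (sd, vgein) -> physical file, oldest binding first
        self.saved = set()  # (sd, vgein) contexts currently spilled to RAM

    def on_vgein_write(self, sd, vgein):
        """Model of the (assumed) trap handler for a write to hstatus.VGEIN."""
        if vgein == 0:
            return None  # VGEIN = 0 selects no guest interrupt file
        key = (sd, vgein)
        if key in self.bound:
            phys = self.bound.pop(key)       # re-inserted below to refresh LRU order
        elif self.free:
            phys = self.free.pop(0)
        else:
            victim = next(iter(self.bound))  # least recently used binding
            phys = self.bound.pop(victim)
            self.saved.add(victim)           # victim's context is spilled to RAM
        self.saved.discard(key)              # restore this context if it was spilled
        self.bound[key] = phys
        return phys
```

With a pool of 2 physical files, a third distinct (SD, VGEIN) pair evicts the oldest binding, which is where the context-switch cost mentioned above would be paid. A real design would also have to decide whether an SD may evict another SD's files at all.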

Do you have any suggestions for what the interface for configuring the IMSIC would look like? Should this be developed in conjunction with the AIA spec?

xlxxxxxl commented 11 hours ago

Thanks for your reply!

Building on your description, let me summarize the implementation ideas so far.

Implementation 1: Multiple physical IMSIC instances, without hardware support for static allocation of the VS interrupt files

If your hardware has multiple physical IMSIC instances, there is no requirement that they have the same features, as long as each instance is described accurately in the DT/ACPI. For example, you could have one IMSIC for the primary SD with 5 VS-mode interrupt files, and 3 IMSICs for the other SDs each with 0 or 1 VS-mode interrupt file. Each SD sees its own IMSIC in the DT/ACPI, and does not know anything about the IMSICs of other SDs.

But two points need to be clarified first.

The first point concerns the quotation below. The smmtt spec v0.2.0 does not specify whether the number of VS interrupt files may differ between physical IMSIC instances. But the RISC-V Server SoC specification states "IMSIC MUST support at least 5 VS-mode interrupt files", which we read as requiring at least 5 VS interrupt files for each active SD. Each physical IMSIC instance should then have at least 5 VS interrupt files, so this point seems to contradict the RISC-V Server SoC spec.

there is no requirement that they have the same features

The second point concerns the quotation below. This requires OS ecosystem support: operating systems would need to add a supervisor-domain dimension to their interrupt management.

Each SD sees its own IMSIC in the DT/ACPI

Please help clarify these two points before we assess the feasibility of this implementation.

Implementation 2: Multiple physical IMSIC instances, with hardware support for one-time static allocation of the VS interrupt files

If your hardware can move VS-mode interrupt files between S-mode IMSIC groups, then a static allocation of the VS-mode interrupt files would look just like what I described above. You would use some platform-specific method (e.g. custom CSR) to configure the IMSICs at boot to match what is described in each SD's DT/ACPI, and the RDSM would be otherwise unaware of this configurability. So this could be implemented without any changes to the spec, though there may be some benefit to standardizing this IMSIC configuration interface.

This implementation still requires the two points above to be resolved first. I also personally think that hardware is unlikely to implement static allocation of VS interrupt files between multiple physical IMSIC instances.

Implementation 3: A single physical IMSIC with multiple sets of S/VS interrupt files, supporting dynamic allocation of S/VS interrupt files to SDs

If you want to dynamically allocate VS-mode interrupt files to SDs, then the software/hardware is more complicated. You would pretend that each IMSIC instance has the maximum number of VS-mode interrupt files in the DT/ACPI. Then, you would need to trap writes to hstatus.VGEIN to do the actual allocation, possibly context switching some other SD's existing VS-mode IMSIC context to/from RAM. This would require runtime support from the RDSM, so if we want to support this use case, it would be good to standardize the register interface.

My understanding here is a little different from yours: use a single physical IMSIC instead of multiple physical IMSIC instances.

I don't have further ideas about the details of dynamic allocation. However, the earlier discussion of this topic in the smmtt community already points in a good direction. Let me quote some key descriptions so that we can continue the discussion from there.

https://lists.riscv.org/g/tech-smmtt/topic/multiple_imsic_for_a_hart/106345304

From Alvin Chang, message #89:

If we arrange "single" IMSIC's S- and VS- interrupt files as:

{ S[1], G[1], G[2], ......, G[i] }, { S[2], G[i+1], G[i+2], ......, G[j] }, ... { S[x], G[y+1], G[y+2], ......, G[z] }

Then msdcfg.SDICN selects one of these groups in the IMSIC. It seems this does not violate AIA, but I notice a constraint in AIA section 3.1:

The number of guest interrupt files an IMSIC has for virtual harts is exactly GEILEN ......

IIUC, it means the maximum G[z] is G[63] in my example, for RV64. That means each { S[1], G[1], G[2], ......, G[i] } group should not have all 63 VS-files in that IMSIC. Otherwise, other supervisor domains won't get any available VS-files.
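The grouping and the GEILEN constraint quoted above can be sketched as follows. This is an illustration only: the function names are hypothetical, and treating msdcfg.SDICN as a zero-based index into the list of groups is an assumption of this sketch, not something the quoted text states.

```python
MAX_GUEST_FILES = 63  # largest possible GEILEN on RV64, per the quote above


def layout_groups(vs_counts):
    """Assign contiguous physical G[] indices (starting at 1) to each S-file group.

    vs_counts[k] is the number of VS files placed in group k.
    Returns one range of physical guest-file indices per group.
    """
    if sum(vs_counts) > MAX_GUEST_FILES:
        raise ValueError("groups need more guest files than a single IMSIC can have")
    groups = []
    next_index = 1
    for count in vs_counts:
        groups.append(range(next_index, next_index + count))
        next_index += count
    return groups


def select_group(groups, sdicn):
    """Model of msdcfg.SDICN selecting one group (zero-based here by assumption)."""
    return groups[sdicn]
```

For example, `layout_groups([5, 1, 1, 0])` gives the first group G[1] to G[5], the next two groups one file each, and the last group none, while `layout_groups([32, 32])` is rejected because 64 files exceed the single-IMSIC limit.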

From Siqi Zhao, message #92:

We could work around this by not grouping S-mode interrupt files and VS-mode interrupt files in the hardware. We could say msdcfg.SDICN only selects the active S-mode interrupt file, and let the TSM select the active VS-file exactly as how a normal hypervisor does, i.e. using the hstatus.VGEIN bit. This is OK since the TSM is in the TCB.

In order to isolate the trusted set of VS-mode files from the untrusted set of VS-mode files, an M-mode bit mask could be introduced, 'msdvsifcfg' maybe. Each bit in it corresponds to a VS-mode file, a set bit means that VS-mode file should only be used by SDs. An attempt to access when not executing SDs raises an illegal instruction exception.

From Siqi Zhao, message #94:

My idea is that there is only one set of VS-mode interrupt files. Each of the files could be designated as active or inactive. By setting msdvsifcfg, RDSM exposes VS files suitable for the current SD. Then, the TSM sets hstatus.VGEIN to select among the VS-mode interrupt files exposed for the current SD. If for any reason, hstatus.VGEIN is set to the 'hidden' VS file, there should be an exception.

The responsibility is divided up. RDSM exposes the VS files in chunks for each SD, then TSM in a SD selects among the VS files in its chunk.
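The division of responsibility described above can be modeled with a simple bit mask. This is a sketch under assumptions: 'msdvsifcfg' is only a name floated in the discussion, the convention that bit n set means "VS file n is exposed to the current SD" is one possible reading, and the exception class and helper names are hypothetical.

```python
class HiddenVsFileError(Exception):
    """Stands in for the exception raised when a hidden VS file is selected."""


def mask_for(files):
    """RDSM side: build a mask exposing the given VS file numbers (bit 0 unused)."""
    mask = 0
    for n in files:
        mask |= 1 << n
    return mask


def check_vgein(msdvsifcfg_mask, vgein):
    """TSM side: validate a VGEIN selection against the mask set by the RDSM."""
    if vgein == 0:
        return 0  # VGEIN = 0 selects no guest interrupt file
    if not (msdvsifcfg_mask >> vgein) & 1:
        raise HiddenVsFileError(f"VS file {vgein} is hidden from the current SD")
    return vgein
```

With `mask_for(range(7, 13))` the RDSM exposes files 7 through 12, so `check_vgein(mask, 7)` succeeds while `check_vgein(mask, 6)` raises. Note that an arbitrary mask lets the exposed files be non-contiguous, which is the concern raised in the next message.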

From Alvin Chang, message #95:

IMHO, set to the 'hidden' VS file should be avoided. According to Priv. Spec. section "Hypervisor Guest External Interrupt Registers (hgeip and hgeie)": The least-significant bits are implemented first, apart from bit 0. Hence, if GEILEN is nonzero, bits GEILEN:1 shall be writable in hgeie, and all other bit positions shall be read-only zeros in both hgeip and hgeie.

It seems the indexes of available VS files should be contiguous. Since each IMSIC only has up to 63 physical VS-files, we can format msdvsifcfg as:

msdvsifcfg[11:6] - "number" of selected VS files (also GEILEN for this supervisor domain)
msdvsifcfg[5:0] - "start index" of selected VS files

For example:

- msdvsifcfg[11:6] = 6
- msdvsifcfg[5:0] = 7

It means that 6 of the IMSIC's physical VS files are selected for this supervisor domain, and the files are indexed as 7 ~ 12. Based on this selection, when hstatus.VGEIN = 1, it actually selects IMSIC's physical VS file 7, and hstatus.VGEIN = 2 actually selects physical VS file 8, etc.

So we still stick to the Priv. Spec. about GEILEN.
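The field layout and the VGEIN translation from the example above can be sketched as follows. The register name and bit positions are taken from the quoted proposal; the helper names and the choice to raise an error for an out-of-range VGEIN are assumptions of this sketch, since the proposal does not say how hardware would react.

```python
FIELD_MASK = 0x3F   # both fields are 6 bits wide
COUNT_SHIFT = 6     # msdvsifcfg[11:6] holds the number of selected VS files


def encode(count, start):
    """Pack the proposed fields: [11:6] = count, [5:0] = start index."""
    return ((count & FIELD_MASK) << COUNT_SHIFT) | (start & FIELD_MASK)


def decode(value):
    """Return (count, start) from a proposed msdvsifcfg value."""
    return (value >> COUNT_SHIFT) & FIELD_MASK, value & FIELD_MASK


def translate_vgein(value, vgein):
    """Map an SD-local hstatus.VGEIN to a physical VS file number."""
    count, start = decode(value)
    if vgein == 0:
        return None  # VGEIN = 0 selects no guest interrupt file
    if vgein > count:
        raise ValueError("VGEIN exceeds this SD's GEILEN")  # assumed behavior
    return start + vgein - 1
```

With `cfg = encode(6, 7)`, `translate_vgein(cfg, 1)` gives physical file 7 and `translate_vgein(cfg, 6)` gives physical file 12, matching the 7 ~ 12 range in the example. Because each SD sees VGEIN values 1 through its own count, the contiguity rule for hgeie bits is preserved from the SD's point of view.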
