Closed xypron closed 7 months ago
Type 4 note in the spec: "NOTE One structure is provided for each processor instance in a system. For example, a system that supports up to two processors includes two Processor Information structures — even if only one processor is currently installed. Software that interprets the SMBIOS information can count the Processor Information structures to determine the maximum possible configuration of the system"
Type 44 note in the spec: "The information in this structure defines the processor additional information in case SMBIOS type 4 is not sufficient to describe processor characteristics. The SMBIOS type 44 structure has a reference handle field to link back to the related SMBIOS type 4 structure. There may be multiple SMBIOS type 44 structures linked to the same SMBIOS type 4 structure. For example, when cores are not identical in a processor, SMBIOS type 44 structures describe different core-specific information."
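For illustration only, here is a minimal Python sketch of the linkage the two notes describe: Type 44 structures point back at a Type 4 structure through the reference handle, and several Type 44 structures may share one Type 4. The class names, the `label` field, and the handle values are hypothetical and not taken from the SMBIOS spec.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Type4:
    handle: int  # handle of this Processor Information structure
    label: str   # illustrative description, not an SMBIOS field


@dataclass(frozen=True)
class Type44:
    handle: int             # handle of this Processor Additional Information structure
    referenced_handle: int  # handle of the related Type 4 structure
    label: str              # illustrative description, not an SMBIOS field


def group_by_processor(type44s):
    """Group Type 44 labels by the Type 4 handle they reference."""
    grouped = defaultdict(list)
    for t44 in type44s:
        grouped[t44.referenced_handle].append(t44.label)
    return dict(grouped)


def max_processor_count(type4s):
    """Per the Type 4 note, the number of Type 4 structures is the maximum configuration."""
    return len(type4s)
```

With one `Type4(0x10, "socket 0")` and two Type 44 entries that both reference handle `0x10`, `group_by_processor` returns a single key holding both labels, which is the "multiple Type 44 per Type 4" case from the spec note.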
Sadly, the spec doesn't define 'processor', but my read is that a processor is equivalent to a socket, more or less. I think it's correct that Type 4 is required for Type 44 to be present.
My reading is 1 type 4 per hart and 1 type 44 per hart, where a hart is a "logical processor", as in a unit of execution for the OS.
We can clarify this in the BRS spec (and then further down in the SMBIOS spec). Note that the SMBIOS spec already mandates type 4, which is why our spec doesn't list it.
@andreiw
On my x86 laptop I find exactly 1 type 4 table with:
Type 4 would not have these fields if one type 4 table per hart were to be expected.
The specification has these words for type 4: " a separate structure instance is provided for each system processor socket/slot".
Got it, thanks, lazy reading on my part. 1 Type 4 (which always refers to the first hart in that "socket") and multiple Type 44.
I still don't see why we need to call out Type 4 in the BRS. SMBIOS spec already marks Type 4 as required - see 6.2 Required structures and data on page 28.
We seem to agree that type 4 does not refer to a single hart. So "refers to the first hart in that socket" cannot be correct.
Think of the FU740 SoC of the Unmatched board where the first hart according to the device-tree is the rv64imac S7 hart which is not used by Linux while the other rv64imacdf harts are used by Linux. Other SoCs may mix a 32bit hart 0 with 64bit harts.
The SMBIOS type 4 structure should refer to harts used by the OS. But this does not necessarily include the first hart.
Here's what the DMTF spec says:
7.5.3.4 RISC-V-class CPUs: For RISC-V class CPUs, the processor ID contains a QWORD Machine Vendor ID CSR (mvendorid) of RISC-V processor hart 0. More information of RISC-V class CPU feature is described in RISC-V processor additional information (SMBIOS structure Type 44).
It may be worthwhile to refine this to refer to the first hart usable by the OS environment (and I'm sure we could split hairs defining "usable" as well without encroaching on platform spec space). We have this problem with ACPI too, because we're presumably describing any hart in the system - we're not limiting this to capability or even being in the same coherency domain (e.g. as far as seeing the same DVM-like updates from other harts).
Hart ID 0 may never become available to the OS. Either because no hart ID 0 exists or because it is not dedicated to the OS in question. - If multiple OSes run in parallel, only one of them may have access to a hart with ID 0. - Would we still want to describe all harts in this case or only the ones that are available to the OS?
Instead of "usable" you could refer to the boot hart.
Agreed on hart 0 - it may not exist (in principle, or because of errata, or because it is not an application CPU). So maybe relaxing this to a "boot hart" is a start.
"If multiple OSes run" - if there's some kind of partitioning going on (in hardware or via VM), I'd say there's isolation that would have to exist, so the SMBIOS/ACPI/DT would only describe the subset visible.
As far as describing "special" processors (different set of characteristics than the boot CPU)... I imagine if it is a CPU that could /possibly/ be used by general purpose software then it has to be described. If it is used by the platform software (some kind of firmware, system control processor OS etc) then it should not be described to the OS.
Thoughts?
It seems that HR_010 already addresses this.
Agreed on hart 0 - it may not exist (in principle, or because of errata, or because it is not an application CPU). So maybe relaxing this to a "boot hart" is a start.
Section 3.1.5 of the Priv ISA specifies: "Hart IDs might not necessarily be numbered contiguously in a multiprocessor system, but at least one hart must have a hart ID of zero. Hart IDs must be unique within the execution environment."
So physically, the hart ID of 0 has to exist. Perhaps the "it may not exist" is a statement about "logical CPU number" or some such abstraction that exists between firmware and the OS?
Agreed on hart 0 - it may not exist (in principle, or because of errata, or because it is not an application CPU). So maybe relaxing this to a "boot hart" is a start.
Section 3.1.5 of the Priv ISA specifies: "Hart IDs might not necessarily be numbered contiguously in a multiprocessor system, but at least one hart must have a hart ID of zero. Hart IDs must be unique within the execution environment."
That comment and the mhartid register it refers to are only relevant to machine mode, which the BRS is not concerned with.
Then under what condition is below true: "Agreed on hart 0 - it may not exist (in principle, or because of errata, or because it is not an application CPU)." "Hart ID 0 may never become available to the OS. Either because no hart ID 0 exists or because it is not dedicated to the OS in question. - If multiple OSes run in parallel, only one of them may have access to a hart with ID 0."
If we are referring to a logical number such as cpu_num() here and not the "Hart ID", i.e. the hardware thread ID, then is there a reason to start an OS without a CPU numbered 0?
I'm not sure what "running multiple OSes in parallel" still means. If you're running under a hypervisor or similar, then you can renumber the hart IDs (from the S mode view) to always begin at hart ID 0.
I would like to come back to the FU740 SoC (Unmatched) example, where hart ID 0 (rv64imac) is not an "application processor". It wouldn't be the boot hart in UEFI, but perhaps it could be used in Linux via a driver for $WHATEVER. In this situation, is it fair to say the hart should still be described in ACPI/SMBIOS? From ACPI perspective, the relationship between interrupt controller and hart is still important to note. From SMBIOS perspective I'll agree with @xypron that the overall value is a bit dubious, but let's say the manufacturer ID is really important for some errata handling...
So physically, the hart ID of 0 has to exist. Perhaps the "it may not exist" is a statement about "logical CPU number" or some such abstraction that exists between firmware and the OS?
But what if hart 0 is an auxiliary rv32 core in a system of 128 RV64GC harts? What if that hart 0 will never be used by a booted OS because it runs some platform-specific goo (e.g. owns the firmware NV flash and runs the secure vars code...). Surely we wouldn't describe hart 0 in such a system?
I'm not sure what "running multiple OSes in parallel" still means. If you're running under a hypervisor or similar, then you can renumber the hart IDs (from the S mode view) to always begin at hart ID 0.
You don't need the hypervisor extension to run multiple operating systems. The PolarFire Icicle Kit HSS firmware just uses different entry points for the harts dedicated to U-Boot/Linux and to the RTOS as defined in the header of the loaded binary.
But what if hart 0 is an auxiliary rv32 core in a system of 128 RV64GC harts? What if that hart 0 will never be used by a booted OS because it runs some platform-specific goo (e.g. owns the firmware NV flash and runs the secure vars code...). Surely we wouldn't describe hart 0 in such a system?
Yes, we won't describe hart 0 in such a system, and we would need to tell the OS there is one less hart in the system as well. hartid is a physical machine concept. The mhartid register is readable only by machine mode. I think machine mode, when it starts operating systems in such systems, may want to provide a logical processor number. So the hartid that the OS sees is whatever machine mode tells it the hartid is.
So we can go two ways here. We can either mandate that "all hart IDs start at zero", with a note that "OS-visible hart IDs are the /logical processor numbers/ and don't have to match the actual mhartid". Or we can simply fix any references to hart 0, rewording these as "the ID of the first hart meeting the hart requirements".
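To make the two options concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the `Hart` class, the `os_usable` flag standing in for "meets the hart requirements", and the choice of "first" as the lowest mhartid.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Hart:
    mhartid: int     # physical hart ID as seen by machine mode
    isa: str         # illustrative ISA string
    os_usable: bool  # hypothetical stand-in for "meets the hart requirements"


def logical_numbering(harts):
    """Option 1: logical processor numbers start at zero and map to mhartid values."""
    usable = sorted(h.mhartid for h in harts if h.os_usable)
    return dict(enumerate(usable))


def first_qualifying_hart(harts):
    """Option 2: mhartid of the first (assumed: lowest-numbered) qualifying hart.

    Raises ValueError if no hart qualifies.
    """
    return min(h.mhartid for h in harts if h.os_usable)


# Layout loosely modelled on the FU740 example from this thread:
# hart 0 is the rv64imac hart that is not used by the OS.
harts = [Hart(0, "rv64imac", False)] + [
    Hart(i, "rv64imacdf", True) for i in range(1, 5)
]
```

Under option 1 the OS would see logical processors 0 through 3 backed by mhartid 1 through 4. Under option 2 the references to "hart 0" would resolve to mhartid 1 instead.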
Currently in smbios.adoc type 44 is required while type 4 is not even mentioned.
Type 44 is meant to extend the information in type 4. Field Referenced Handle must refer to the corresponding type 4 structure. So if type 44 is required type 4 must be required too.
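A quick way to check that dependency on a given table dump might look like the sketch below. It assumes the structures have already been split into one `bytes` object per formatted area (string-set handling is not shown), and it relies only on the common 4-byte header (type, length, handle) and on the Type 44 Referenced Handle WORD at offset 04h. The function names are made up for this example.

```python
import struct

TYPE_PROCESSOR = 4               # Processor Information
TYPE_PROCESSOR_ADDITIONAL = 44   # Processor Additional Information


def header(raw):
    """Return (type, length, handle) from the common SMBIOS structure header."""
    return struct.unpack_from("<BBH", raw, 0)


def dangling_type44(structures):
    """Return handles of Type 44 structures whose Referenced Handle matches no Type 4."""
    type4_handles = set()
    for raw in structures:
        stype, _length, handle = header(raw)
        if stype == TYPE_PROCESSOR:
            type4_handles.add(handle)

    dangling = []
    for raw in structures:
        stype, _length, handle = header(raw)
        if stype == TYPE_PROCESSOR_ADDITIONAL:
            (referenced,) = struct.unpack_from("<H", raw, 4)
            if referenced not in type4_handles:
                dangling.append(handle)
    return dangling
```

An empty result means every Type 44 structure links back to an existing Type 4 structure. It does not say anything about whether one Type 44 per hart or per processor is intended.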
smbios.adoc describes a field hart ID. This requires one Processor-Specific Data structure per hart. It remains unclear if one type 44 structure should be created per hart or if there shall be one type 44 structure per processor and an array of Processor-Specific Data structures relating to this processor.
Do we need one type 4 structure per hart, or one per processor? Do we need one type 44 structure per hart, or one per processor?