riscv / riscv-isa-manual

RISC-V Instruction Set Manual
https://riscv.org/
Creative Commons Attribution 4.0 International
3.72k stars 647 forks source link

Hypervisor extension name: inconsistency? #781

Open a4lg opened 3 years ago

a4lg commented 3 years ago

EDIT (2021-11-25): I replaced the description entirely (originally here).

Discovery

I found this issue when testing my very first Linux patches, that...

References:

In those patches, I wrote "correct" "riscv,isa" parser. "riscv,isa" is a device tree entry which describes ISA and extensions implemented. But I noticed that some simulators can pass invalid ISA string as "riscv,isa" entry.

Background: Single-letter extension "H" is invalid as of now

It's important to understand. If we make single-letter extension "H" valid, we need to make changes to the ISA Manual.

Note that, allowing multi-letter and single-letter "H*" extensions is nothing like allowing single-letter "H" only. Because multi-letter extension must be delimited by '_', single-letter "H" extension will also need it if we accept multi-letter "H*" extensions.

Issue 1: misa Register is not Set of Extensions (itself)

Exists in:

QEMU caused my patchset to detect undefined "Su" extension. Because default QEMU (virt platform) generates "riscv,isa" string "rv64imafdcsu", last two characters are handled as a multi-letter extension name (as per ISA Naming Conventions). This is caused by invalid ISA string generated by QEMU.

It looks harmless. "Correct" parser detected invalid extension but that name is not defined. And it is harmless... unless we start to use some other extensions.

If this kind of issue exists in another software, it's not good. So, I looked into Spike, a RISC-V ISA Simulator... where I found a serious problem.

References:

Issue 2: Single-letter "H" is handled as a REAL extension

Exists in:

Spike, a RISC-V ISA Simulator accepts "ISA string" as a parameter (--isa=[ISA]) and passes it (mostly unmodified) to the device tree ("riscv,isa" entry). Surprisingly, Spike accepts "H" as a valid single-letter extension.

QEMU also has a similar problem (-cpu argument value such as rv64,x-h=on affects misa register and then sets "H" as a single-letter extension as described in Issue 1).

Then, I reached to the core, RISC-V KVM, what I accidentally broke it.

RISC-V KVM checks existence of extension H. Where those set of supported extensions come from? Yes, the device tree ("riscv,isa"). That's exactly why I insist that there's something we need to address. I think Mr. Patel (@avpatel) will be needed to be aware of that.

If we parse the ISA string correctly (as per current ISA Manual), RISC-V KVM stops working because "H" is not a valid single-letter extension name for now. That's what I had to propose in the patchset (for warning).

If we consider that current ISA Manual is correct and querying "H" extension from the ISA string, we are doing something wrong. By the way, misa.H is completely irrelevant.

References:

Options to Take (only examples)

  1. Add multi-letter extension name like "Zh" or "Sh" indicating whether the hypervisor extension is available and query new extension name from RISC-V KVM I think this is the cleanest solution if we make extension namespace unchanged. However, this option will require changes to at least three software: QEMU, Spike (to generate new extension name correctly) and RISC-V KVM. Also, it requires small changes to the ISA Manual.
  2. Accept "H" as a single-letter extension (but not multi-letter "H*") and pass "H"-extension as ISA string (through the device tree, "riscv,isa")
    This option would make software changes minimal (even if not none). However, ISA Manual must be changed (changes would be relatively large).
  3. Pass existence of the hypervisor extension by something else (different entry than "riscv,isa")
    This would keep the extension namespace (and the ISA Manual) unchanged. But just like option 1, at least three software need changes.
avpatel commented 3 years ago

Well, "H" is not a separate privilege mode in itself like "M", "S" and "V" modes because it adds additional capability to S-mode so that we can virtualize "S" and "U" mode. Based on this interpretation, Hypervisor extension is much like any other extensions.

There is also misa.H bit which can be writable by software and can be used to disable H-extension.

Regards, Anup

gfavor commented 3 years ago

The Naming Conventions chapter describes both single-letter (Section 27.3) and multi-letter (section 27.6) extensions names. The 'H' extension is one of the former (and has a corresponding misa letter).

omasanori commented 3 years ago

From 27.7:

Standard supervisor-level instruction-set extensions are defined in Volume II, but are named using “S” as a prefix, followed by an alphabetical name and an optional version number. Supervisor-level extensions must be separated from other multi-letter extensions by an underscore.

From 27.8:

Standard hypervisor-level instruction-set extensions are named like supervisor-level extensions, but beginning with the letter “H” instead of the letter “S”.

Thus standard H-level extensions shall be named as multi-letter ones, which is what 2. above OP says IMHO.

omasanori commented 3 years ago

If the single-letter H extension would be defined, should the prefix of standard H-level extensions be Zxh or something, just like that of standard M-level extensions is not M but Zxm?

gfavor commented 3 years ago

The extension naming conventions are evolving (in minor ways) and are due to be updated as RVI gets a couple of years of experience under its belt with a whole slew of new extensions coming along. For Priv-related extensions the naming convention being established for add-ons to the Priv spec is "Sxzzz" where 'S' represents all Priv-related extensions, 'x' represents a category (e.g. 'v' for virt-mem extensions, 'm' for machine-level extensions, 'h' for hypervisor extensions, etc.), and 'zzz' is a multi-letter name for the individual extension. But the H extension itself corresponds to misa.H and has a single-letter 'H' name (like the other single-letter extensions that have a corresponding misa letter). Just like add-on unpriv integer extensions (on top of the various single-letter unpriv integer extensions) are named Zi, add-on H extensions will be named Sh.

Greg

On Tue, Nov 23, 2021 at 12:50 AM Masanori Ogino @.***> wrote:

If the single-letter H extension would be defined, should the prefix standard H-level extension be Zxh or something, like that of standard M-level extensions is not M but Zxm?

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/riscv/riscv-isa-manual/issues/781#issuecomment-976292460, or unsubscribe https://github.com/notifications/unsubscribe-auth/ALLX6GXG3NJIMXTHI5KT3BLUNNIW7ANCNFSM5INVN6LQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

omasanori commented 3 years ago

I see; I didn't know the Sxzzz convention, and the manual should be updated if it is so, but it looks fine. There are no standard H-level extensions at this moment, and the impact would be minimal. Thank you for answering, @gfavor!

atishp04 commented 3 years ago

@gfavor Slightly different topic: Does the software need to parse the versioning scheme for ISA extensions i.e. svpbmt, Zicbom/z? As per the chapter 27.4 in unpriv ISA, ISA extensions can have versions. However, there were recent discussions about all ratified ISA extensions to have version 1.0. Did I misunderstand something ?

pdonahue-ventana commented 3 years ago

See https://github.com/riscv/riscv-isa-manual/pull/688 for what I believe are the actual extension naming conventions.

a4lg commented 3 years ago

Sorry, I think how I started the story was very wrong.

I replaced the description completely and I hope everyone now understands that there's something needs to be addressed.

atishp04 commented 3 years ago

@pdonahue-ventana : Any thoughts on my question about the versioning scheme. I understand that standard extensions (single letter ones and ones starting with Z) will have version numbers scheme defined in chapter 27.4

Is that versioning scheme applicable for S/H/M level extensions as well ?

a4lg commented 3 years ago

@atishp04 At least, applicable on S/H. See chapters 27.7 and 27.8 (I assume that it's applicable to M-mode too, but it's not clear enough).

From 27.7 “Supervisor-level Instruction-Set Extensions” (emphasized by me):

named using “S” as a prefix, followed by an alphabetical name and an optional version number.

From 27.8 “Hypervisor-level Instruction-Set Extensions” (emphasized by me):

named like supervisor-level extensions, but beginning with the letter “H” instead of the letter “S”.

From 27.9 “Machine-level Instruction-Set Extensions”:

Standard machine-level instruction-set extensions are prefixed with the three letters “Zxm”.

There's no mention like “named like supervisor-level extensions” on 27.9.

atishp04 commented 3 years ago

I saw that. As per the ratification policy, however, all the ISA extensions (Zxx, Ssxxx, Svxxx) are supposed to have only 1.0. There won't be any 2.0 version for any of the extension. Instead, it will be called another extension.

That's why, I am bit confused here.

a4lg commented 3 years ago

Possibly, version number is still applicable but the value is constrained (only 1 and 1p0 [if there's no extra 0 digits] are allowed) ?

atishp04 commented 3 years ago

May be. But we need to confirm this as kernel will no longer need to parse the version number for all these extensions. I will start a thread in priv-spec mailing list.

gfavor commented 3 years ago

The naming and numbering of extensions has evolved since 2-3 years ago when sections 27.7-27.9 were written. I can't speak for exactly how these sections will change when this chapter gets updated (which is due to be done in the coming months by Krste), but naming of new extensions (including all the ones just ratified) have been done to be consistent with where naming is evolving to (for Priv-relatd extensions, they are all things like Sm, Ss, Sv*).

And Atish is correct that all ISA extensions will only have a 1.0 version once ratified. There won't be 2.0 versions. Leaving aside the special case of Priv 1.11/1.12 (which is now Sm1p12 and Ss1p12 I believe), no other official extension names include '1p0'.

So, in short, don't get too hung up on the current, soon to be updated, 2-3 year old text.

atishp04 commented 3 years ago

Thanks @gfavor for the clarification. That means supervisor mode software don't need to bother with version number parsing for ISA extensions.

a4lg commented 3 years ago

Thanks, @gfavor! I hope some resolution about "H"-extension is made soon (allowing single-letter "H" extension (and prohibiting multi-letter "H*") will minimize software changes).

gfavor commented 3 years ago

The H extension has been and continues to have a single-letter extension name - which also corresponds to misa.H. Tentatively the plan for future hypervisor-related extensions is for them to be named Sh - following the Sm, Ss, Sv patterns for other Priv-related extensions.

Note that Unpriv-related extension are generally following the Z* pattern, where for example integer unpriv extensions are named Zi, floating point related extensions are named Zf and Zd, and vector-related extensions are named Zv*.

a4lg commented 3 years ago

@gfavor That response is gold for me. Now I can reflect it to my Linux kernel patch.

gfavor commented 3 years ago

And to be complete, there are the early, rather fundamental, Unpriv extensions that have single-letter names (e.g. I, M, A, C, F, D). And there are the RV32E, RV32I, and RV64I "base" ISA extensions that everything else builds on top of.

atishp04 commented 2 years ago

@gfavor @aswaterman Is there any official update in the spec on extension naming scheme around hypervisor and its extensions ? There will be a gcc release in the near future. It would be good have ISA string rules updated so that toolchain/linux/qemu projects can follow/point to it.

pdonahue-ventana commented 2 years ago

I think that https://github.com/riscv/riscv-isa-manual/pull/688/files is supposed to address that, but that has been neither approved nor rejected.

gfavor commented 2 years ago

As mentioned in https://github.com/riscv/riscv-isa-manual/issues/781#issuecomment-976805196:

For Priv-related extensions the naming convention being established for add-ons to the Priv spec is "Sxzzz" where 'S' represents all Priv-related extensions, 'x' represents a category (e.g. 'v' for virt-mem extensions, 'm' for machine-level extensions, 'h' for hypervisor extensions, etc.), and 'zzz' is a multi-letter name for the individual extension. But the H extension itself corresponds to misa.H and has a single-letter 'H' name (like the other single-letter extensions that have a corresponding misa letter).

The extension naming chapter in the ISA manuals remains to be updated accordingly.

atishp04 commented 2 years ago

Thanks Greg. I was just checking if there is any PR that is pending with such updates. Once it is available in the spec, it can be officially referred in the implementations.