riscv-software-src / riscof

BSD 3-Clause "New" or "Revised" License
61 stars 39 forks source link

Input ISA string does not match accepted canonical ordering #67

Open marcfedorow opened 1 year ago

marcfedorow commented 1 year ago

I get this message in several situations where IMO my ISA string should be accepted. Some examples: RV32IMCZicsr_Zifencei_Zba_Zbb_Zbc_Zbs_Zkn_Zks_Zkt_Xfoo (Xfoo not recognized) RV32IMCNZicsr_Zifencei (says U must be with N, RV32IMCNUZicsr_Zifencei is accepted but there is no requirement to place U in the ISA string AFAIK) RV32IMC_Zicsr_Zifencei rv32imczicsr_zifencei

So the ISA-string parsing doesn't work more than works. It is extremily bad that I must have two different strings / some kind of converter between GNU-GCC-accepted and riscof formats (e.g. GNU GCC rejects any CAPITAL letter).

pawks commented 1 year ago

I get this message in several situations where IMO my ISA string should be accepted. Some examples: RV32IMCZicsr_Zifencei_Zba_Zbb_Zbc_Zbs_Zkn_Zks_Zkt_Xfoo (Xfoo not recognized)

Xfoo is not a ratified extension and hence it will not be recognized. Only ratified extensions are supported as of now. X type extensions should probably always be allowed. This raises additional questions as to what isa string should be used for compiling the tests. An implementation might use any of their custom X instructions in the RVMODEL macros. Thoughts? @neelgala @allenbaum

RV32IMCNZicsr_Zifencei (says U must be with N, RV32IMCNUZicsr_Zifencei is accepted but there is no requirement to place U in the ISA string AFAIK)

The user mode is entirely optional and hence any implementation which supports the user mode should include U in the ISA string. The ISA string in question here describes the configuration of the core and not just the march option for the toolchains. U is a required extension for N.

So the ISA-string parsing doesn't work more than works. It is extremily bad that I must have two different strings / some kind of converter between GNU-GCC-accepted and riscof formats (e.g. GNU GCC rejects any CAPITAL letter).

The ISA string is formatted according to the naming conventions in the ISA manual. This is intentional because riscof isn't made to support only gcc and any toolchain can be used with it. The plugin is responsible for converting the canonical ISA string into the respective march for the toolchain.

neelgala commented 1 year ago

Riscof now uses "riscv-config" internally for the isa string validation so these issues might be best tracked there. Basically all ISA strings to be used by RISCOF must be riscv-config compatible.

riscv-config doesn't support defining X extensions as of today (and I am guessing out-of-the-box GCC doesn't either - do correct me if I am mistaken). But I aprreciate the need for it.. and we will need to go back to regex in riscv-config to see how it can be done.

The riscv-config ISA string cannot be directly used as gcc march in any case (even is riscv-config supported lower-case). Consider the S extension. The riscv-config ISA string will require specifying S for riscof to choose and run supervisor extension tests.. while the gcc march option will not accept s as a valid option. So you will have to implement a riscv-config ISA to march string converter in your plugin irrespective of upper/lower case support.

PS: If capitalization is the only issue for you currently, you can use <my_str_var>.lower() in python to convert a string to all lower case.

marcfedorow commented 1 year ago

(and I am guessing out-of-the-box GCC doesn't either - do correct me if I am mistaken).

GCC just skips [szx]-sub-extensions if it does not know what is it. I thins riscof should do the same

marcfedorow commented 1 year ago

The user mode is entirely optional and hence any implementation which supports the user mode should include U in the ISA string

GCC says: riscv64-unknown-elf-gcc: error: '-march=rv64iu': unsupported ISA subset 'u' If this format is not actually an ISA string but a riscv-config-string, I would love to have some documentation accessible on error because the simple reading of RISCV ISA spec gives no actual clues about what is not accepted in the actually canonical ordered string

pawks commented 1 year ago

GCC says: riscv64-unknown-elf-gcc: error: '-march=rv64iu': unsupported ISA subset 'u' If this format is not actually an ISA string but a riscv-config-string, I would love to have some documentation accessible on error because the simple reading of RISCV ISA spec gives no actual clues about what is not accepted in the actually canonical ordered string

Chapter 28 of the RISC-V ISA spec describes a majority of the rules for canonical ISA strings. Some of the others available in the chapters for the respective extensions(like F is required for D to be present and so on).

marcfedorow commented 1 year ago

Chapter 28 of the RISC-V ISA spec describes a majority of the rules for canonical ISA strings.

This is exactly what I am claiming false. For example, F is not required if D is presented in the RISC-V ISA spec compatible ISA-string. On the contrary, D implies F. E.g. spec treats G as "IMADZifencei" (no F in this string!). riscof treats "implies" as "requires" though (which makes this check incompatible with spec 2.0).

Without a proper stand-alone documentation of what strings are actually acceptable (i.e. what is the string format) composing a valid riscof ISA string is a game of guess. It is virtually impossible to find out what is wrong with the valid compiler-acceptable ISA-string with this vague exception description.

neelgala commented 1 year ago

riscv-config restricts the ISA string to be in its fully-elaborated form, no "implications" will be infered (I agree this should be documented in the riscv-config docs and we will raise a PR for the same) - this choice doesn't make it incompatible with the spec. This was primarily chosen by riscv-config because of its major initial use-cases : compliance and test-generation

We don't expect riscv-config to cater to any specific compiler's acceptable ISA format, because those could vary across compilers (and across versions of the same compiler) while all of them still being compatible to the ISA. If someone is using the riscv-config ISA string to derive the "march" string for gcc they will need to do this separately depending on the version of gcc they use (older versions do not support zicsr as a valid string). What I would like to claim is that the riscv-config ISA string will be a superset of the march options at any point.

Regarding documenting the rules that riscv-config uses - we can try to document the rules you see here in to a more english-freidnly list. These rules are derived from across the riscv spec.

Frankly, I would like Chapter-28 to be significantly more restrictive in how the ISA string should be created and give away with flexibility here - this would definitely make compatibility across tools much simpler. But this is a discussion for the riscv-isa-manual repo and not here.

marcfedorow commented 1 year ago

this choice doesn't make it incompatible with the spec

That's not true for this edition of spec. Requiring Zicsr for F makes this strictly non-backward compatible.

because those could vary

They still comply with RISC-V ISA spec. Also, we should agree that most of strings that are accepted by GCC and/or riscv-isa-sim (spike) must be accepted by any other RISC-V ISA string parser that is assumed to be publicly recognizable.

the rules you see here

Such a set of rules often leads to copy-paste errors. E.g. Zk-ext error description is flawed exactly because of this. Instead of prohibiting lots of legal strings it is better to follow the RISC-V ISA spec to enable what is implied so check only prohibited combinations (e.g. rv32imafzfinx). P-ext will make this logic even more complicated. Supporting this for the V-ext is just an overkill.

I would like Chapter-28 to be significantly more restrictive

This is exactly not what people do want. See this. Restricting an already over-complicated logic will just make this impossible both to use and to support.

marcfedorow commented 1 year ago

Finally it could be nice if the legal ISA string would not cause an error "fix this" but emit another legal ISA string which is accepted by riscof.

neelgala commented 1 year ago

That's not true for this edition of spec. Requiring Zicsr for F makes this strictly non-backward compatible.

This is probably a very specific issue you have pointed out - where the checks should be enabled based on the priv and unpriv versions. I would suggest filing this as a separate issue on riscv-config. If you could indicate which version the dependency was introduced.. that would be great too.

Also, we should agree that most of strings that are accepted by GCC and/or riscv-isa-sim (spike) must be accepted by any other RISC-V ISA string parser that is assumed to be publicly recognizable.

I am going to have to disagree here - purely because the use-cases are completely different. And moreover, that list doesn't stop there - you have sail, llvm, etc who can join the club as well.

Such a set of rules often leads to copy-paste errors. E.g. Zk-ext error description is flawed exactly because of this. Instead of prohibiting lots of legal strings it is better to follow the RISC-V ISA spec to enable what is implied so check only prohibited combinations (e.g. rv32imafzfinx). P-ext will make this logic even more complicated. Supporting this for the V-ext is just an overkill.

While I like your idea - it has major implications on how the riscv-arch-tests have been structured and thus i will need more time before I comment further on the feasibility of this. However, I would encourage you to submit a PR on riscv-config on how this could be done - it might help us reach a decision faster.

This is exactly not what people do want. See this. Restricting an already over-complicated logic will just make this impossible both to use and to support.

While this thread itself presents varied opinions, I don't think it applies to riscv-config because I don't expect people to change YAMLs as often as they compile code. The restrictions imposed by riscv-config is not something you face on a daily basis. It should be done once for a riscv instance.

neelgala commented 1 year ago

Finally it could be nice if the legal ISA string would not cause an error "fix this" but emit another legal ISA string which is accepted by riscof.

That sounds a good suggestion - would you mind filing it as a separate issue on riscv-config ?