riscv / sail-riscv

Sail RISC-V model
https://lists.riscv.org/g/tech-golden-model
Other
439 stars 163 forks source link

How to handle meta-programmed Sail code in the golden model. #133

Open martinberger opened 2 years ago

martinberger commented 2 years ago

Some Sail code isn't hand-written, but generated by other programs. We's started an interesting debate about how to deal with this in the comments on the P Extension PR https://github.com/riscv/sail-riscv/pull/132

I think it should be a separate thread. Please continue the discussion here.

martinberger commented 2 years ago

I think we should minimise the programming languages used in the golden model. All code should be considered as if it was written by an especially diligent human. It's fine to auto-generate Sail. But I don't think we should have the generating programs in the model. Submitted PRs are free to include the generators for inspection.

Opinions?

jrtc27 commented 2 years ago

To the best of my knowledge, all of the current code was hand-written (or possibly with tiny throwaway scripts or editor macros; the line is blurry), with the exception of prelude_mapping.sail which is working around a language limitation and has the Python script used to generate it the monomorphised versions in a comment.

scottj97 commented 2 years ago

For years I have been disappointed with the violations of the DRY principle throughout this code (e.g. CSR addresses). It makes maintenance unnecessarily difficult and error-prone.

If the Sail language does not and will not enable improvement in this area, then I support generation of the Sail code from a DRYer specification. The riscv-config changes (#128) are a step in this direction.

But writing the initial Sail code is not the hard part. Maintaining it over (we hope) decades of industrial use is the far greater concern. If you commit the generated code to the Git repo and discard the generator, you gain the benefits only for the (easier) initial writing of the code, and lose all the benefits for the (more difficult and more expensive) maintenance of that code.

(From #132:)

I think the two main reasons why we use small scripts to generate Sail is not so much to save keystrokes, b/c we don't have that much code anyway. I think it's:

to avoid errors. to feed other targets of our ISA ideas, like compilers, test suites, documentation etc that we build in parallel while developing a RISCV extension.

I agree -- but then what happens when those specs change? If the code generation is not part of this repo, you lose all those benefits after the initial coding. The code in this repo, which will be hand-edited from thereon, is no longer DRY, and therefore is unnecessarily error-prone. The primary purpose of DRY is to reduce the risk and cost of ongoing maintenance.

Even if the generator is maintained elsewhere, it tends to diverge. See for example this Spike PR where I had to do quite a bit of work to fix a (generated, then committed) file that had diverged from the original source/generator.

Incidentally, LLVM uses the TableGen DSL to help structure almost all information about processors.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

scottj97 commented 2 years ago

I understand that Arm uses Sail extensively. Surely they have many different configurations of processors they must support. What have they done to solve this problem?

jrtc27 commented 2 years ago

For years I have been disappointed with the violations of the DRY principle throughout this code (e.g. CSR addresses). It makes maintenance unnecessarily difficult and error-prone.

I don't see any issues filed by you suggesting improvements. I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code). If you wanted to you could introduce an intermediate representation for them, but overall it'd be at least as much code, everything currently there is necessary, and additional indirection to make it look nicer would add extra code to map the encoding to the more abstract representation.

If the Sail language does not and will not enable improvement in this area, then I support generation of the Sail code from a DRYer specification. The riscv-config changes (#128) are a step in this direction.

Improvement for what? Are there concrete language features you can think of?

But writing the initial Sail code is not the hard part. Maintaining it over (we hope) decades of industrial use is the far greater concern. If you commit the generated code to the Git repo and discard the generator, you gain the benefits only for the (easier) initial writing of the code, and lose all the benefits for the (more difficult and more expensive) maintenance of that code.

It depends how exactly you're using a generator.

(From #132:)

I think the two main reasons why we use small scripts to generate Sail is not so much to save keystrokes, b/c we don't have that much code anyway. I think it's: to avoid errors. to feed other targets of our ISA ideas, like compilers, test suites, documentation etc that we build in parallel while developing a RISCV extension.

I agree -- but then what happens when those specs change? If the code generation is not part of this repo, you lose all those benefits after the initial coding. The code in this repo, which will be hand-edited from thereon, is no longer DRY, and therefore is unnecessarily error-prone. The primary purpose of DRY is to reduce the risk and cost of ongoing maintenance.

Even if the generator is maintained elsewhere, it tends to diverge. See for example this Spike PR where I had to do quite a bit of work to fix a (generated, then committed) file that had diverged from the original source/generator.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

Incidentally, LLVM uses the TableGen DSL to help structure almost all information about processors.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

Indeed. It's a related but different problem, though, that it solves, since it isn't trying to specify the exact architectural semantics of every instruction, and ultimately is all about generating tables, not an executable formal model written in an imperative language. (Yes, those tables can emit bytecode that gets interpreted for various purposes, but that's still a long way away.)

martinberger commented 2 years ago

makes maintenance unnecessarily difficult and error-prone.

I agree, but there is a cost of abstraction and code-generating programs have their own learning and maintainance cost ... So, as usual, it's about balancing trade-offs.

Yes, and the source files are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

A core difference is that tablegen is an official LLVM metaprogramming tool. There is currently no equivalent in the Sail ecosystem. (Another difference is that RISCV is much smaller than LLVM.) I think we should be conservative here. That said, I do generated some TableGen code using a Python script ... I doubt LLVM wants that script as part of their code base.

martinberger commented 2 years ago

not an executable formal model written in an imperative language.

I'd say the semantics of instructions is an alternative, and executable, (part of a) model of the ISA. This model might not be complete, since one does not need all of a processor's instructions to implement a viable compiler.

allenjbaum commented 2 years ago

Specifically for CSR behavior: there is no precise architectural definition of what most CSRs do when written. Almost all have WARL fields where the definition of what is legal is implementation specific, and the mapping of illegal->legal is also implementation specific.

That is the subject of pull#128 that: integrates riscv-config to sail-riscv . It provides config2sail.py script to generate SAIL code from ISA YAML configuration. There is quite a bit of discussion of the pull, and to some extent of the concept of a program that generates that Sail code. We are not using a generator because we are trying to save repetition. There is no repetition that we're saving, every CSR is unique. It is simply taking a formal static description (which is how I'd describe the YAML) and turning it into an executable description Arguably, that's pretty much the same thing that the sail compiler does: take a "static" description and turns it into an executable.

I'd like to see some description of what would be acceptable to perform this function, and some real examples of using it by the responders to this PR (that would br Peter, Scott, and perhaps Jessica) Also: the YAML file can't be in the repo because it is unique per model (including one for Spike). There isn't a Sail YAML because Sail uses whatever the device being tested uses. The python generator could certainly be in (some) repo, however; is there any reason that it shouldn't be in the Sail repo?

For the purposes of ACTs, we intend to ship Sail and the ACT test suite inside a docker image. When the image is run, it reads the YAML from a user directory, uses it to generate a CSIM model, then runs it for every applicable ACT in the test suite, producing reference signatures and a coverage report in the users directory. That is later compared with running the tests on the user's device, and then comparing them to yield a final tests report.

CSR definitions for RISC-V are rather complex, primarily because of the open-ended definition of WARL. E.g.

The python script in the pull request reads a YAML file that describes the behavior and generates the Sail code to handle the read/write access function on a per CSR basis. If this is a bit opaque - would including the YAML description as a comment be enough to overcome that objection? Would defining variables with meaningful names (CSRname_]fieldname[subfield_bits]) and using those in the Sail code instead of an opaque number be good enough?

There is a recent spec proposal that would allow the value of a dependent field to be unspecified when a dependency field changes (if it isn't a fixed value, typically 0) Architectural tests can take advantage of that by simply not testing and non-fixed value when the dependcy field changes For testing CSRs with non-conditional WARL CSR fields, we may only need to check that the value written is legal. That may be difficult, but the riscv-config schema allows us to describe some legal values, and some mapping behaviors, if we want to check for that. but it's non-trivial. (the easiest to check is if the write is ignored, which is also usually the easiest to implement ) The other strange case is where a a field can hold legal addresses, and some, but not all, illegal addresses, but the illegal address mapping can be strange (e.g. Spike implements flipping the most significant address bit and sign extending or zeroing the rest from the most signicant original bit)

On Sat, Nov 27, 2021 at 4:01 PM Jessica Clarke @.***> wrote:

For years I have been disappointed with the violations of the DRY principle https://en.wikipedia.org/wiki/Don%27t_repeat_yourself throughout this code (e.g. CSR addresses). It makes maintenance unnecessarily difficult and error-prone.

I don't see any issues filed by you suggesting improvements. I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code). If you wanted to you could introduce an intermediate representation for them, but overall it'd be at least as much code, everything currently there is necessary, and additional indirection to make it look nicer would add extra code to map the encoding to the more abstract representation.

If the Sail language does not and will not enable improvement in this area, then I support generation of the Sail code from a DRYer specification. The riscv-config changes (#128 https://github.com/riscv/sail-riscv/pull/128) are a step in this direction.

Improvement for what? Are there concrete language features you can think of?

But writing the initial Sail code is not the hard part. Maintaining it over (we hope) decades of industrial use is the far greater concern. If you commit the generated code to the Git repo and discard the generator, you gain the benefits only for the (easier) initial writing of the code, and lose all the benefits for the (more difficult and more expensive) maintenance of that code.

It depends how exactly you're using a generator.

(From #132 https://github.com/riscv/sail-riscv/pull/132:)

I think the two main reasons why we use small scripts to generate Sail is not so much to save keystrokes, b/c we don't have that much code anyway. I think it's: to avoid errors. to feed other targets of our ISA ideas, like compilers, test suites, documentation etc that we build in parallel while developing a RISCV extension.

I agree -- but then what happens when those specs change? If the code generation is not part of this repo, you lose all those benefits after the initial coding. The code in this repo, which will be hand-edited from thereon, is no longer DRY, and therefore is unnecessarily error-prone. The primary purpose of DRY is to reduce the risk and cost of ongoing maintenance.

Even if the generator is maintained elsewhere, it tends to diverge. See for example this Spike PR https://github.com/riscv-software-src/riscv-isa-sim/pull/848 where I had to do quite a bit of work to fix a (generated, then committed) file that had diverged from the original source/generator.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

Incidentally, LLVM uses the TableGen DSL to help structure almost all information about processors.

Yes, and the source files https://github.com/llvm/llvm-project/blob/671f0930fe29913011df4eacc89ca19a78e8dbb0/llvm/lib/Target/X86/X86.td are committed to Git and the generator runs as part of the LLVM build. They do not commit the generated files and then throw out the TableGen source code!

Indeed. It's a related but different problem, though, that it solves, since it isn't trying to specify the exact architectural semantics of every instruction, and ultimately is all about generating tables, not an executable formal model written in an imperative language. (Yes, those tables can emit bytecode that gets interpreted for various purposes, but that's still a long way away.)

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-980809146, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJSBLIKBXSF3KN4Y6WDUOFWORANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

PeterSewell commented 2 years ago

In the other thread, Martin wrote:

"No, we can generate the code in anyway we like.

This repo being the official specification is more a social convention than anything else, for focussing our attention and giving the wider community interested in RISCV a single point of reference. As a programming language, Sail has (deliberately) only weak abstraction mechanisms (e.g. no higher-order functions, no user-defined parametric polymorphism), which means certain repetitive patterns of Sail code cannot be programatically expressed in Sail itself. In such cases it makes a great deal of sense to autogenerate Sail using other means. I wholeheartedly recommend it."

One loses a couple of things by relying on autogenerated Sail code:

Peter

On Sat, 27 Nov 2021 at 19:41, Martin Berger @.***> wrote:

Some Sail code isn't hand-written, but generated by other programs. We's started an interesting debate about how to deal with this in the comments on the P Extension PR #132 https://github.com/riscv/sail-riscv/pull/132

I think it should be a separate thread. Please continue the discussion here.

— You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZRUYTKRSHJBONOVPJTUOEXWVANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

jrtc27 commented 2 years ago

Specifically for CSR behavior: there is no precise architectural definition of what most CSRs do when written. Almost all have WARL fields where the definition of what is legal is implementation specific, and the mapping of illegal->legal is also implementation specific.

Not really. A lot of the implementation specificity comes from "which features do you implement, so it's not "I choose to only make these CSR values legal", it's "I implement these extensions with these parameters as defined by the ISA and thus the legal values for this CSR are known". The latter should be written in Sail as part of the architectural model, keyed on the higher-level configurability about what features you have, with only the former being arbitrarily defined. Otherwise you end up with the model not expressing how the architecture actually is defined by the prose "spec", and provide far more flexibility than is legal.

martinberger commented 2 years ago

@PeterSewell I agree that there are many problems with allowing to add generators to the model, and I dread having to look at 17 different generator languages .... I think we can agree that the conservative position, which we should stick to by default, is this:

  1. The model directory must be free from generators, and contains only Sail. The Sail in model should contain the full RISCV semantics. All PRs for model are reviewed w.r.t how they affect model only. It must be possible to understand the full RISCV spec from just reading the code in model, and be able to compile from model to RTL and get a compliant processor.

We could simply stick with (1) and refuse to accept generators. I think a compromise might be possible, which is to keep (1), and to add (for now experimentally) a generated_model directory (similar to prover_snapshots) which contains generators (I'm skimming over a lot of details). For prover_snapshots we impose the additional invariant:

  1. When we run the generators in prover_snapshots, the generated Sail code must be bit-for-bit identical with the Sail in (the relevant git commit in) model. The responsibility for keeping the generators up-to-date is not with the golden-model maintainers.

A PR for prover_snapshots is only accepted if it maintains those invariants. PRs for model are accepted independent from what's going on in prover_snapshots.

If we insist on (1, 2) together, formal verification can continue to work as it does today, with the code in model. I imagine that the core problem with the above is that the maintainers of the generators will run out of energy in keeping up with the code in model.

PeterSewell commented 2 years ago

On Sun, 28 Nov 2021 at 17:20, Martin Berger @.***> wrote:

@PeterSewell https://github.com/PeterSewell I agree that there are many problems with allowing to add generators to the model. I think we can agree that the conservative position, which we should stick to by default, is this:

  1. The ./model directory must be free from generators, contains only Sail. The Sail in model should contain the full RISCV semantics.

AIUI, the basic point is that the proposed mechanism for configuration doesn't satisfy the latter. Perhaps it would be worth looking at a sample of just one or two CSRs (or other configuration options) and consider how they'd look in generator-style and how they'd look in explicitly-paramerised-sail style. I don't have either in my head, though I think the latter is how Robert Norton-Wright set up the earlier configuration machinery.

Peter

  1. All PRs are reviewed w.r.t how they affect model. It must be possible to understand the full RISCV spec from just reading the code in model.

I think a compromise might be possible, which is to keep (1), and to add (for now experimentally) a generated_model directory (similar to prover_snapshots) which contains generators (I'm skimming over a lot of details). For prover_snapshots we impose the additional invariant:

  1. When we run the generators in prover_snapshots, the generated Sail code must be bit-for-bit identical with the Sail in (the relevant git commit in) model. The responsibility for keeping the generators up-to-date is not with the golden-model maintainers.

A PR for prover_snapshots is only accepted if it maintains those invariants. PRs for model are accepted independent from what's going on in prover_snapshots.

If we insist on (1, 2) together, formal verification can continue to work as it does today, with the code in model. I imagine that the core problem with the above is that the maintainers of the generators will run out of energy in keeping up with the code in model.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981120689, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZQCHMNKEQX2X2K2NTDUOJQADANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

@PeterSewell @allenjbaum CSR and WARL are an interesting edge case.

CSRs and WARL are implementation defined, so it's not 100% clear this should be in the ISA model. One could arguye that implementation-defined stuff has no place in the golden model. Instead, I think it should be derived (mechanically) from the ISA model. We could simply say that it is reasonable to assume that the test writers keep up with the ISA golden model, and how they do it is their problem.

However, currently, this makes it difficult for them, because the widely used scripting languages like Python, don't have canonical importers for Sail, so it's not easy to ingest the golden model and transform it to a required shape. In contrast, it's much easier simply to output Sail from some small parameter set.

An additional difficulty is that the current ISA spec requires WARL to be deterministic [1] which is not a property of single runs, although it can be defined logically by quantification over multiple runs. (I've raised a question about this in the Sail repo [2])

  1. RISC-V ISA Specification, Volume 2, Privileged. Section 2.3 CSR Field Specifications. https://riscv.org/technical/specifications

  2. https://github.com/rems-project/sail/issues/161

PeterSewell commented 2 years ago

On Mon, 29 Nov 2021 at 14:40, Martin Berger @.***> wrote:

@PeterSewell https://github.com/PeterSewell @allenjbaum https://github.com/allenjbaum CSR and WARL are an interesting edge case.

CSRs and WARL are implementation defined, so it's not 100% clear this should be in the ISA model. One could arguye that implementation-defined stuff has no place in the golden model. Instead, I think it should be derived (mechanically) from the ISA model.

neither - the ISA model should be paramaterised on those impl-defined choices

We could simply say that it is reasonable to assume that the test writers keep up with the ISA golden model, and how they do it is their problem.

However, currently, this makes it difficult for them, because the widely used scripting languages like Python, don't have canonical importers for Sail, so it's not easy to ingest the golden model and transform it to a required shape. In contrast, it's much easier simply to output Sail from some small parameter set.

An additional difficulty is that the current ISA spec requires WARL to be deterministic [1] which is not a property of single runs, although it can be defined logically by quantification over multiple runs. (I've raised a question about this in the Sail repo [2])

1.

RISC-V ISA Specification, Volume 2, Privileged. Section 2.3 CSR Field Specifications. https://riscv.org/technical/specifications 2.

rems-project/sail#161 https://github.com/rems-project/sail/issues/161

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981696811, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZW5YTMDD4PWWWT5HKDUOOGHDANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

the ISA model should be paramaterised on those impl-defined choices

That should be doable, for example by linking in the choices in the C model, or by synthesising Sail functions that encode the choices and then compiling to C.

PeterSewell commented 2 years ago

On Mon, 29 Nov 2021 at 14:55, Martin Berger @.***> wrote:

the ISA model should be paramaterised on those impl-defined choices

That should be doable, for example by linking in the choices in the C model, or by synthesising Sail functions that encode the choices and then compiling to C.

w.r.t. Robert Norton-Wright's earlier setup, he writes:

"In my version the riscv-config processed yaml was converted to a value of a sail datatype that was then included in and used by the model to make the implementation defined behaviours concrete at compile time. "

So you had to rebuild the C or OCaml emulator with that to run it, but you could generate a single parameterised prover definition that you could instantiate to a specific set of implementation choices, but you didn't have to - you could work with the generic definition"

OTOH, w.r.t. the proposed definition, Jess Clarke writes:

"The proposed PR generates all of the CSR code (define the variables and implement isCSRDefined/readCSR/write/CSR), outside of CSRs that are too complicated for their generator. This includes architecturally-mandated behaviour, including things like the legalisation of satp when you try to enable an unsupported translation mode (which is that the whole write takes no effect), or the interaction between PMP registers (locking and TOR). For xEPC they also have a two stage process where they use the generated code but then have a second match to special case the result and further legalise it."

That sounds rather opaque, and not what you'd want to include in the privileged spec document.

Peter

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981710144, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZSFOS6M7S3D4DAJGV3UOOH6JANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

rmn30 commented 2 years ago

So you had to rebuild the C or OCaml emulator with that to run it, but you could generate a single parameterised prover definition that you could instantiate to a specific set of implementation choices, but you didn't have to - you could work with the generic definition.

Just to clarify the 'generic definition' was actually just a default set of choices that lived in the repo. Although sail does have some support for 'undef' I didn't attempt to use it to describe implementation defined behaviour.

PeterSewell commented 2 years ago

On Mon, 29 Nov 2021 at 16:31, Robert Norton @.***> wrote:

So you had to rebuild the C or OCaml emulator with that to run it, but you could generate a single parameterised prover definition that you could instantiate to a specific set of implementation choices, but you didn't have to - you could work with the generic definition.

Just to clarify the 'generic definition' was actually just a default set of choices that lived in the repo.

I should have been clearer - the "generic definition" I meant is a theorem-prover function from the choices to the behaviour - really a generic definition, which (even if you didn't define back then), given your approach, one could define easily enough. That's quite different from that default set of choices.

Although sail does have some support for 'undef' I didn't attempt to use it to describe implementation defined behaviour.

Indeed - that wasn't what I was suggesting, and it wouldn't be good :-)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981802490, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZUHZGFXQP3ZQRLDIK3UOOTFPANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

allenjbaum commented 2 years ago

I completely disagree. The allowed architectural choices likely outnumber the number of atoms that we believe exist in the universe. I wish that were not an exaggeration. I would also disagree that " a lot of implementation specificity comes from which features do you implement" That specificity is in the minority.

Simple example: a CSR is listed as WARL. That means that any particular bit could be RdOnly0, RdOnly1, or RW. That's 3^64 possible combinations - and there are many, many more, just for one CSR. Granted, only a tiny fraction are likely to be useful, but the architecture does not attempt to define what it thinks is useful. If we restrict the Sail implementation to those, then we are architecturally (re)defining all the legal values, a major spec change.

The examples that you mention - selecting which extensions exist, and optional features exist, are simple binary choices. Those aren't the problem, and I expect those can be parameterized as you mention. But the real problem is the WARL nature of fields and CSRs - which are pervasive, and is far more complicated.

We have a YAML syntax for the likely ways we expect designers will express that optionality as designer choices. Many of those choices aren't binary, e.g. a mask of which bits are RW, a lists of value ranges that are legal for designer defined subsets of bits (So the number of ranges, range limits, the start/end positions of the subfield are all user defined, and which other state bits those choices are dependent on.) That's a useful fraction of what we believe designers are likely to specify, and we had to special case some that are already in use. That doesn't even include specifying what the mapping is from an illegal value to one of the legal values that the above specifies.

If the code that is generated is Sail code - why do you need to know another language (besides YAML)? Do you expect designers to actually write the Sail code for every CSR ?

I don't expect designers to know Sail at all; I expect they will take the YAML that describes the architectural features and options they are implementing (including the CSR behaviors), the test suite, push a button, and it will magically spit out a reference signature. Those don't need to know , nor are they likely to want to know, that there is a Sail model under that hood. Certainly the extension authors will need to know Sail, but not the consumers - yet it is those designers who are authoring that YAML.

So, how would you propose that we use that YAML information to coerce Sail to perform this correctly? And, given that those definitions are not architecturally defined or restricted in any useful way, but user specified - how do you expect to prove anything about them anyway?

On Sun, Nov 28, 2021 at 8:22 AM Jessica Clarke @.***> wrote:

Specifically for CSR behavior: there is no precise architectural definition of what most CSRs do when written. Almost all have WARL fields where the definition of what is legal is implementation specific, and the mapping of illegal->legal is also implementation specific.

Not really. A lot of the implementation specificity comes from "which features do you implement, so it's not "I chose to only make these CSR values legal", it's "I implement these extensions with these parameters as defined by the ISA and thus the legal values for this CSR are known". The latter should be written in Sail as part of the architectural model, keyed on the higher-level configurability about what features you have, with only the former being arbitrarily defined. Otherwise you end up with the model not expressing how the architecture actually is defined by the prose "spec", and provide far more flexibility than is legal.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981112670, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJTJS46LDTDQIQWXPY3UOJJMZANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

scottj97 commented 2 years ago

I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code).

Which requires each CSR's address to appear 4 times in three separate files. That's not DRY!

Look at pmpaddr: https://github.com/riscv/sail-riscv/blob/085cfcd3fc3be92aaa86b4caabf9d248eae5f83e/model/riscv_insts_zicsr.sail#L201-L216

Could this repetition be reduced with an array and a loop somehow?

Today if an implementation wants to change how many PMP registers are implemented from 16 to 8, there are 4 places in this file and a total of 20 lines that need to be edited. That should be one integer parameter somewhere.

Improvement for what? Are there concrete language features you can think of?

Reducing repetition and necessary copypasta. I'm no language designer and I don't know Sail well enough to make intelligent recommendations, but the above examples illustrate the issues.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

I'm not convinced either.

jrtc27 commented 2 years ago

I also don't see what the problem with CSRs is, you add one line to give its name, one line to implement its read behaviour, one line to implement its write behaviour and one line to say it exists (which you could get rid of at the expense of more code elsewhere, so whilst it's not the cleanest conceptually it's probably the one that has the least code).

Which requires each CSR's address to appear 4 times in three separate files. That's not DRY!

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different. DRY is not the be all and end all, sometimes a bit of repetition is better than taking it to the extreme. Whether or not that applies here is a different matter.

Look at pmpaddr:

https://github.com/riscv/sail-riscv/blob/085cfcd3fc3be92aaa86b4caabf9d248eae5f83e/model/riscv_insts_zicsr.sail#L201-L216

Could this repetition be reduced with an array and a loop somehow?

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

Today if an implementation wants to change how many PMP registers are implemented from 16 to 8, there are 4 places in this file and a total of 20 lines that need to be edited. That should be one integer parameter somewhere.

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

Improvement for what? Are there concrete language features you can think of?

Reducing repetition and necessary copypasta. I'm no language designer and I don't know Sail well enough to make intelligent recommendations, but the above examples illustrate the issues.

I guess arrays of registers could help; currently you'd have to emulate that by packing the entire array into a single register, which gets awkward for formal reasoning.

I agree that, if generated code where the generator and input continue to exist is present in the model, the generator and input must be considered the preferred source form and the Sail not be hand-edited, if committed at all (depends on how clunky the generator is to run during a build, and what its dependencies are). I am not convinced that is the right direction for the model though.

I'm not convinced either.

scottj97 commented 2 years ago

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different.

As opposed to defining each CSR's complete behavior in one place instead of 4, as #128 tries to do.

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

That's great to hear! See, I thought this ugly code was due to limitations of the language, and that's why a generator would be helpful.

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

I don't think that works, because the PMP spec says if the registers are not implemented then they become read-only 0. At the very least wouldn't you have to change the write? And pmpcfg also changes with this number. But this is a nitpicky distraction. The question is: can we implement this entirely in Sail having a single integer parameter to specify the number of PMP registers? If so, great! No benefit to having this code generated.

jrtc27 commented 2 years ago

As opposed to what? Defining a named constant that gets repeated 4 times? Not really any different.

As opposed to defining each CSR's complete behavior in one place instead of 4, as #128 tries to do.

That's just an example of bad code, not a reason to tear up the entire model's CSR handling. But yes, the PMP implementation (don't look at riscv_pmp_control, there's more like that) should be improved, but you don't need a generator for that.

That's great to hear! See, I thought this ugly code was due to limitations of the language, and that's why a generator would be helpful.

FWIW the other instance of ugly code I'm aware of is the page table walker, which has copy/pasted implementations for Sv39 and Sv48 (and Sv32, though that's at least more different than just "number of levels"). The language does limit how you can write that somewhat (in C++ you'd just shove templates left, right and centre) but there are still ways to unify them (scattered unions and functions should provide a path to making it work, but failing that there's always the preprocessor including a Sail template multiple times with different $defines).

You can leave the read/write/name implementations there, you just need to change ext_is_CSR_defined.

I don't think that works, because the PMP spec says if the registers are not implemented then they become read-only 0. At the very least wouldn't you have to change the write? And pmpcfg also changes with this number.

If you change is_CSR_defined then read and write will give an illegal instruction exception before attempting to do anything else. But, true, for PMP registers it gets more complicated, so that particular approach doesn't quite work, because it's not that the CSR doesn't exist, it's that it exists with a hard-wired value (architecturally; whether or not a 0-bit register really exists or not is a separate philosophical question).

But this is a nitpicky distraction. The question is: can we implement this entirely in Sail having a single integer parameter to specify the number of PMP registers? If so, great! No benefit to having this code generated.

Almost. You need to manually define the 16 register variables, and read and write functions, just like we do for X registers (which is really just emulating arrays of registers, hence why I could see that being a useful language feature), and because I'm a lazy programmer I'd probably write a single shell for loop to spit out that code, that or use a vim macro. But everything else can be suitably abstracted such that you set N somewhere between 0 and 16 and it just works (with some of the defined registers just not being used if N < 16); that could be static or dynamic, both would work.

allenjbaum commented 2 years ago

The real problem here is that "1... It must be possible to understand the full RISCV spec from just reading the code in model." can't be satisfied by looking at generator Sail, because that would be just a single implementation of a CSR that has a nearly infinite number of legal options.

And because...nearly infinite options.... it's hard to come up with a parameterization that isn't really a full blown DSL.

On Sun, Nov 28, 2021 at 9:49 AM Peter Sewell @.***> wrote:

On Sun, 28 Nov 2021 at 17:20, Martin Berger @.***> wrote:

@PeterSewell https://github.com/PeterSewell I agree that there are many problems with allowing to add generators to the model. I think we can agree that the conservative position, which we should stick to by default, is this:

  1. The ./model directory must be free from generators, contains only Sail. The Sail in model should contain the full RISCV semantics.

AIUI, the basic point is that the proposed mechanism for configuration doesn't satisfy the latter. Perhaps it would be worth looking at a sample of just one or two CSRs (or other configuration options) and consider how they'd look in generator-style and how they'd look in explicitly-paramerised-sail style. I don't have either in my head, though I think the latter is how Robert Norton-Wright set up the earlier configuration machinery.

Peter

  1. All PRs are reviewed w.r.t how they affect model. It must be possible to understand the full RISCV spec from just reading the code in model.

I think a compromise might be possible, which is to keep (1), and to add (for now experimentally) a generated_model directory (similar to prover_snapshots) which contains generators (I'm skimming over a lot of details). For prover_snapshots we impose the additional invariant:

  1. When we run the generators in prover_snapshots, the generated Sail code must be bit-for-bit identical with the Sail in (the relevant git commit in) model. The responsibility for keeping the generators up-to-date is not with the golden-model maintainers.

A PR for prover_snapshots is only accepted if it maintains those invariants. PRs for model are accepted independent from what's going on in prover_snapshots.

If we insist on (1, 2) together, formal verification can continue to work as it does today, with the code in model. I imagine that the core problem with the above is that the maintainers of the generators will run out of energy in keeping up with the code in model.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981120689, or unsubscribe < https://github.com/notifications/unsubscribe-auth/ABFMZZQCHMNKEQX2X2K2NTDUOJQADANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS < https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android < https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981124805, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJRBODKC6M3F5OIXQYTUOJTRJANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

PeterSewell commented 2 years ago

Hi Allen,

The number of options is irrelevant - one can parameterise on many just as well as on one.

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture spec. If programmers want to know what they can depend on from the architecture in general, or from some family of implementations that instantiate (say) option 57 to value "read-only", then telling them to think about all the Sail that some opaque generator would produce when fed any YAML in which 57=read-only is not terribly useful. Better to give them one Sail definition, parameterised on those values, that at one point says "if option57=read-only then ....".

Peter

On Mon, 29 Nov 2021 at 23:55, Allen Baum @.***> wrote:

The real problem here is that "1... It must be possible to understand the full RISCV spec from just reading the code in model." can't be satisfied by looking at generator Sail, because that would be just a single implementation of a CSR that has a nearly infinite number of legal options.

And because...nearly infinite options.... it's hard to come up with a parameterization that isn't really a full blown DSL.

On Sun, Nov 28, 2021 at 9:49 AM Peter Sewell @.***> wrote:

On Sun, 28 Nov 2021 at 17:20, Martin Berger @.***> wrote:

@PeterSewell https://github.com/PeterSewell I agree that there are many problems with allowing to add generators to the model. I think we can agree that the conservative position, which we should stick to by default, is this:

  1. The ./model directory must be free from generators, contains only Sail. The Sail in model should contain the full RISCV semantics.

AIUI, the basic point is that the proposed mechanism for configuration doesn't satisfy the latter. Perhaps it would be worth looking at a sample of just one or two CSRs (or other configuration options) and consider how they'd look in generator-style and how they'd look in explicitly-paramerised-sail style. I don't have either in my head, though I think the latter is how Robert Norton-Wright set up the earlier configuration machinery.

Peter

  1. All PRs are reviewed w.r.t how they affect model. It must be possible to understand the full RISCV spec from just reading the code in model.

I think a compromise might be possible, which is to keep (1), and to add (for now experimentally) a generated_model directory (similar to prover_snapshots) which contains generators (I'm skimming over a lot of details). For prover_snapshots we impose the additional invariant:

  1. When we run the generators in prover_snapshots, the generated Sail code must be bit-for-bit identical with the Sail in (the relevant git commit in) model. The responsibility for keeping the generators up-to-date is not with the golden-model maintainers.

A PR for prover_snapshots is only accepted if it maintains those invariants. PRs for model are accepted independent from what's going on in prover_snapshots.

If we insist on (1, 2) together, formal verification can continue to work as it does today, with the code in model. I imagine that the core problem with the above is that the maintainers of the generators will run out of energy in keeping up with the code in model.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub <https://github.com/riscv/sail-riscv/issues/133#issuecomment-981120689 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/ABFMZZQCHMNKEQX2X2K2NTDUOJQADANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS <

https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android <

https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub

.

— You are receiving this because you commented. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-981124805, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AHPXVJRBODKC6M3F5OIXQYTUOJTRJANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS < https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android < https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-982139586, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZWSZRLXKPVQVSA7XDTUOQHFTANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

"if option57=read-only then ....".

Which can be auto-generated, too.

We have a lot of new extensions coming to be ratified. And we want to improve the testing process. What's the best way forward? Maybe we can try and bullet-point the advantages and disadvantages of the proposed approaches, so they are in one easily accessible place?

PeterSewell commented 2 years ago

On Tue, 30 Nov 2021 at 11:56, Martin Berger @.***> wrote:

"if option57=read-only then ....".

Which can be auto-generated, too.

Sure, one can - but something has to be the actual source; the question is what, and what affordances it offers.

We have a lot of new extensions coming to be ratified. And we want to improve the testing process. What's the best way forward? Maybe we can try and bullet-point the advantages and disadvantages of the proposed approaches, so they are in one easily accessible place?

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-982567136, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZXZFZPVRMZDEKHBTELUOS3YBANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

Sure, one can - but something has to be the actual source; the question is what, and what affordances it offers.

I think the actual source is not that important, as long as it generates the correct output. So I think we should be clear what the requirements are. The uncontroversial requirements are:

Things become more tricky with the additional requirements. Like what other backends are we targetting (tests, Spike/ QEMU glue etc)

PeterSewell commented 2 years ago

On Tue, 30 Nov 2021 at 12:43, Martin Berger @.***> wrote:

Sure, one can - but something has to be the actual source; the question is what, and what affordances it offers.

I think the actual source is not that important, as long as it generates the correct output. So I think we should be clear what the requirements are. The uncontroversial requirements are:

  • Generates full Sail that can remain the (human readable) source

Sorry to repeat myself (maybe I'm not clear or this needs a chat, or maybe I'm misunderstanding), but you seem to be glossing over the big difference in practice between generating full Sail separately for each distinct collection of the many implementation choices, and generating a single parameterised definition. If all you want to do is testing specific implementations, it doesn't make too much difference. But if you want to use it as documentation, or do any reasoning or test generation, it does.

  • Full automation, including scriptability from GH actions

Things become more tricky with the additional requirements. Like what other backends are we targetting (tests, Spike/ QEMU glue etc)

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-982600475, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZRH6EOI3HP4AQ3GDKLUOTBDJANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

generating a single parameterised definition.

I mean a single parameterised definition. What I'm not 100% clear about is, what shape this could have.

allenjbaum commented 2 years ago

I think that any parameter that can be described by a Boolean or two (e.g. enable/disabled, or dddRW/Rdonly0/ Rdonly1) can, and should be supported by parameterized Sail code. That will be coming from the YAML description, and how that gets incorporated into Sail needs to be agreed on (e.g. does Sail read and decode YAML, or is there a tool that parses the YAML and produces commandline parameters?)

There are now two possible points of divergence

The common thread here is probably coding style, The coding style document has been started, but probably needs a lot more fleshing out and examples. If there are rules - or examples - in that document that could be pointed to as examples of what to do (or probably what not to do), it will make supporting Sail much more effective.

A meta question: how much to we want do delay deployment of Sail code to meet the coding guidlelines, vs. fixing them later?

On Tue, Nov 30, 2021 at 4:57 AM Martin Berger @.***> wrote:

generating a single parameterised definition.

I mean a single parameterised definition. What I'm not 100% clear about is, what shape this could have.

— You are receiving this because you were mentioned. Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-982611330, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJTOP667BEMKHSJUXN3UOTC3RANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

martinberger commented 2 years ago

A meta question: how much to we want do delay deployment of Sail code to meet the coding guidlelines, vs. fixing them later?

This is an great question. From a technical POV, we can phase in changes gradually, requiring that new code is (at least in principle) equivalent to old code. I would be interested to hear from industry about their preferences; what's more important:

allenjbaum commented 2 years ago

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML. The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field. We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that. We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function. The YAML descriptions are reasonably terse and understandable. A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller. The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly. As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that. I don't know how we prove something about the architecture if relevant bits of it are user-definable. But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

PeterSewell commented 2 years ago

On Wed, 22 Dec 2021 at 00:49, Allen Baum @.***> wrote:

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML. The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field. We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that. We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function. The YAML descriptions are reasonably terse and understandable. A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller.

I don't think anyone is suggesting writing a YAML parser in Sail.

The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly. As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that. I don't know how we prove something about the architecture if relevant bits of it are user-definable.

I would hope that the vast majority of these implementation choices are each taken from a small set of options, that can be described by Sail types (enums, in simple cases). In cases where one really needs arbitrary code, the next step up is to allow the implementation to specify a pure Sail function (one that does not itself read or write registers), eg from the old register value and update value to the new value, and have the architecture parameterised on those functions.

But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999195073, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZT6QXYOBR5RGJ45YEDUSEN5NANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

allenjbaum commented 2 years ago

At almost every level I think we agree; I completely agree that simple options, like enum types, etc should be parameterized in Sail. More complicated WARL fields should be defined with custom Sail functions. Where we disagree is who and how those functions are written. We long ago rejected alllowing vendors to write arbitrary Sail code that is effectively #included into the model, and we rejected making them write Sail code at all. They shouldn't have to. They do have to describe their model with YAML, and we have syntax to describe those WARL fields along with everything else; It's a one stop shop, and there's even a GUI app to help fill in the YAML

It's OK for our extension architects to write Sail to implement functionality that they're defining (or to have a development partner do it), but allowing/forcing vendors to write Sail is turning our golden model into a fool's-golden model, and I cannot support that. IF RVI supplies that Sail code - that's OK, we own it, and if that code is generated from the YAML using our script - I think that's transitively OK

On Tue, Dec 21, 2021 at 10:39 PM Peter Sewell @.***> wrote:

On Wed, 22 Dec 2021 at 00:49, Allen Baum @.***> wrote:

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML. The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field. We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that. We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function. The YAML descriptions are reasonably terse and understandable. A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller.

I don't think anyone is suggesting writing a YAML parser in Sail.

The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly. As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that. I don't know how we prove something about the architecture if relevant bits of it are user-definable.

I would hope that the vast majority of these implementation choices are each taken from a small set of options, that can be described by Sail types (enums, in simple cases). In cases where one really needs arbitrary code, the next step up is to allow the implementation to specify a pure Sail function (one that does not itself read or write registers), eg from the old register value and update value to the new value, and have the architecture parameterised on those functions.

But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999195073, or unsubscribe < https://github.com/notifications/unsubscribe-auth/ABFMZZT6QXYOBR5RGJ45YEDUSEN5NANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS < https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android < https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999325140, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJUUM6OY3IBFAQWXBZTUSFXD5ANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

PeterSewell commented 2 years ago

On Wed, 22 Dec 2021 at 07:41, Allen Baum @.***> wrote:

At almost every level I think we agree; I completely agree that simple options, like enum types, etc should be parameterized in Sail. More complicated WARL fields should be defined with custom Sail functions. Where we disagree is who and how those functions are written. We long ago rejected alllowing vendors to write arbitrary Sail code that is effectively #included into the model, and we rejected making them write Sail code at all. They shouldn't have to. They do have to describe their model with YAML, and we have syntax to describe those WARL fields along with everything else; It's a one stop shop, and there's even a GUI app to help fill in the YAML

If these are truly arbitrary functions, I don't understand how they are YAML-expressible. Could you give a couple of examples?

It's OK for our extension architects to write Sail to implement functionality that they're defining (or to have a development partner do it), but allowing/forcing vendors to write Sail is turning our golden model into a fool's-golden model, and I cannot support that. IF RVI supplies that Sail code - that's OK, we own it, and if that code is generated from the YAML using our script - I think that's transitively OK

On Tue, Dec 21, 2021 at 10:39 PM Peter Sewell @.***> wrote:

On Wed, 22 Dec 2021 at 00:49, Allen Baum @.***> wrote:

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML. The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field. We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that. We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function. The YAML descriptions are reasonably terse and understandable. A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller.

I don't think anyone is suggesting writing a YAML parser in Sail.

The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly. As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that. I don't know how we prove something about the architecture if relevant bits of it are user-definable.

I would hope that the vast majority of these implementation choices are each taken from a small set of options, that can be described by Sail types (enums, in simple cases). In cases where one really needs arbitrary code, the next step up is to allow the implementation to specify a pure Sail function (one that does not itself read or write registers), eg from the old register value and update value to the new value, and have the architecture parameterised on those functions.

But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

— Reply to this email directly, view it on GitHub <https://github.com/riscv/sail-riscv/issues/133#issuecomment-999195073 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/ABFMZZT6QXYOBR5RGJ45YEDUSEN5NANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS <

https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android <

https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub

.

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999325140, or unsubscribe < https://github.com/notifications/unsubscribe-auth/AHPXVJUUM6OY3IBFAQWXBZTUSFXD5ANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS < https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android < https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999352100, or unsubscribe https://github.com/notifications/unsubscribe-auth/ABFMZZWKYPKIUXXBAPON4LTUSF6CXANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

allenjbaum commented 2 years ago

The syntax can be found here: https://riscv-config.readthedocs.io/en/latest/yaml-specs.html#csr-template but especially here: https://riscv-config.readthedocs.io/en/latest/yaml-specs.html#warl-field-definition . Note that this syntax is far from complete; WARL can be completely arbitrary, but this represents the kinds of definitions we expect.

On Tue, Dec 21, 2021 at 11:55 PM Peter Sewell @.***> wrote:

On Wed, 22 Dec 2021 at 07:41, Allen Baum @.***> wrote:

At almost every level I think we agree; I completely agree that simple options, like enum types, etc should be parameterized in Sail. More complicated WARL fields should be defined with custom Sail functions. Where we disagree is who and how those functions are written. We long ago rejected alllowing vendors to write arbitrary Sail code that is effectively #included into the model, and we rejected making them write Sail code at all. They shouldn't have to. They do have to describe their model with YAML, and we have syntax to describe those WARL fields along with everything else; It's a one stop shop, and there's even a GUI app to help fill in the YAML

If these are truly arbitrary functions, I don't understand how they are YAML-expressible. Could you give a couple of examples?

It's OK for our extension architects to write Sail to implement functionality that they're defining (or to have a development partner do it), but allowing/forcing vendors to write Sail is turning our golden model into a fool's-golden model, and I cannot support that. IF RVI supplies that Sail code - that's OK, we own it, and if that code is generated from the YAML using our script - I think that's transitively OK

On Tue, Dec 21, 2021 at 10:39 PM Peter Sewell @.***> wrote:

On Wed, 22 Dec 2021 at 00:49, Allen Baum @.***> wrote:

There are a couple of discussion point that I wanted to comment on that I lost track of. A couple rare:

The shape of the options could be relevant, e.g. if some options have to be chosen from a space so complex that one want to describe each possible value using a more-or-less general-purpose algorithmic language. If that is the case, then, well, we have such a language to hand...

The problem isn't that Sail isn't capable; it must be if we have a generator that produce Sail code from YAML. The problem is that every implementation would need to write the Sail code for their particular WARL definition for every CSR WARL field. We can't control that. We can't even easily determine if it is only handling a CSR. So we don't allow that. We can control the syntax of what we will test for in YAML though, and construct a generator to generate whatever it takes to compile that into a specialized Sail function. The YAML descriptions are reasonably terse and understandable. A generic function that can parse that would be a rather difficult exercise in Sail, and not understandable (by mortals) because it is generic. The generic parameterized function would be huge, while the actions it produces are much smaller.

I don't think anyone is suggesting writing a YAML parser in Sail.

The specialized generated code for those could be understandable if it's written by someone that knows how to write Sail properly. As we've seen, that's a rather select few that I don't want to overload, and I'd like to reduce that load by removing the burden and having a generator do it

You seem to be focussing exclusively on the scenario of using the model to test a specific implementation, but that is far from the only purpose of the architecture ..

Well yes; but I'm nominally in charge for the ACTs, and this is what we need for that. I don't know how we prove something about the architecture if relevant bits of it are user-definable.

I would hope that the vast majority of these implementation choices are each taken from a small set of options, that can be described by Sail types (enums, in simple cases). In cases where one really needs arbitrary code, the next step up is to allow the implementation to specify a pure Sail function (one that does not itself read or write registers), eg from the old register value and update value to the new value, and have the architecture parameterised on those functions.

But if we can prove something with per-CSR-field handwritten custom Sail, we can prove the same thing if its' per-CSR field generated custom Sail. You would still need to re-prove some things for every implementation if the proof depended on that CSR R/W behavior. We do recompile the Sail model for every implementation

Specifically for PMP registers: a loop would require a bitmap of writable registers, among other feature bits (grain size, physical address size, 16 vs. 64 registers, a bitmap (possibly per register) as to which bits are read/write vs readonly0 vs readonly1, or the allowed address range, or arbitrary combinations of both, for both pmpaddr and pmpcfg. In practice, those will (usually - not always) be identical for each entry - but that isn't required by the spec. WARL is pretty ugly to deal with.

— Reply to this email directly, view it on GitHub < https://github.com/riscv/sail-riscv/issues/133#issuecomment-999195073 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/ABFMZZT6QXYOBR5RGJ45YEDUSEN5NANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS <

https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android <

https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub

.

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub <https://github.com/riscv/sail-riscv/issues/133#issuecomment-999325140 , or unsubscribe <

https://github.com/notifications/unsubscribe-auth/AHPXVJUUM6OY3IBFAQWXBZTUSFXD5ANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS <

https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android <

https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub

.

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999352100, or unsubscribe < https://github.com/notifications/unsubscribe-auth/ABFMZZWKYPKIUXXBAPON4LTUSF6CXANCNFSM5I4QHBSQ

. Triage notifications on the go with GitHub Mobile for iOS < https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675

or Android < https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub .

You are receiving this because you were mentioned.Message ID: @.***>

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999360683, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJXDQNMFRJXBMWRYQBTUSF77XANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>

martinberger commented 2 years ago

can be completely arbitrary,

@allenjbaum By "completely arbitrary", you mean Turing-complete (which means having recursion or some equivalent mechanism)? The example you link to

look very specific to me.

allenjbaum commented 2 years ago

Well, I did say that WARL is completely arbitrary, but not that we're supporting all of that. We are supporting a reasonably large subset of completely arbitrary as defined by this syntax. Generally, it is only WARL fields that have issues, but there are a LOT of WARL fields.

This will not support every legal WARL implementation, and we are saying that if someone does implement something outside this syntax, yet still legal (anything is legal....) and they fail our ACTs, then they'll have to get a waiver (which the ACT policy will allow)

In most real-life cases, I agree with Peter that it would be best to use a parameterized function and punt those weird cases to the generator, and those weird cases are usually just the WARL fields (plural) of a CSR (which are often the entire CSR).

The question then becomes: is it reasonable to write a parameterized WARL legalize function? That function has to be able to handle arbitrary (implementation defined) sized fields each with its own legalize definition

(e.g. some number of upper bits read-only with a specific value, some number of next bits within some range, some number of next bits conditionally either read-only with some specific value or RW, and the lowest bits defining the condition with its own set of legal values.)

That's a real example.

If you're telling me that a Sail function that completely implements this syntax as a parameterized function that takes.... a string or list derived from the YAML?.... is reasonable, then I guess we'll have to find someone to implement that generalized legalize_WARL function in Sail rather than having a specific function for each use case.

From the point of view of someone understanding the code - I'm not sure that a generalized function will be more understandable than a specific function. Superficially, a function call with some parameter string might be pretty obvious, until you start digging into the complex function that implements that call. This is opposed to a specialized function that has no parameters but implements the parameterized version. Then it just comes down to how readable the generated code is.

Going up a level: does it matter how readable the generated code for a WARL field is? The vendor model will implement what it does, regardless of the YAML definition. As long as the Sail matches it, then the generated code should be fine. IS this a requirement or a nice to have?

On Wed, Dec 22, 2021 at 6:18 AM Martin Berger @.***> wrote:

can be completely arbitrary,

@allenjbaum https://github.com/allenjbaum By "completely arbitrary", you mean Turing-complete (which means having recursion or some equivalent mechanism)? The example you link to

  • [dependency-vals] -> field-name[index-hi:index-lo] in [legal-values]
  • [dependency-vals] -> field-name[index-hi:index-lo] bitmask [mask, fixedval]

look very specific to me.

— Reply to this email directly, view it on GitHub https://github.com/riscv/sail-riscv/issues/133#issuecomment-999610372, or unsubscribe https://github.com/notifications/unsubscribe-auth/AHPXVJRRFQZRQFZ75ISF5XDUSHM3BANCNFSM5I4QHBSQ . Triage notifications on the go with GitHub Mobile for iOS https://apps.apple.com/app/apple-store/id1477376905?ct=notification-email&mt=8&pt=524675 or Android https://play.google.com/store/apps/details?id=com.github.android&referrer=utm_campaign%3Dnotification-email%26utm_medium%3Demail%26utm_source%3Dgithub.

You are receiving this because you were mentioned.Message ID: @.***>