tum-ei-eda / seal5

Seal5 - Semi-automated LLVM Support for RISC-V Extensions including Autovectorization
https://tum-ei-eda.github.io/seal5/
Apache License 2.0

Description/configuration of intrinsics #64

Open thomasgoodfellow opened 5 months ago

thomasgoodfellow commented 5 months ago

Revisit the YAML configuration from phase 1

PhilippvK commented 5 months ago

I think we can use the existing YAML Schema for the Extensible Compiler to analyse what needs to be described:

Here is the relevant snippet:

          "intrinsics": {
            "type": "object",
            "patternProperties": {
              "^.*$": {
                "type": "object",
                "properties": {
                  "instruction": {"type": "string"},
                  "returnType": {"type": "string"},
                  "parameters": {
                    "type": "array",
                    "items": {
                      "type": "object",
                      "properties": {
                        "name": {"type": "string"},
                        "type": {"type": "string"},
                        "operand": {"type": "string"}
                      }
                    }
                  }
                },
                "additionalProperties": false
              }
            },
            "additionalProperties": false
          }
PhilippvK commented 5 months ago

The Seal5Settings (see https://github.com/tum-ei-eda/seal5/blob/84ad68cd5c1b646ba0bbd8416a82bb08d39a7921/seal5/settings.py#L517) do not have fields for storing per-instruction metadata, because this information would normally be stored in the Metamodel or in the CoreDSL syntax of the instructions.

If we want to define the intrinsics on an InstrSet/Extension level, ExtensionsSettings (see https://github.com/tum-ei-eda/seal5/blob/84ad68cd5c1b646ba0bbd8416a82bb08d39a7921/seal5/settings.py#L344) would be a suitable parent.

We could then add a transform pass which converts the YAML-based intrinsics into suitable metamodel Python classes.
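Such a transform pass might look roughly like the following sketch. The dataclasses here are hypothetical stand-ins; the real metamodel classes live in seal5/model.py and have a richer interface:

```python
from dataclasses import dataclass, field

# Hypothetical stand-ins for the real Seal5 metamodel classes (seal5/model.py).
@dataclass
class IntrinsicParam:
    name: str
    type: str
    operand: str

@dataclass
class IntrinsicDef:
    name: str
    instruction: str
    return_type: str
    params: list = field(default_factory=list)

def intrinsics_pass(yaml_intrinsics: dict) -> list:
    """Sketch of a transform pass: parsed YAML dict -> metamodel objects."""
    defs = []
    for name, desc in yaml_intrinsics.items():
        params = [IntrinsicParam(**p) for p in desc.get("parameters", [])]
        defs.append(IntrinsicDef(name, desc["instruction"],
                                 desc["returnType"], params))
    return defs
```

The pass would then hand these objects to whichever backend emits the LLVM intrinsic definitions.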

thomasgoodfellow commented 5 months ago

I was hoping that the YAML needed could be reduced a little, e.g. by allowing defaults for the parameter types from the metamodel. But that would need the knowledge that the instruction arguments are register numbers, which is only apparent in the behavior block when the register array is indexed. So although for simple cases like MAC it feels wordy to list the name and type for each parameter, it is explicit and unambiguous.

PhilippvK commented 5 months ago

@thomasgoodfellow The Seal5 metamodel does actually detect whether operands are registers or immediates and marks this using attributes:

https://github.com/tum-ei-eda/seal5/blob/84ad68cd5c1b646ba0bbd8416a82bb08d39a7921/seal5/model.py#L117-L124

Having defaults definitely makes sense. Often we only want to specify whether an intrinsic shall be generated and what its name should be. But we should also have the possibility to choose the argument names/types/order.
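A defaulting scheme could look like the sketch below. It assumes (hypothetically) that the metamodel exposes, for each instruction operand, whether it was detected as a register or an immediate; the type mapping and parameter shape are illustrative, not Seal5's actual API:

```python
# Hypothetical sketch: derive default intrinsic parameters from the
# operand kinds the metamodel already detects via operand attributes.
def default_parameters(operands, xlen=32):
    """operands: list of (name, kind) pairs, kind in {"register", "immediate"}."""
    params = []
    for name, kind in operands:
        if kind == "register":
            # Registers default to the native integer width (illustrative choice).
            params.append({"name": name, "type": f"i{xlen}", "operand": name})
        elif kind == "immediate":
            params.append({"name": name, "type": "i32", "operand": name})
        else:
            raise ValueError(f"unknown operand kind: {kind}")
    return params
```

An explicit `parameters` list in the YAML would then simply override these defaults, keeping the simple MAC-style cases terse.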

thomasgoodfellow commented 5 months ago

Thanks - I've been grepping through the code. Do you have any tips for running this under a debugger? I've been inserting pdb.set_trace() at points, but because it's running under a child process the I/O capture makes it awkward. Is there an easy way to bypass the ThreadPoolExecutor and have the passes execute within the same call stack?

PhilippvK commented 5 months ago

@thomasgoodfellow I agree that currently print()-based debugging is the way to go, and this is not acceptable in some cases.

Let me introduce some context first:

  1. We are spawning threads (not processes) to execute Seal5's individual passes for multiple models in parallel. If this is problematic, you could restrict the number of workers to one, or alternatively we could add an option to circumvent the ThreadPoolExecutor for debugging.
  2. The actual passes (not all of them) might spawn a subprocess to interface with Seal5's frontends/transforms/backends, which currently only expose a command-line interface. I guess this is what makes debugging harder.
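The debugging option mentioned in point 1 could be as small as the following sketch (hypothetical; Seal5's real pass driver is structured differently). With `parallel=False` the passes run in the main call stack, so `pdb.set_trace()` behaves normally:

```python
from concurrent.futures import ThreadPoolExecutor

def run_passes(passes, models, parallel=True):
    """Run each pass on each model, optionally bypassing the thread pool.

    Hypothetical sketch of a debug switch: serial execution keeps pdb
    breakpoints in the main thread's call stack and avoids I/O capture issues.
    """
    if not parallel:
        for model in models:
            for pass_fn in passes:
                pass_fn(model)  # same call stack -> pdb.set_trace() works
        return
    with ThreadPoolExecutor() as pool:
        futures = [pool.submit(p, m) for m in models for p in passes]
        for fut in futures:
            fut.result()  # re-raise any exception from a worker thread
```

This does not help with point 2 (passes that shell out to a subprocess); that part needs the API work discussed below.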

I already proposed a solution for "problem" 2 a while ago in #54. The main reason for the CLI-only limitation is that this is how it was meant to be used in M2-ISA-R, but this is about to change. After implementing a consistent API for all frontends/transforms/backends, we should be able to configure things like log levels, levels of parallelism, and the used API (CLI or Python), either globally or on a per-pass basis.

BTW: Executing the transforms etc. in a subprocess instead of calling them directly from Python has some advantages:

I have some ideas about how this could be implemented but have not started any coding yet. If this feature is important for you, I could look into it soon. The biggest effort will probably be migrating the existing set of transforms to the new API, but for testing we could maybe start with a single one. If it turns out to be helpful, you could eventually help me migrate the remaining transforms.