Closed superenginegit closed 9 months ago
There are a few things that make it a little trickier than it might seem, as there are some mechanisms, like register allocation, that can only reason about the ABI names. It would probably be a good idea to do change that logic to reason on x0
etc as that would help us avoid the confusion wrt fp
and s0
both being the same as x8
. I'd be happy to move to xk
being the source of truth regarding identity, as long as we would still be able to pretty-print the ABI name in the IR, as that helps us reason about the logic. We would probably want to let the user specify what register should be printed in the assembly in case of ambiguity.
Thinking about it a bit more, I don't know of a good way to get roundtrippung to work wrt printing and parsing. We don't want to print the indices in the IR, as that would add a lot of noise, but without that I'm not sure how to recover the ABI name. What would you like this for? Is it for inspecting the IR when printed, or for reasoning about iR? Because there might be different solutions depending on the use-case you have in mind.
The main use-case I had in mind was actually being able to have for example x0
instead of zero
in my assembly code, as I sometimes write x0
etc. and when I go to run the code and it doesn't work it takes some time to change it.
I definitely think that changing the logic to work with the register number would be the way to go. For printing I actually prefer the ABI names almost always, maybe a command-line flag could be useful to select whether to print the name or the number, or maybe, as an optional parameter for reg.dump and mem.dump to select what format to use.
Thanks for showing interest and offering to contribute @superenginegit!
I'm not 100% sure on what's the best way quite yet, but maybe a preprocessor pass during parsing to canonicalize register names could be an option?
I would say that a preprocessor pass would be a good way to go about this. I can't really think of another way that would work as well.
I think it would be really interesting to see what happens when the register is represented with a more complex structure that more accurately identifies it than just the string that we use currently. There are a few restrictions:
j0
, j1
, ... for extra registers)Maybe something like this could work?
@dataclass(frozen=True)
class RISCVRegister:
index: int
name_hint: str = field(default="", hash=False)
class RISCVRegisterType(Data[RISCVRegister], TypeAttribute, ABC):
...
@irdl_attr_definition
class IntRegisterType(RISCVRegisterType):
name = "riscv.reg"
x0 = IntRegisterType(0)
zero = IntRegisterType(0, "zero")
assert x0 == zero
fp = IntRegisterType(8, "fp")
s0 = IntRegisterType(8, "s0")
j0 = IntRegisterType(-1, "j0")
f0 = FloatRegisterType(0)
ft0 = FloatRegisterType(0, "ft0")
If you have the time, could you try this out, and see what happens? Hopefully it would still preserve most of our logic for canonicalization and register allocation, while allowing for writing "x0" when that's more convenient.
Sure, that looks great, I'll have time to test it out in the next few days.
@superenginegit, excuse my colleague, he thought this was an issue on another project of ours. I guess a good place to put the canonicalization would be around line 30 here: https://github.com/AntonLydike/riscemu/blob/f9e92d626e2f1ea8f3ef6deb49a95c31765984eb/riscemu/parser.py#L23-L33
And I probably wouldn't bother with creating a custom class for representing register names.
Thanks, I'll get started then.
Currently it is only possible to use a register by it's ABI-name. I think it would be nice, to be able to also use a register by it's number instead of it's name.
This is something I can implement and create a PR for it. But I wanted to ask if this is something you would be interested in having in this project before putting work into it.