llvm / circt

Circuit IR Compilers and Tools
https://circt.org
Other
1.66k stars 292 forks source link

[Sim] Combine integer formatting ops into one op #7692

Open fabianschuiki opened 1 week ago

fabianschuiki commented 1 week ago

@fzi-hielscher I was wondering what your thoughts are on maybe combining the integer formatting operations sim.fmt.hex, sim.fmt.bin, and sim.fmt.dec into a single sim.fmt.int operation. The Moore dialect wants to lower to these ops, and there are additional formatting options (width, alignment, zero/space padding) that apply to each of these ops. If we made the integer radix a parameter on the op, we might only have to add these formatting options to a single op. WDYT?

fzi-hielscher commented 1 week ago

sim.fmt.hex and sim.fmt.bin are probably similar enough to consider them to be the same operation. Although, for LLVM (actually libc) lowering we have to handle them quite differently, as we can use %x for hex but have to do binary formatting manually (ugh). sim.fmt.dec is the odd one out because it has to deal with signedness. It is also less restrictive on the padding width. For bin and hex it is strictly numBits / log2(radix). For dec I wanted to allow i0 types to be formatted as 0 and also avoid the numerical feat of calculating ceil(log10( 2^numBits - 1 )) precisely, which we would have to do if we were to follow the SV spec to a tee (again, ugh). If we control padding via an argument instead of inferring it, it would of course solve that problem. Well... it would make it the frontend's problem. 😁

So, assuming we find a way to deal with the sign, I would not oppose merging them. I certainly wouldn't miss the litany of isa<sim::FormatBinOp, sim::FormatDecOp, sim::FormatHexOp> . On the other hand, I assume that there would be a lot of code, that has to check the radix parameter right away and branch based on that. Having the ops separate keeps the code for folders and conversion patterns a little tidier. I know custom OpInterfaces are not super popular, but maybe they would give us the best of both worlds here?

Speaking of folding: The reason I specified the formatting ops as precisely as I did was to make sure, that a constant folded to a literal during compilation ends up looking the exactly the same as if it were formatted by the simulator. But going by anecdotal evidence, simulators tend to treat the SV spec more like a recommendation and just do their own thing on occasion (for formatting specifically, but also in general. And again: ugh) . So, I've considered to just give up on that fight and get rid of the constant folders.

fabianschuiki commented 1 week ago

You're right, the folders and code that wants to switch based on the format are quite a bit easier with distinct ops. On the SV frontend side the thing just looked superficially similar, with everything accepting a padding, padding character, and alignment within the padding. But that might also be too SV specific; other frontends would have different formatting needs, and maybe wouldn't want a width or padding at all. (Well I guess you could set it to 0 then.)

You mentioned that %x in libc could do some of the lifting here. That's probably only the case for two-state integers, isn't it? As soon as we are in four-state land we'd probably have to roll our own formatting functions entirely as some kind of simulator runtime library.

fzi-hielscher commented 4 days ago

Yes, that has been my plan. The monstrosity I've created here seems to do a half decent job at formatting arbitrary (two-state) integers: https://github.com/fzi-hielscher/circt/blob/squashed-trigger-stuff/lib/Conversion/PrintsToArcEnvCalls/PrintsToArcEnvCalls.cpp

But I don't want to extend it any further if I can avoid it. It is messy enough already. And I'm afraid emitting everything inline could end up cluttering the hot parts of the text section. The challenge is then in passing more or less arbitrary IR types to predefined functions. I've got a half finished prototype of a runtime library that can do type marshaling, written in Rust (because, you know, Rust! - I might regret that decision.) Ideally that should allow us to decouple the formatting logic from the IR and just throw a couple of pointers to the runtime lib.

fabianschuiki commented 1 day ago

:crab: :heart:

Yeah this feels like it really wants to be a runtime library that is already linked into Arcilator for JIT execution, and which is available as a library to link against for users that generate models with Arcilator. Might be as simple as sim.fmt.dec -> call @arc_fmt_dec, or looking up a formatting callback in the pointer passed to the eval function. Doing this stuff with linker magic feels very dirty, compared to just letting the user tell the model which functions to call back for certain functionality (asserts, prints, formatting, file I/O, etc.).