llvm / circt

Circuit IR Compilers and Tools
https://circt.org

[RTL] Word type #401

Closed teqdruid closed 3 years ago

teqdruid commented 3 years ago

Currently we are treating i8 as a byte and i1 as a bit, but they are semantically different. I propose we create a Word type. It would be parameterized by width, but said width could be restricted to powers of 2. Users would not be able to do things like arithmetic on words, only cast, slice'n'dice arrays of words, reshape arrays of words, etc. It would basically be for untyped data (which is common and necessary in hardware).
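A minimal sketch of the idea, written in Python purely for illustration. The `Word` class, `cast_to_int`, and `slice` are hypothetical names and do not correspond to anything in CIRCT or MLIR. The sketch only shows the intended restrictions: power-of-2 widths, casting and slicing allowed, arithmetic rejected.

```python
class Word:
    """Untyped data of a known bit width (hypothetical, not a CIRCT type)."""

    def __init__(self, width: int, bits: int):
        # Restrict widths to powers of two, as the proposal suggests.
        if width < 1 or width & (width - 1) != 0:
            raise ValueError(f"width must be a power of two, got {width}")
        if not 0 <= bits < (1 << width):
            raise ValueError(f"value does not fit in {width} bits")
        self.width = width
        self._bits = bits

    def cast_to_int(self) -> int:
        """Explicit cast: the only way to get an arithmetic value out."""
        return self._bits

    def slice(self, lo: int, width: int) -> "Word":
        """Extract `width` bits starting at bit `lo` (bit 0 is the LSB)."""
        if lo < 0 or lo + width > self.width:
            raise ValueError("slice out of range")
        return Word(width, (self._bits >> lo) & ((1 << width) - 1))

    # No __add__, __sub__, __mul__, ...: `word + word` raises TypeError.


w = Word(8, 0xA5)
low_nibble = w.slice(0, 4)            # still untyped data
value = low_nibble.cast_to_int() + 1  # arithmetic only after an explicit cast
```

Leaving the arithmetic operators undefined is the whole point of the sketch: the only way to compute with the data is to cast it to an integer first.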

youngar commented 3 years ago

Is there anything that can be done with a Word that can't be done with an i8? Does it change the semantics of any existing operations?

lattner commented 3 years ago

Yeah, I'm not sure what the use case is here. MLIR has an Index type which sounds similar in some ways, but it adds a lot of complexity to add types that have unknown widths. You can't even constant fold.

-Chris

teqdruid commented 3 years ago

Just to clarify, the semantic difference between IntegerType and the WordType I propose is that a 'Word' is untyped. The only thing we know about it is the width. This is common in hardware whenever one is communicating with the outside world -- you're designing an IP component which hangs off an untyped NoC, bits coming in over PCIe, ethernet, etc.

Is there anything that can be done with a Word that can't be done with an i8?

We could restrict casts as being to/from words and arrays of words only. More importantly, we'd restrict words to not allow arithmetic operations.

Does it change the semantics of any existing operations?

Not that I can think of.

it adds a lot of complexity to add types that have unknown widths

Word bitwidth would be known but parameterizable -- 1 bit, 8 bit, 16 bit, etc.

Yeah, I'm not sure what the use case is here.

You've got an untyped blob coming in and you need to de-serialize it. Or the opposite direction. The untyped blob would be an array of words.
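As a rough illustration of that use case, the Python sketch below de-serializes an array of 8-bit words into a typed record and serializes it back. The `Header` layout (a 4-bit `kind` followed by a 12-bit `length`), the 8-bit word size, and the choice that the first word is most significant are all assumptions made for the example, not anything specified in this thread.

```python
from dataclasses import dataclass
from typing import List

WORD_BITS = 8  # assumed word size for this example


@dataclass
class Header:
    """Hypothetical typed view of the blob; the field layout is made up."""
    kind: int    # upper 4 bits
    length: int  # lower 12 bits


def pack_words(words: List[int]) -> int:
    """Concatenate words into one bit vector, first word most significant."""
    bits = 0
    for w in words:
        if not 0 <= w < (1 << WORD_BITS):
            raise ValueError(f"{w} is not a {WORD_BITS}-bit word")
        bits = (bits << WORD_BITS) | w
    return bits


def deserialize(words: List[int]) -> Header:
    """Cast a 2-word untyped blob to the typed Header."""
    if len(words) != 2:
        raise ValueError("Header is exactly 16 bits (2 words)")
    bits = pack_words(words)
    return Header(kind=bits >> 12, length=bits & 0xFFF)


def serialize(h: Header) -> List[int]:
    """The opposite direction: typed value back to an array of words."""
    bits = (h.kind << 12) | h.length
    return [(bits >> 8) & 0xFF, bits & 0xFF]


header = deserialize([0x3A, 0xBC])
```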

This came up in a discussion in https://github.com/llvm/circt/pull/400

lattner commented 3 years ago

At the RTL dialect level, we don't have notions of serializing etc. We just have logic operations on untyped values. Unless I'm missing something, this seems like a problem for a higher level domain dialect.

teqdruid commented 3 years ago

I was unaware that the RTL dialect was purposely untyped.

lattner commented 3 years ago

This is certainly up for debate as we discussed this morning. My hope was to keep the RTL dialect to be the "unobjectionable" common core that lots of things can be built on top of. This inherently means it doesn't have a lot of interesting stuff, instead pushing it to higher-level abstractions.

The rationale for doing this is that there are many ways to skin the cat, and non-trivial tradeoffs. The FIRRTL world has its own approach to these things, I suspect CoreIR does, and ESI can have its own.

Note that I'm not arguing that everything has to be defined redundantly at (e.g.) the firrtl level, I'm arguing that there can be "another thing" to capture these higher level concepts and multiple front ends can choose to use them if they'd like.

seldridge commented 3 years ago

tl;dr: I actually think this type of thought is super important for building a solid front-end language. It's probably not super critical right now for an IR.

As a tangential case study... Chisel has struggled with how to construct the type hierarchy of hardware types.

I do think there is a nice typeclass pattern here for a front-end language composed of at least arithmetic ops and bitwise ops. (What else would make sense here?) Capturing this type hierarchy in the IR does seem a little over-engineered (though it could be in some IR very close to some front-end).

If you're interested in this, Chisel (weirdly) provides bitwise operations via inheritance of the Bits parent and arithmetic operations via the Num typeclass. A Chisel library for building DSP stuff called dsptools goes nuts with the typeclass pattern and adds further divisions for ring, equals, etc. Getting these divisions correct seems like a very powerful abstraction.
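A loose Python analogue of that split, for readers who have not seen it. This is not Chisel's API: Python has no typeclasses, so a mixin stands in for `Num`, and the classes `Bits`, `Num`, `UInt`, and `Word` below are illustrative names only. The point is that bitwise and arithmetic capabilities are granted separately.

```python
class Bits:
    """Anything with a width supports bitwise ops."""

    def __init__(self, width: int, value: int):
        self.width = width
        self.value = value & ((1 << width) - 1)  # truncate to width

    def __and__(self, other):
        return type(self)(self.width, self.value & other.value)

    def __xor__(self, other):
        return type(self)(self.width, self.value ^ other.value)


class Num:
    """Mixin granting arithmetic; stands in for a Num-style typeclass."""

    def __add__(self, other):
        # Wraps modulo 2**width here; other overflow policies are possible.
        return type(self)(self.width, self.value + other.value)


class UInt(Bits, Num):
    """Has both bitwise and arithmetic ops."""


class Word(Bits):
    """Has bitwise ops only; `Word + Word` raises TypeError."""


total = UInt(8, 200) + UInt(8, 100)    # arithmetic is available on UInt
masked = Word(8, 0xF0) & Word(8, 0x3C)  # bitwise is available on Word
```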

teqdruid commented 3 years ago

My hope was to keep the RTL dialect to be the "unobjectionable" common core that lots of things can be built on top of.

Ha! You put 5 of us in a room you get 11 opinions out. Even the arithmetic ops are debated.

It's probably not super critical right now for an IR.

I agree -- this is by no means critical.

lattner commented 3 years ago

As a tangential case study... Chisel has struggled with how to construct the type hierarchy of hardware types.

This illustrates my concern: the design issues here are nuanced, much more so than this debate, and the results are less transferable between frontends.

darthscsi commented 3 years ago

How is an untyped blob represented as an array of words different from an array of iWordSize? The only thing I see with a fixed word is that its semantics w.r.t. bit order in a packed type (struct/array) might be specified whereas existing types aren't. But it seems simpler to constrain this on existing types rather than introduce a new type. And iX bit order is already implicitly specified by the current lowering. As to casting, I'll argue that we need robust casting between packed types and standard integer to support verilog in a sane way. Just the fact you can do:

struct foo { i2 a, i4 b} f1, f2, f3;
i2 x;
array<i2x2> y;
{f1, f2, f3, x} = {y,y,y,y,y};

implies we will have robust casts at least in the SV dialect.
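The bit widths in that assignment do line up: three 6-bit structs plus an i2 is 3*6 + 2 = 20 bits, and five copies of the 4-bit array is 5*4 = 20 bits. The Python sketch below works through the same reinterpretation on plain integers. The helpers `concat` and `split` are hypothetical, and the ordering (first-listed field or element most significant) is an assumption of the example.

```python
def concat(fields):
    """Concatenate (value, width) pairs, first field most significant."""
    bits, total = 0, 0
    for value, width in fields:
        if not 0 <= value < (1 << width):
            raise ValueError(f"{value} does not fit in {width} bits")
        bits = (bits << width) | value
        total += width
    return bits, total


def split(bits, widths):
    """Split a bit vector into fields of the given widths, MSB first."""
    out = []
    shift = sum(widths)
    for width in widths:
        shift -= width
        out.append((bits >> shift) & ((1 << width) - 1))
    return out


# y is array<i2x2>: two 2-bit elements, 4 bits in total.
y_bits, y_width = concat([(0b10, 2), (0b01, 2)])
# Right-hand side {y,y,y,y,y}: five copies, 20 bits.
rhs_bits, rhs_width = concat([(y_bits, y_width)] * 5)

# Left-hand side: three structs {i2 a, i4 b} followed by an i2.
lhs_widths = [2, 4, 2, 4, 2, 4, 2]
if sum(lhs_widths) != rhs_width:
    raise ValueError("width mismatch between the two sides")

f1a, f1b, f2a, f2b, f3a, f3b, x = split(rhs_bits, lhs_widths)
```

Note that the struct field boundaries do not coincide with the array element boundaries, which is why this only works as a cast over the flattened bits.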

teqdruid commented 3 years ago

How is an untyped blob represented as an array of words different from an array of iWordSize?

I was under the impression that iWordSize represented an integer rather than untyped data with WordSize bits. There's a semantic difference between the two in strongly typed languages ... integers imply arithmetic makes sense whereas words imply untyped data which has to be cast before doing anything else. In the untyped world, they're not different.

As to casting, I'll argue that we need robust casting between packed types and standard integer to support verilog in a sane way.

I don't think "support verilog in a sane way" is a good justification to put anything in the RTL dialect. Yeah, we'll probably need it in the SV dialect.

teqdruid commented 3 years ago

Sounds like this is a candidate for a higher level dialect at some point in the future. I'm going to close this issue since it's not going in the RTL dialect.