chipsalliance / chisel

Chisel: A Modern Hardware Design Language
https://www.chisel-lang.org/
Apache License 2.0
3.96k stars 595 forks source link

[RFC] Tagged Unions #927

Open hngenc opened 5 years ago

hngenc commented 5 years ago

This RFC proposes a new syntax for tagged unions, a feature which was promised back in 2014 but which never came to fruition.

What is a Tagged Union?

Tagged unions are variables which can take one of several different types, just like the familiar union in C. However, unlike C unions, they come with a few extra bits, stored in a tag, so that the program knows at runtime which type the union currently holds.

For example, consider the F# tagged union below (copied from this blog post):

type Shape =
    | Rectangle of width : float * length : float
    | Circle of radius : float
    | Triangle of width : float * height : float

After declaring a Shape, we can pattern match using its (implicit) tag like this:

let getShapeSurface shape =
   match shape with
   | Rectangle(w,l) -> w * l
   | Circle(r) -> 2. * r * 3.14159
   | Triangle(w,h) -> w * h / 2.

Scala 2 lacks tagged unions, but the F# behavior above could be replicated with polymorphism:

trait Shape
case class Rectangle (width: Double, length: Double) extends Shape
case class Circle (radius: Double) extends Shape
case class Triangle (width: Double, height: Double) extends Shape

def getShapeSurface(shape: Shape) = shape match {
    case Rectangle(w,l) => w*l
    case Circle(r) => 2 * r * 3.14159
    case Triangle(w,h) => w * h / 2
}

Use Cases

One use-case would be implementing an Option or Maybe type. The Proposed Solution section discusses this use case further:

x := maybe matches {
    case Valid(a) => a
    case _: Invalid => DontCare
}

Here, the maybe variable can take one of two types, Invalid, or Valid[T]. This can make it easier for the user to make sure they're not erroneously using the Valid value when they're not supposed to.

For another use case, suppose that we are creating a pipelined CPU where the ALU operands are stored in pipeline registers. The operands may be register addresses or immediate values. Without tagged unions, we may find it easiest to pass both the register address and the immediate values into the next stage, even though only one of them is needed. With tagged unions, we could pass a single Union[Register, Immediate] value instead, which could save us some bits while making it more convenient to check which type the operand actually is.

Existing Solutions

In addition to the F# example above, there are numerous languages, for both hardware and software, that support tagged unions, as seen below:

C

In C, unions are untagged by default. Programmers must add their own tags explicitly:

struct Shape {
    enum {RECTANGLE, CIRCLE, TRIANGLE} tag;

    union {
        struct { double width, length; } rectangle;
        struct { double radius; } circle;
        struct { double width, height; } triangle;
    } data;
};

C's syntax has the following advantages:

However, the disadvantages are considerable:

Scala 3

Scala 3 has first-class support for union types:

type Shape = Rectangle | Circle | Triangle

def getShapeSurface(shape: Shape) = shape match {
    case Rectangle(w,l) => w*l
    // ...
}

The syntax is easy-to-use and pretty self-explanatory. The main disadvantage is that matching would only be available at elaboration time, and not when when running on hardware.

BlueSpec

Bluespec has had this feature for a while. In this example, we create a tagged union that represents an instruction operand:

 typedef union tagged {
      bit  [4:0] Register;
      bit [21:0] Literal;
      struct {
          bit  [4:0] regAddr;
          bit  [4:0] regIndex;
      } Indexed;
 } InstrOperand;

Afterwards, we can pattern-match on instances of this tagged union, even at runtime:

InstrOperand orand = tagged Indexed { RegAddr:3, RegIndex:4 };

case (orand) matches
    tagged Register r : x = rf[r];
    tagged Literal n : x = n;
    tagged Indexed { ra, ri } : x = mem[rf[ra]+ri];
endcase

This syntax looks pretty ideal to me, and I can't really think of any significant criticisms for it.

Proposed Solution

My proposed solution was affected heavily by implementation concerns. If we had full control over the Scala language, then we could certainly improve upon this API, but I wanted something that we could do without relying too heavily on macros, reflections, or experimental features.

val Register = UInt(5.W)
val Literal = UInt(22.W)
class Indexed extends Bundle {
    val regAddr = UInt(5.W)
    val regIndex = UInt(5.W)
}

val InstrOperand = Union(Register, Literal, new Indexed())
// InstrOperand : Union3[UInt, UInt, Indexed]

Alternatively, if we add a new "∪" operator, we can declare unions like this:

val InstrOperand = Register ∪ Literal ∪ new Indexed()

Pattern-matching would work as follows:

val instr_op = Wire(InstrOperand)
instr_op := indexed // Some instance of Indexed

instr_op matches {
    case ind: Indexed => x := mem(rf(ind.regAddr) + ind.regIndex))
    // ...
    case _ => // default behavior
}

If the user wishes to provide an extractor for their Indexed class, they could even use that:

instr_op matches {
    case Indexed(ra, ri) => x := mem(rf(ra) + ri))
    // ...
    case _ => // default behavior
}

I believe it would also be possible to return values from a matches statement, as below:

x := maybe matches {
    case Valid(a) => a
    case _: Invalid => DontCare
}

The previous example also demonstrates how we could use unions to create something analogous to Scala's Option[T] type.

The proposed API has the following advantages:

However, there are disadvantages as well:

Any comments, concerns, or criticisms?

ducky64 commented 5 years ago

This looks interesting!

Some thoughts:

Also, how are you implementing the match stuff? Especially since all the different paths need to run at elaboration time?

chick commented 5 years ago

LBNL's OpenSoC Fabric: On-Chip Network Generator uses quite a bit of tagged union kind of stuff. @ucbjrl knows a lot more about it than I do. I don't think we necessarily want to start with their work, but I'd recommend taking a look at it for some real world use cases.

edwardcwang commented 5 years ago

What is the hardware generated by Union(Register, Literal, new Indexed())? Does it create an active bit or something for each possibility?

hngenc commented 5 years ago

@ducky64 Thanks!

Regarding Bundle-like syntax: I agree that this would be better, but I couldn't think of a good way to implement matching in that case. Consider this example:

class InstrOperand extends Union {
    val register = UInt(4.W)
    val indexed = new Bundle {
        val regAddr = UInt(5.W)
        val regIndex = UInt(5.W)
    }
}

Later, what would we match against, considering that indexed has an un-named type? The closest I could come up with was something like this:

instr_op matches {
    case 'Register => x := instr_op.register
    case 'Indexed => // ...
}

I also couldn't think of a good way to do unboxing with the above syntax, so that users can operator on the register value directly, instead of typing instr_op.register.

As for how matching would be implemented, I was thinking of looping through all the types the union can take, and typecasting to each of them. It would look something like this:

for (t <- types) {
    when (tag === tagOf(t)) {
        partialFunc(data.asTypeOf(t))
    }
}

where partialFunc is the input to the matches statement, and data is a UInt (or Vec[Bool]) with the width of the largest possible type of the union.

@edwardcwang The union in your example would generate 2 extra bits to describe the 3 different possibilities. One-hot encodings, or other custom encodings, could be added in the future.