lkolbly / ripstop

Apache License 2.0
0 stars 0 forks source link

Variables, Arrays, and the Graph #2

Closed lkolbly closed 2 years ago

lkolbly commented 2 years ago

1 sets up a lot of the infrastructure for the language - namely, the ability to compile. This issue flushes out the language itself a bit more thoroughly.

In particular, we want this code:

module hello() -> (bit led) {
   bit[32] counter;
   counter[0] = 0;
   counter[t] = counter[t-1] + 1;
   led[t] = counter[t-1][25];
}

to generate Verilog like:

module hello(
  input clk,
  input rst,
  output led
);

  reg[31:0] counter_d, counter_q;
  reg led_d, led_q;

  assign led = led_q;

  always @(*) begin
    counter_d <= counter_q + 1;
    led_d <= counter_q[25];
  end

  always @(posedge clk) begin
    if (rst) begin
      counter_q <= 32'b0;
      // TODO: Need to determine the appropriate default reset
      //led_q <= 1'b0;
    end else begin
      counter_q <= counter_d;
      led_q <= led_d;
    end
  end

endmodule

Notice how there are separate registers for led_d/q and counter_d/q. We could also write this code:

module hello() -> (bit led) {
   bit[32] counter;
   counter[0] = 0;
   counter[t] = counter[t-1] + 1;
   led[t] = counter[t][25];
}

to generate:

module hello(
  input clk,
  input rst,
  output led
);

  reg[31:0] counter_d, counter_q;

  assign led = counter_q[25];

  always @(*) begin
    counter_d <= counter_q + 1;
  end

  always @(posedge clk) begin
    if (rst) begin
      counter_q <= 32'b0;
    end else begin
      counter_q <= counter_d;
    end
  end

endmodule

In both of these though, a couple common features appear:

These in turn imply a few things about the compiler:

The second property means that this:

module delay2(bit input) -> (bit output) {
   output[t] = input[t-2];
}

should generate something like:

module delay2(
  input clk,
  input rst,
  input input, // If Verilog doesn't like this, that means we need to either make `input` a reserved keyword or munge the variable names. Preferably the latter.
  output output,
);

  reg delay1_d, delay1_q;
  reg delay2_d, delay2_q;

  assign output = delay2_q;

  always @(*) begin
    delay1_d <= input;
    delay2_d <= delay1_q;
  end

  always @(posedge clk) begin
    delay1_q <= delay1_d;
    delay2_q <= delay2_d;
  end

endmodule

It also means that we can build modules which are purely combinatoric:

module invert(bit input) -> (bit output) {
   output[t] = ~input[t];
}

generates:

module invert(
  input clk,
  input rst,
  input input,
  output output
);

  assign output = ~input;

endmodule
lkolbly commented 2 years ago

See also #15 for how this manifests in relation to reset. In particular, consider this code:

module blinky() -> (bit led) {
    led[0] = 0;
    led[t] = ~led[t-1];
}

It should behave like: image which implies that the inversion should be applied eagerly, as soon as it can.

However, there is a potential pitfall if all operations are applied eagerly. In particular, consider this code:

module foo() -> (bits<32> output) {
    bits<16> counter;
    counter[t] = counter[t-1] + 1;
    output[t] = counter[t-1] + 3;
}

If we apply the second add operation late, i.e. combinatorially to the output like this:

reg[15:0] counter_q;
assign counter_d = counter_q + 1;
assign output = counter_q + 3;

always @(posedge clk) begin
    counter_q <= rst ? 0 : counter_d;
end

then we only need 16 bits of flip-flops. However, if we apply the operate eagerly, like this:

reg[15:0] counter_q;
reg[31:0] output_q;
assign counter_d = counter_q + 1;
assign output_d = counter_d + 3;
assign output = output_q;

always @(posedge clk) begin
    counter_q <= rst ? 0 : counter_d;
    output_q <= rst ? 0 : output_d;
end

then we need 32 bits of flip-flops. Double the resources.

Of course, the user is still able to express the more efficient version:

module efficient_foo() -> (bits<32> output) {
    bits<16> counter;
    bits<16> prev_counter;
    counter[t] = counter[t-1] + 1;
    prev_counter[t] = counter[t-1];
    output[t] = prev_counter[t] + 3;
}

Here, the compiler can optimize away the prev_counter[t] = counter[t-1]. But this creates confusion for the user: In order to use fewer flip-flops, they have to use more variables.

But at the same time, these actually don't do the same thing, so we want to be able to express both. Here's a graph of their different behaviours: image

Wavedrom ``` {signal: [ {name: 'clk', wave: 'p.........'}, {name: 'rst', wave: '1..0......'}, {}, ['foo (lazy)', {name: 'counter', wave: '2...222222', data: '0 1 2 3 4 5 6'}, {name: 'output', wave: '2...222222', data: '3 3 4 5 6 7 8'} ], {}, ['foo (eager)', {name: 'counter', wave: '2...222222', data: '0 1 2 3 4 5 6'}, {name: 'output', wave: '2...222222', data: '0 3 4 5 6 7 8'} ], {}, ['efficient_foo', {name: 'counter', wave: '2...222222', data: '0 1 2 3 4 5 6'}, {name: 'prev_counter', wave: '2...222222', data: '0 0 1 2 3 4 5 6'}, {name: 'output', wave: '2...222222', data: '3 3 4 5 6 7 8'} ], ], config: { hscale: 2 }, head:{ tock:-3, every:1 }} ```

Additionally, here I've expressed code which is more optimized when compiled lazily, but you can also imagine code which is more efficient when compiled eagerly. For example:

module bar(bits<32> input) -> (bit output) {
    output[t] = input[t-1][0];
}

This requires 32 flip-flops when compiled lazily, but 1 flip-flop when compiled eagerly.

The only externally observable difference between these cases is the behaviour during reset.

lkolbly commented 2 years ago

Closing, because we can handle assigning variables around. Intricacies around reset can be handled in other issues.