google / xls

XLS: Accelerated HW Synthesis
http://google.github.io/xls/
Apache License 2.0

Expose Single Value Channels in DSLX frontend #1281

Open mczyz-antmicro opened 5 months ago

mczyz-antmicro commented 5 months ago

Background

DSLX channels always infer a (valid,ready) handshake mechanism, based on the send() and recv() functions. Additional control is provided by the conditional and non-blocking variants of these functions: send_if, recv_if, recv_non_blocking and recv_if_non_blocking.
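As a reminder of what the inferred handshake means at the signal level, here is a minimal Python sketch (not XLS code; the function name and the sample waveforms are made up for illustration). It only encodes the usual rule that a word is transferred on cycles where both valid and ready are asserted.

```python
def handshake_transfers(valid, ready):
    """Return the cycle indices on which a transfer occurs.

    A word moves across a (valid, ready) interface only on cycles where the
    sender asserts valid and the receiver asserts ready at the same time.
    """
    return [cycle for cycle, (v, r) in enumerate(zip(valid, ready)) if v and r]


# Illustrative waveforms: the sender offers data on cycles 0-3,
# the receiver is ready on cycles 2 and 4.
valid = [1, 1, 1, 1, 0]
ready = [0, 0, 1, 0, 1]
transfers = handshake_transfers(valid, ready)
```

With these waveforms the only transfer happens on cycle 2, even though the sender had data available from cycle 0.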

Problem statement

There are digital designs that either can't be expressed with a (valid,ready) handshake or become suboptimal when expressed that way.

Use case 1: Configuration

Use case 1 is motivated by the example of a bank of Control and Status Registers (CSRs). CSRs can be read-only for some processes. From a hardware point of view, the information stored in CSRs should be read by the subordinate processes constantly, to guarantee an instant response to a configuration change.

Process A provides configuration for process B. In this example, the term "configuration" refers to any data structure that is:

HDL

Module A:

always@(posedge clk)
  configuration_o <= configuration;

Module B:

module module_B(
  input configuration_i
);

assign fsm_next_state = fun(configuration_i);

endmodule

In module B it is enough to declare the configuration as an input, and the data structure becomes available for processing.

DSLX

In DSLX, however, an attempt at expressing these hardware constructs would look like this:

Process A:

next() {
  let tok = send(tok, channel, configuration);
}

Process B:

next() {
  let (tok, configuration, valid) = recv_non_blocking(tok, channel, zero!<configuration_t>());
}

Observations

  1. Note that the specification for this use case is to always communicate the configuration in the shortest possible time frame. This effectively means that neither process A nor process B can ever use recv in its blocking form.
  2. Inferring a (valid,ready) handshake for a (send,recv_non_blocking) pair is overkill and uses additional resources (unless it can be proven that the handshake is removed by optimization in the IR or in the RTL-to-GDS flow).
  3. Additionally, the configuration must be saved in the state of the process, because otherwise it would only be available in the ticks in which the configuration changes. This creates further overhead compared to the HDL description.
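The third observation can be illustrated with a small Python model (a sketch, not XLS code). It assumes that process A sends only when the configuration changes, and it models the channel as a simple queue. The names run_receiver, events and ZERO are invented for this illustration; ZERO stands in for zero!<configuration_t>().

```python
from collections import deque

ZERO = 0  # stand-in for zero!<configuration_t>()


def recv_non_blocking(fifo, default):
    """Model of a non-blocking receive: returns (value, valid)."""
    if fifo:
        return fifo.popleft(), True
    return default, False


def run_receiver(events, ticks, latch):
    """Simulate process B for `ticks` ticks.

    `events` maps a tick number to the configuration value sent on that tick.
    If `latch` is True, B stores the last valid value in its state; otherwise
    B uses whatever the receive returned on the current tick.
    """
    fifo = deque()
    state = ZERO
    seen = []
    for tick in range(ticks):
        if tick in events:
            fifo.append(events[tick])
        value, valid = recv_non_blocking(fifo, ZERO)
        if latch:
            if valid:
                state = value
            seen.append(state)
        else:
            seen.append(value)
    return seen


events = {1: 7, 4: 9}  # configuration changes on ticks 1 and 4
without_state = run_receiver(events, 6, latch=False)
with_state = run_receiver(events, 6, latch=True)
```

Without the extra state, B sees the configuration only on ticks 1 and 4 and the default value everywhere else. With the state, B sees 7 from tick 1 onward and 9 from tick 4 onward, which is what a plain input port would give for free in the HDL version.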

Use case 2: Interrupts

Interrupt signals are a use case in which a (valid,ready) handshake is not only unnecessary but presumably incorrect.

Typically, an interrupt is a single-wire signal that does not rely on a handshake to convey information. Any assertion of the signal should be interpreted by the hardware according to the designer's intent.

Currently:

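proc IrqSource {

    next() {
      // send_if takes the predicate before the value: send 1 only when raise_irq is set.
      let tok = send_if(tok, channel, state.raise_irq, u1:1);
      // ...
    }

}

proc IrqHandler {

    next() {
      // Only the non-blocking receive returns a third "valid" element;
      // the u1:0 default is an assumed placeholder for the no-data case.
      let (tok, irq, handle_irq) = recv_non_blocking(tok, channel, u1:0);
      // ...
    }

}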
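One possible reason a handshake is a poor fit for interrupts is back-pressure: a pending send can hold up the source. The Python sketch below is a simplified model, not XLS code. It assumes an unbuffered channel and a source that cannot start its next activation while a send is pending; run_source and the sample sequences are hypothetical.

```python
def run_source(raise_irq, handler_ready, use_handshake):
    """Return the tick on which each source activation completes.

    raise_irq[i] says whether activation i wants to signal an interrupt.
    handler_ready[t] says whether the handler accepts data on tick t.
    With use_handshake=False the interrupt is a plain wire and never stalls.
    """
    completed = []
    tick = 0
    for wants_irq in raise_irq:
        if use_handshake and wants_irq:
            # The send stays pending until the handler asserts ready.
            while tick < len(handler_ready) and not handler_ready[tick]:
                tick += 1
            if tick >= len(handler_ready):
                break  # never accepted within the simulated window
        completed.append(tick)
        tick += 1
    return completed


raise_irq = [False, True, False, False]
handler_ready = [True, False, False, True, True, True]

wire_ticks = run_source(raise_irq, handler_ready, use_handshake=False)
handshake_ticks = run_source(raise_irq, handler_ready, use_handshake=True)
```

In this model the wire version completes one activation per tick, while the handshake version delays the second activation until tick 3, when the handler is ready, and every later activation slips with it.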

Further comments about Single Value Channels

What is great about HDL descriptions is that when we connect two modules through ports, the data on the data bus is always readily available for both modules to use, without any further overhead.

module moduleA(
  output data_bus_type data_bus
);
// ...
endmodule

module moduleB(
  input data_bus_type data_bus
);
// ...
endmodule

module moduleTop();
  data_bus_type data_bus;

  moduleA xA(.data_bus(data_bus));
  moduleB xB(.data_bus(data_bus));
endmodule

In order to express this mechanism in DSLX, we would have to create connections that do not use send/recv pairs. Such a connection is suitable for describing a static configuration link, that is, a link with no associated flow control, whose value may change without any constraints. Ultimately, this leads us back to using Single Value Channels. Let's take a look at an example in the C++ frontend:

#include "xls_int.h"
#include "/xls_builtin.h"

class counter {
    private:
    XlsInt<16, false> cnt;
    public:
    counter() : cnt(0) {}

    void count(XlsInt<1,false>& rst,
               XlsInt<1,false>& up,
               XlsInt<1,false>& down) {
        XlsInt<1, false> _rst = rst;
        XlsInt<1, false> _up = up;
        XlsInt<1, false> _down = down;
        if (_rst != 0)
            cnt = 0;
        if (_rst == 0 && _up == 1)
            ++cnt;
        if (_rst == 0 && _down == 1)
            --cnt;
    }

    void get_count(__xls_channel<XlsInt<16, false>>& out, XlsInt<1, false>& read) {
        if (read)
            out.write(cnt);
    }
};

#pragma hls_top
void test(XlsInt<1, false> r,
          XlsInt<1, false> up,
          XlsInt<1, false> down,
          XlsInt<1, false> read,
          __xls_channel<XlsInt<16, false>>& out) {

    static counter c;
    c.count(r, up, down);
    c.get_count(out, read);
}

The resulting Verilog module declaration:

module _proc(
  input wire clk,
  input wire rst,
  input wire r,
  input wire up,
  input wire down,
  input wire read,
  input wire out_rdy,
  output wire [15:0] out,
  output wire out_vld
);

Wires r, up, down and read do not have associated valid/ready handshakes and are simply captured into registers:

      __r_reg <= r;
      __up_reg <= up;
      __down_reg <= down;
      __read_reg <= read;

which later affect the state of the module, e.g.:

assign __out_buf = ~(__r_reg | ~__down_reg) ? _ZZ4test6XlsIntILi1ELb0EES0_S0_S0_R13__xls_channelIS_ILi16ELb0EEL17__xls_channel_dir0EEE1cid14id20__2 : sel_1006;
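For readers who want to check the intended behavior of the counter without running the XLS toolchain, here is a rough Python model of the C++ source above. It is a sketch only: it ignores the input registers and the valid/ready handshake on out, and the names count, run and inputs are chosen for this illustration. The four plain inputs are sampled on every tick, with no handshake.

```python
MASK16 = 0xFFFF  # cnt is a 16-bit unsigned value in the C++ source


def count(cnt, rst, up, down):
    """Mirror of counter::count: reset has priority over up and down."""
    if rst:
        cnt = 0
    if not rst and up:
        cnt = (cnt + 1) & MASK16
    if not rst and down:
        cnt = (cnt - 1) & MASK16
    return cnt


def run(inputs):
    """Apply one (rst, up, down, read) tuple per tick; collect values written to out."""
    cnt = 0
    out = []
    for rst, up, down, read in inputs:
        cnt = count(cnt, rst, up, down)
        if read:
            out.append(cnt)
    return out


inputs = [
    (0, 1, 0, 1),  # count up to 1 and read
    (0, 1, 0, 0),  # count up to 2, no read
    (0, 0, 1, 1),  # count down to 1 and read
    (1, 1, 0, 1),  # reset wins over up: 0
    (0, 0, 1, 1),  # count down from 0 wraps to 0xFFFF
]
observed = run(inputs)
```

Each input takes effect on the tick it is applied, which is the property that the plain (non-channel) arguments provide in the generated module.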

About DSLX overhead

Compared to HDL, the overhead in DSLX is that:

Summary

meheff commented 2 months ago

Great writeup! I agree this is something we need. I think initially we could support this with annotations on a channel indicating that the channel is single-value and just use receives as normal.

A better longer-term solution might be a first-class construct ("wire" maybe?) which indicates a direct connection to a port. Such a value could be used directly within the proc. I have some ideas about the possible syntax and a related concept of a stateful block at the DSLX level, which is basically an always_ff, to enable expressing arbitrary RTL in a latency-sensitive way.