B-Lang-org / bsc

Bluespec Compiler (BSC)
Other
952 stars 146 forks source link

Bluesim VCD for mkLFIFO has wrong FULL_N value #519

Open quark17 opened 1 year ago

quark17 commented 1 year ago

As reported on the b-lang-discuss mailing list, for the BSV code below, the VCD file generated when simulating with Bluesim has a different value for the FULL_N port of the mkLFIFO module than the VCD from Verilog simulation.

At a clock edge, Bluesim steps through all the rules that are schedule to fire, executing them in order. Thus, Bluesim is able to see the state of the system at all the micro-steps that happen: a FIFO may start full (FULL_N is False), then a rule dequeues from the FIFO (FULL_N becomes True), and then another rule enqueues into the FIFO (FULL_N becomes False again). For someone debugging a design, it could be useful to see all of these values. However, we have chosen that Bluesim VCDs should match the VCDs generated from Verilog simulation. The Verilog is simulating closer to what happens in the hardware: after the clock edge, the signals settle down into one value, which is latched at the next edge. For typical submodules, the values represent the view of the state at the start of the clock cycle; however, for "loopy" modules (like mkLFIFO), the values are the ones seen at some later micro-step. Thus, in the Verilog simulation, FULL_N has the value after the first rule has executed (when FULL_N goes to True) and not the value at the start or end of the execution. Bluesim should generate a VCD file that matches this behavior of Verilog.

This bug is only in the dumping of the VCD; the behavior of the Bluesim simulation is correct. In Bluesim, there is a C++ class for modules, which has functions for executing rules and methods, and separately it has a function to dump the VCD -- a function which is called at the end of the execution of all rules at a clock edge. Here, it's just that dump function which is buggy. And this is because Bluesim shares one C++ class for all FIFO modules. When instantiated, the constructor has arguments that specify which of the many FIFOs that instance should be. One is for "loopy" behavior. In the VCD dump function, there are if-else statements to pick how to dump the VCD depending on the type of FIFO, but it is missing if-statements that check for loopy. We should fix that and also add a test case.

Here is the example that illustrates the difference:

import StmtFSM::*;
import FIFO::*;
(* synthesize *)
module mkTb (Empty);
   FIFO#(Maybe#(UInt#(8))) a1a2 <- mkLFIFO,
   a2a3 <- mkLFIFO;
   Reg#(UInt#(8)) x <- mkReg(23);

   // <A1>
   rule a1_stage;
      a1a2.enq(tagged Valid x);
      x <= x + 1;
   endrule

   // <A2>
   rule a2_stage;
      let a2 = a1a2.first; a1a2.deq;
      a2a3.enq(a2);
   endrule

   // <A3>
   rule a3_stage;
      let a3 = a2a3.first; a2a3.deq;
   endrule

   Stmt main = seq
      delay(10);
      $finish(0);
   endseq;
   mkAutoFSM(main);

endmodule: mkTb
quark17 commented 1 year ago

It is worth mentioning that the function that needs fixing is dump_VCD() in the Bluesim file bs_prim_mod_fifo.h (in src/bluesim/).

Ashwin4514 commented 10 months ago

I would like to take up this issue... Where should I start from?

quark17 commented 10 months ago

Awesome! Thank you!

The file to look at is bs_prim_mod_fifo.h in src/bluesim/. This file defines a single class MOD_Fifo that can behave as any of the primitive FIFO modules. When the class is instantiated, arguments to the constructor tell it what type of FIFO it should be. The most important argument is fifo_type and that is recorded as the class member type, declared at line 426:

const tFifoType    type;

The type is an enum declared at the top of the file:

typedef enum { FIFO_SIMPLE, FIFO_LOOPY, FIFO_BYPASS} tFifoType;

You'll notice that there are a lot of files named bs_prim_mod_*.h. These are defining all the various primitive modules: registers, regfiles, BRAM, etc. In some files, there are separate classes for implementing the different types of that primtive. In this case, for FIFOs, they share one class.

You'll see that the class has member functions METH_deq, METH_enq, METH_notFull, etc. As the naming convention suggests, these implement the methods of the module.

The MOD_Fifo class (just like every module class in those other files) has two member functions for VCD dumping: dump_VCD_defs and dump_VCD. A VCD file starts with a list of all the signals, and then the rest of the file reports when any of those signal values change over time. The list assigns a unique ID to each signal name, so that a change can be reported using the short unique ID (and not the potentially long signal name). So the first function, dump_VCD_defs, is called once at the start to write out the list of signals (and their IDs). Then dump_VCD is called on every cycle, to dump the values of those signals, if they changed.

Inside the dump_VCD function, you won't see the names of signals being written. Instead, there is a variable num which is the unique ID, and you will see that a value is written for num and then num is incremented. This follows the same order as the signals were declared in dump_VCD_defs. So, the first one is CLK, the next one is RST, then FULL_N, and EMPTY_N, and so on. Except that I think the CLK is handled by the caller, so the dump_VCD function starts with RST.

Only changes in value need to be dumped, so there is a struct called backing which stores the last reported values and, for each signal, the function checks if the current value has changed from the backing value, and if so then it writes the change. For example:

if (METH_notFull() != backing.METH_notFull())
  vcd_write_val(sim_hdl, num++, METH_notFull(), 1);
else
  ++num;

if (METH_notEmpty() != backing.METH_notEmpty())
  vcd_write_val(sim_hdl, num++, METH_notEmpty(), 1);
else
  ++num;

You'll notice that this code is calling METH_notFull (a member of the MOD_Fifo class) to get the current value. But the definition of that function uses the current elems value. The member elems is the current number of elements in the FIFO, and is immediately updated when METH_enq and METH_deq are called. But if VCD dumping is done at the end of the rule executions for that clock tick (after all the enqueues and dequeues have happened), then the current value may not be the correct value to display in the VCD anymore, since the Verilog port reflects the value at a certain point in the middle of the schedule.

FYI, to see where the dump_VCD function is being called, look in kernel.cxx. That file defines the Bluesim kernel, which drives the simulation. Bluesim has a fixed schedule of rules that it knows to execute on each tick of the clock. Bluesim has an event queue, containing the next tick of each clock, and it goes through that queue and executes the next clock tick -- executing its schedule, dumping the VCD if necessary, and then putting the next clock tick on the queue. The function for doing that is run_edge_schedule_event in kernel.cxx, and you'll see that after it executes the schedule, it calls setup_vcd_events if VCD dumping is enabled. And sets up a call to vcd_event, which is what calls dump_VCD on each of the modules.

Looking at how dump_VCD for FIFOs calls METH_notFull to get the value for the VCD ... I'm confused how that's the right thing to do even for ordinary FIFOs. If a rule enqueues into the last element of a FIFO, then METH_notFULL would go False, but for ordinary FIFOs, the notFull Verilog port should reflect the value before the enqueue. I suggest starting by making an example design using mkFIFO1, to check whether the VCD dump is even correct for ordinary FIFOs.

It may be that the VCD dumping should be calling METH_i_notFull instead. That's a version of the method which can be executed later in the schedule and still report the value as it appeared earlier in the schedule. It does that by using saved_elems instead of elems -- during the enq/deq/clear methods, which change the value of elems, an older value is potentially "saved" into the saved_elems variable. (The METH_clear code, for example, only needs to save the value if no enqueue or dequeue was executed, that would have already saved the value. The code is able to check for this by looking at the timestamp of the last enq and deq calls, which are stored in deq_at and enq_at, and comparing that to the current time, reported by bk_now -- where bk_ is the prefix for Bluesim kernel functions.)

However, I notice that METH_i_notFull seems to still not be doing the right thing for loopy FIFOs! It needs to record the value at the point where deq was called, but enq has not yet happened. It's possible that the fix is for loopy FIFOs to set saved_elems too (but differently).

Does that help? Maybe this is a little more buggy than I thought, and a little more complicated to fix. I'd start first by making examples of these situations and checking what is working. I'm happy to offer advice as you go.

An eventual PR will also need to add testcases. There's some info on that in the README.md in the testsuite directory (in the section on test suite structure). And the Bluesim tests are in testsuite/bsc.bluesim/ and specifically there's a vcd/ subdirectory, where tests should probably be added.

Ashwin4514 commented 10 months ago

Hey @quark17 thanks for the insights.. Ill try to work with these details and update soon.