calyxir / calyx

Intermediate Language (IL) for Hardware Accelerator Generators
https://calyxir.org
MIT License
458 stars 47 forks source link

Unexpected behavior with reading from `seq_mem` reference and `invoke` #2117

Open polybeandip opened 3 weeks ago

polybeandip commented 3 weeks ago

This issue is about divergent behavior between icarus-verilog, verilator, and the Calyx interpreter.

Here's a toy eDSL program and it's associated data file:

# bug.py
import calyx.builder as cb

def component(prog):
    comp = prog.component("comp")

    A = comp.seq_mem_d1("A", 32, 1, 1, is_ref=True)
    B = comp.seq_mem_d1("B", 32, 1, 1, is_ref=True)

    a = comp.reg(32) 
    zero = comp.const("zero", 1, 0) 

    read_A = comp.mem_load_d1(A, zero.out, a, "read") 
    write_B = comp.mem_store_d1(B, zero.out, a.out, "write") 

    comp.control += [read_A, write_B]

    return comp

def insert_main(prog):
    main = prog.component("main")

    A = main.seq_mem_d1("A", 32, 1, 1, is_external=True)
    B = main.seq_mem_d1("B", 32, 1, 1, is_external=True)

    comp = component(prog)
    comp = main.cell("comp", comp)

    main.control += [cb.invoke(comp, ref_A=A, ref_B=B)]

if __name__ == "__main__":
    prog = cb.Builder()
    insert_main(prog)
    prog.program.emit()
# bug.data
{
  "A": {
    "data": [5],
    "format": {
      "numeric_type": "bitnum",
      "is_signed": false,
      "width": 32
    }
  },
  "B": {
    "data": [0],
    "format": {
      "numeric_type": "bitnum",
      "is_signed": false,
      "width": 32
    }
  }
}

This program should read 5 from memory A and write this value into memory B. However, depending on whether it's run with icarus-verilog, verilator, or the calyx interpreter, we find different results:

Verilator:

am3327@havarti:/scratch/anshuman/calyx$ fud e bug-after-compile-invoke.futil   -s verilog.data calyx-py/test/correctness/bug.data   --to dat --through verilog -v -q
{
  "cycles": 5,
  "memories": {
    "A": [
      5
    ],
    "B": [
      5
    ]
  }
}

Icarus-verilog:

am3327@havarti:/scratch/anshuman/calyx$ fud e bug-after-compile-invoke.futil   -s verilog.data calyx-py/test/correctness/bug.data   --to dat --through icarus-verilog -v -q
{
  "cycles": 5,
  "memories": {
    "A": [
      5
    ],
    "B": [
      0
    ]
  }
}

Calyx interpreter:

am3327@havarti:/scratch/anshuman/calyx$ fud e bug-after-compile-invoke.futil   -s verilog.data calyx-py/test/correctness/bug.data   --to interpreter-out -v -q
{
  "main": {
    "A": [
      5
    ],
    "B": [
      5
    ]
  }
}

(run on Havarti by @anshumanmohan)

verilator and the Calyx interpreter produce the correct result, while icarus-verilog does not.

It appears this behavior is specific to reading from sequential memories passed as references to a component called via invoke. For example, running either of these programs with bug.data causes no issues.

A changed from seq to comb memory:

import calyx.builder as cb

def component(prog):
    comp = prog.component("comp")

    A = comp.comb_mem_d1("A", 32, 1, 1, is_ref=True)
    B = comp.seq_mem_d1("B", 32, 1, 1, is_ref=True)

    a = comp.reg(32) 
    zero = comp.const("zero", 1, 0) 

    read_A = comp.mem_load_d1(A, zero.out, a, "read") 
    write_B = comp.mem_store_d1(B, zero.out, a.out, "write") 

    comp.control += [read_A, write_B]

    return comp

def insert_main(prog):
    main = prog.component("main")

    A = main.comb_mem_d1("A", 32, 1, 1, is_external=True)
    B = main.seq_mem_d1("B", 32, 1, 1, is_external=True)

    comp = component(prog)
    comp = main.cell("comp", comp)

    main.control += [cb.invoke(comp, ref_A=A, ref_B=B)]

if __name__ == "__main__":
    prog = cb.Builder()
    insert_main(prog)
    prog.program.emit()

Memories no longer passed via reference through invoke:

import calyx.builder as cb

def component(prog):
    comp = prog.component("main")

    A = comp.seq_mem_d1("A", 32, 1, 1, is_external=True)
    B = comp.seq_mem_d1("B", 32, 1, 1, is_external=True)

    a = comp.reg(32) 
    zero = comp.const("zero", 1, 0) 

    read_A = comp.mem_load_d1(A, zero.out, a, "read") 
    write_B = comp.mem_store_d1(B, zero.out, a.out, "write") 

    comp.control += [read_A, write_B]

    return comp

if __name__ == "__main__":
    prog = cb.Builder()
    component(prog)
    prog.program.emit()

You can find the files referenced here on this branch:

anshumanmohan commented 3 weeks ago

Thanks @polybeandip, and congrats(? 😅) on finding an interesting issue! A few suggestions to make other readers' lives easier:

anshumanmohan commented 3 weeks ago

Changes looking great, thanks for bearing with my somewhat pedantic request!

sampsyo commented 3 weeks ago

Hey, @polybeandip—REALLY nice work on the bug writeup here; this is a model of how to make a description like this clear and self-contained. 👏

This looks like a real doozy. It's especially delicious that Icarus and Verilator disagree, with is example number 54089454 of how Verilog is under-specified to such a degree that even very popular implementations routinely disagree. I feel like there is a strong chance that this problem exists in our Verilog backend rather than in the compiler lowering per se.

I don't know if it will reveal anything, but one potentially-useful next step here would be to do some test case reduction on the generated Calyx program, or perhaps the one produced after compile-invoke. Maybe we can produce the smallest possible (lowered) Calyx program that elicits this disagreement…