Open Kuree opened 3 years ago
@jack-melchert Running regressions with CW IP. Found couple errors with current lassen. There are several failed tests. My guess is they are all related to the same bug.
p = op(inst=Inst(alu=ALU_t.FP_sub, signed=Signed_t.unsigned, lut=0, cond=Cond_t.Z, rega=Mode_t.BYPASS, data0=0, regb=Mode_... rege=Mode_t.BYPASS, bit1=Bit(False), regf=Mode_t.BYPASS, bit2=Bit(False)), func=<function <lambda> at 0x7f7af442fcb0>)
args = (<class 'hwtypes.fp_vector.FPVector[8,7,RoundingMode.RNE,False]'>(-0.0), <class 'hwtypes.fp_vector.FPVector[8,7,RoundingMode.RNE,False]'>(0.0))
@pytest.mark.parametrize("op", [
op(asm.fp_add(), lambda x, y: x + y),
op(asm.fp_sub(), lambda x, y: x - y),
op(asm.fp_mul(), lambda x, y: x * y)
])
@pytest.mark.parametrize("args",
[(BFloat16.random(), BFloat16.random()) for _ in range(NTESTS)] +
list(product(fp_sign_vec+fp_zero_vec, fp_sign_vec+fp_zero_vec))
)
def test_fp_binary_op(op, args):
inst = op.inst
in0 = args[0]
in1 = args[1]
out = op.func(in0, in1)
data0 = BFloat16.reinterpret_as_bv(in0)
data1 = BFloat16.reinterpret_as_bv(in1)
res, res_p, _ = pe(inst, data0, data1)
assert res == BFloat16.reinterpret_as_bv(out)
if CAD_ENV:
> rtl_tester(op, data0, data1, res=res)
tests/test_pe.py:209:
Here is an easy way to reproduce:
def test_fp_mul():
# Regression test for https://github.com/StanfordAHA/lassen/issues/111
inst = asm.fp_add()
data0 = Data(0x8000)
data1 = Data(0x8000)
res, res_p, _ = pe(inst, data0, data1)
if CAD_ENV:
rtl_tester(inst, data0, data1, res=res)
else:
pytest.skip("Skipping since DW not available")
Thanks for letting me know, I'll take a look.
Here is the diff if you want to switch to CW IP:
diff --git a/tests/rtl_utils.py b/tests/rtl_utils.py
index bdf3a24..9629148 100644
--- a/tests/rtl_utils.py
+++ b/tests/rtl_utils.py
@@ -50,10 +50,10 @@ test_dir = "tests/build"
# the coreir context causing a "redefinition of module" error
magma.backend.coreir_.CoreIRContextSingleton().reset_instance()
magma.compile(f"{test_dir}/WrappedPE", pe_circuit, output="coreir-verilog",
- coreir_libs={"float_DW"})
+ coreir_libs={"float_CW"})
# check if we need to use ncsim + cw IP
-cw_dir = "/cad/synopsys/dc_shell/J-2014.09-SP3/dw/sim_ver/" # noqa
+cw_dir = "/cad/cadence/GENUS_19.10.000_lnx86/share/synth/lib/chipware/sim/verilog/CW/" # noqa
CAD_ENV = shutil.which("ncsim") and os.path.isdir(cw_dir)
@@ -109,13 +109,14 @@ def rtl_tester(test_op, data0=None, data1=None, bit0=None, bit1=None, bit2=None,
tester.circuit.O1.expect(res_p)
if CAD_ENV:
# use ncsim
- libs = ["DW_fp_mult.v", "DW_fp_add.v", "DW_fp_addsub.v"]
+ libs = ["CW_fp_mult.v", "CW_fp_add.v"]
for filename in libs:
copy_file(os.path.join(cw_dir, filename),
os.path.join(test_dir, filename))
tester.compile_and_run(target="system-verilog", simulator="ncsim",
directory="tests/build/",
include_verilog_libraries=libs,
+ flags=["-sv"],
skip_compile=True)
else:
libs = ["DW_fp_mult.v", "DW_fp_add.v"]
https://github.com/rdaly525/coreir/issues/972
CoreIR has the ieee_complicance bit set to false when instantiating the CW and DW fp units. Changing this to 1 fixes this lassen issue. I'm not sure why it was set to 0 in the first place, but hopefully that will get resolved soon.
Email message from Alex back in June 2019 about this issue:
"Ran synthesis with ieee_compliance on… it increases pe tile area by about 10% , and it also affects our critical path for the PE. It did not meet timing at 454 MHz (2.2ns period), whereas the core without ieee_compliance did.
Alex"
I suspect the reason we wanted to disable denormal numbers was due to the increase in area and longer critical paths.
The PD team should decide if this area/critical path increase is possible before we switch it. Otherwise we should disable any tests that contain denormalized numbers.
I don't think the test case here is related to denorm issue. 0x8000
is negative 0, a behavior specified by IEEE standard, as seen here: https://en.wikipedia.org/wiki/Signed_zero
The test case basically means if we have -0 + -0
, we will get x
.
Based on CW fp_add
documentation, when ieee_compliance=0
, this is expected behavior:
A sA.expA.sigA B sB.expB.sigB Infinitely Precision Result (F)
± Zero ± Zero ± Zero
The documentation also states that -0
is not a denormal number.
Sign (s) Exponent (exp) Significand (sig) Value Short Name
s 0 0 (-1)s0 ±Zero
s 0 0 < sig (-1)s 21-bias0.sig Denormal
I see. Are the X's originating from the CW_fp_add module itself? Or are we getting Xs as a result of some logic in lassen
I did cross-examination on the documentation and waveforms, and I'm leaning towards stating that this is a bug in the CW_fp_add
simulation IP.
For those of you who have access to Cadence support, this is the link to the documentation.
Here is the signals coming in/out from the IP:
The status
flag is set properly: 01
means zero. However, the z
is set to x
.
I do not have access to the documentation, but I wonder if this is specified in the documentation. It might say that if the status indicates 0, then the z output could be anything (although that is a very stupid design decision)
I don't think it's specified anywhere in the documentation. 0x0000 + 0x000
gives correct z
and status
.
Here is my take on how to fix this:
ieee_compliance
. this is ideal if timing/area allowedstatus
indicates zero.I suspect 2) would be a cheaper (area/timing) solution. I can stamp out a mux (with +0) in the CoreIR implementation. Tests still might break if they are looking for a -0 though.
I asked Kathleen to synthesize PE with ieee_compliance
on. I will report back the result once it's finished.
Did more testing and it seems like 0 + 0
returns x
as well. -0 + 0
, however, returns the correct result.
Targeting at 1.1ns
clock speed, here is the timing/area comparison. Notice that both CW_fp_add
and CW_fp_mult
are modified.
TL;DR; timing is similar; area is less than 2% increase.
ieee_compliance=1
:
8 Slack (ps): -4
9 R2R (ps): 0
10 I2R (ps): -4
11 R2O (ps): 66
12 I2O (ps): -0
13 CG (ps): 1
ieee_compliance=0
:
8 Slack (ps): -6
9 R2R (ps): 0
10 I2R (ps): -6
11 R2O (ps): 72
12 I2O (ps): 0
13 CG (ps): 0
@rdaly525 I got green light to turn on ieee_compliance
. Would you please make necessary changes sometime today or tomorrow so that we can test more? Thanks.
Due to the PD environment change, we no longer support DW float IP. As a result, all the tests need to migrate to CW IPs.