Open mbstrange2 opened 3 years ago
@mbstrange2, I'll take a look
Can you try running pytest with the "-s" flag to see if there's a CoreIR error message being dumped?
@leonardt I haven't been using pytest, just plain python, so I believe all of the output should already be there, right?
This is the output I get when running the test:
~/repos/garnet spVspV*
garnet-venv ❯ PYTHONPATH=. python tests/test_memory_core/test_memory_core.py
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:229: SyntaxWarning: "is" with a literal. Did you mean "=="?
assert arch_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/mapper.py:236: SyntaxWarning: "is" with a literal. Did you mean "=="?
assert ir_binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:198: SyntaxWarning: "is" with a literal. Did you mean "=="?
assert binding[0][1] is ()
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:199: SyntaxWarning: "is" with a literal. Did you mean "=="?
if len(binding)==1 and binding[0][1] is ():
/home/lenny/repos/garnet/garnet-venv/src/peak/peak/mapper/utils.py:246: SyntaxWarning: "is" with a literal. Did you mean "=="?
assert arch_path is ()
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/passes/passes.py:29: SyntaxWarning: "is" with a literal. Did you mean "=="?
if port_name is "mode":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:131: SyntaxWarning: "is" with a literal. Did you mean "=="?
if pdir is "input":
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/utils/util.py:240: SyntaxWarning: "is" with a literal. Did you mean "=="?
if pdir is "input":
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO
NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: front_empty has no sink
Variable: front_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
self.set_read_bank()
> self._rd_valid = self.var("rd_valid", 1)
self.set_read_valid()
if self.fw_int == 1:
--------------------------------------------------------------------------------
Use anneal_param_factor 120
HPWL: 12.668244
HPWL: 10.684666
Using HPWL: 10.684666
Before annealing energy: 359.644200
After annealing energy: 4.487500 improvement: 0.98752293/3293 | 328.9 kHz | 0s<0s]
terminate called after throwing an instance of 'std::runtime_error'
what(): error in assign clb cells got cell type j
Traceback (most recent call last):
File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
spVspV_regress(dump_dir="mek_dump",
File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
File "tests/test_memory_core/test_memory_core.py", line 1069, in run_test
out_coord, out_data = spVspV_test(trace=trace,
File "tests/test_memory_core/test_memory_core.py", line 926, in spVspV_test
placement, routing = pnr(interconnect, (netlist, bus), cwd=cwd)
File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/pnr_.py", line 82, in pnr
place(packed_file, layout_filename, placement_filename, has_fixed)
File "/home/lenny/repos/garnet/garnet-venv/src/archipelago/archipelago/place.py", line 16, in place
subprocess.check_call([placer_binary, layout_filename,
File "/home/lenny/miniconda3/lib/python3.8/subprocess.py", line 364, in check_call
raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['/home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages/placer', '/home/lenny/repos/garnet/mek_dump/design.layout', 'mek_dump/design.packed', 'mek_dump/design.place']' died with <Signals.SIGABRT: 6>.
Looks like some issue related to placement?
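For reference on what "died with <Signals.SIGABRT: 6>" means: when a child process is killed by a signal, subprocess.check_call raises CalledProcessError with a negative returncode encoding the signal number. A minimal sketch (the `sh -c 'kill -ABRT $$'` command is just a stand-in for the aborting placer binary):

```python
import signal
import subprocess

# Stand-in for the crashing placer binary: the child aborts itself with SIGABRT.
try:
    subprocess.check_call(["sh", "-c", "kill -ABRT $$"])
    returncode = 0  # not reached: the child dies from the signal
except subprocess.CalledProcessError as err:
    # A negative returncode means "killed by signal -returncode".
    returncode = err.returncode

print(returncode)  # -signal.SIGABRT, i.e. -6 on Linux
```

So the std::runtime_error thrown inside the placer ("error in assign clb cells got cell type j") aborts the process, and Python surfaces it as this CalledProcessError.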
Running python garnet.py -v works without error for me, so I suspect there are some differences in our setups. Are there any local changes to garnet/lake or other dependencies that might not have been pushed yet?
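As an aside, the SyntaxWarnings at the top of the pasted log ('"is" with a literal') are worth fixing in peak/lake: `is` tests object identity, which is an implementation detail for literals, and the comparisons should use `==`. A minimal illustration:

```python
# CPython happens to cache the empty tuple, so `x is ()` can appear to work,
# but identity of literals is an interpreter implementation detail (and now
# emits a SyntaxWarning) -- compare by value instead.
binding = [("path", ())]

# Fragile (warns): if binding[0][1] is ():
# Correct:
assert binding[0][1] == ()
assert len(binding) == 1 and binding[0][1] == ()
```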
@leonardt Sorry about that, you need updated cyclone, thunder, and canal, then:
export DISABLE_GP=1
Ok, I had to manually install the latest master branch from the cgra_pnr repo. I'm able to run the test and get verilog generated
garnet-venv ❯ python tests/test_memory_core/test_memory_core.py
Getting length on class SparseSequenceConstraints.ZERO
Getting length on class SparseSequenceConstraints.ZERO
NEW TEST
len1=0
len2=0
num_match=0
SEQA: []
SEQB: []
DATA0: []
DATAD0: []
DATA1: []
DATAD1: []
common coords: []
result data: []
ALIGNED LENGTH 0: 0
ALIGNED LENGTH 1: 0
ADATA0: []
ADATAD0: []
ADATA1: []
ADATAD1: []
Variable: back_empty has no sink
Variable: back_full has no sink
Variable: rd_valid has no sink
--------------------------------------------------------------------------------
/home/lenny/repos/garnet/garnet-venv/src/lake/lake/modules/strg_RAM.py:104
self._rd_bank = self.var("rd_bank", max(1, clog2(self.banks)))
self.set_read_bank()
> self._rd_valid = self.var("rd_valid", 1)
self.set_read_valid()
if self.fw_int == 1:
--------------------------------------------------------------------------------
Variable: front_empty has no sink
Variable: front_full has no sink
90.000000 -> 81.000000 improvement: 0.100000 total: 0.000000 | 675.9 kHz | 0s<0s]
81.000000 -> 81.000000 improvement: 0.000000 total: 0.100000 | 442.3 kHz | 0s<0s]
using bit_width 1
Routing iteration: 0 duration: 20 ms
using bit_width 16
Routing iteration: 0 duration: 6 ms
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(3, -16), (2, 0), (2, -2), (1, 0), (0, 65536), (1, 65536), (0, 0), (3, 1048576)]
Config isect core.....!
[(0, 256)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(4, 16), (83, 134217728), (83, 33554432), (4, 2), (83, 16777216)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
[(0, 64), (4, 1), (83, 16777216), (83, 134217728), (4, 16)]
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
Generating LALR tables
WARNING: 183 shift/reduce conflicts
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X02_Y01.clk, Old Output: Interconnect.Tile_X02_Y00.clk_out, New Output: Interconnect.Tile_X01_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X04_Y01.clk, Old Output: Interconnect.Tile_X04_Y00.clk_out, New Output: Interconnect.Tile_X03_Y01.clk_pass_through_out_right
WARNING:magma:Wiring multiple outputs to same wire, using last connection. Input: Interconnect.Tile_X06_Y01.clk, Old Output: Interconnect.Tile_X06_Y00.clk_out, New Output: Interconnect.Tile_X05_Y01.clk_pass_through_out_right
mek_dump/Interconnect.json
Running command: verilator -Wall -Wno-INCABSPATH -Wno-DECLFILENAME -Wno-fatal --cc Interconnect.v -v cfg_and_dbg_unq1.sv -v tap_unq1.sv -v jtag.sv -v glc_axi_ctrl.sv -v flop_unq1.sv -v flop_unq3.sv -v flop_unq2.sv -v glc_jtag_ctrl.sv -v global_controller.sv -v glc_axi_addrmap.sv -v CW_fp_add.v -v CW_fp_mult.v -v AN2D0BWP16P90.sv -v AO22D0BWP16P90.sv --exe Interconnect_driver.cpp --top-module Interconnect
Perhaps there's some difference in our setup still.
Can you show the pycoreir version and check whether there are multiple versions of coreir in your path with
pip show pycoreir
and
which -a coreir
Here's what I have:
~/repos/garnet spVspV*
garnet-venv ❯ pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /home/lenny/repos/garnet/garnet-venv/lib/python3.8/site-packages
Requires: hwtypes
Required-by: CoSA, magma-lang, fault, peak, metamapper
~/repos/garnet spVspV*
garnet-venv ❯ which -a coreir
/home/lenny/repos/garnet/garnet-venv/bin/coreir
/home/lenny/miniconda3/bin/coreir
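A small helper along these lines can flag shadowed binaries like the two coreir entries above (find_all_on_path is a hypothetical utility, not part of garnet):

```python
import os

def find_all_on_path(name, path=None):
    """Return every executable named `name` found on PATH, in lookup order."""
    path = path if path is not None else os.environ.get("PATH", "")
    hits = []
    for d in path.split(os.pathsep):
        candidate = os.path.join(d, name)
        if os.path.isfile(candidate) and os.access(candidate, os.X_OK):
            hits.append(candidate)
    return hits

copies = find_all_on_path("coreir")
if len(copies) > 1:
    # The first entry wins at lookup time and shadows the rest.
    print(f"WARNING: {len(copies)} copies of coreir on PATH; "
          f"{copies[0]} shadows the others")
```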
Can you make sure to target xcelium? I'm not sure if there's any difference if you choose a different simulator target.
(aha) root@615a6684288f:/aha/garnet# pip show coreir
Name: coreir
Version: 2.0.128
Summary: Python bindings for CoreIR
Home-page: https://github.com/leonardt/pycoreir
Author: Leonard Truong
Author-email: lenny@cs.stanford.edu
License: BSD License
Location: /aha/pycoreir
Requires: hwtypes
Required-by: CoSA, magma-lang, peak, fault
(aha) root@615a6684288f:/aha/garnet# which -a coreir
/usr/local/bin/coreir
The verilator compilation failed with a huge amount of errors, here's a snippet:
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8143:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8143 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8166:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8166 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8169:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8169 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8192:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8192 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8195:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8195 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8218:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8218 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8221:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8221 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8244:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8244 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8247:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8247 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8270:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8270 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8273:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8273 | if (io2glb_1_X06_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8296:7: error: ‘io2glb_1_X01_Y00’ was not declared in this scope
8296 | if (io2glb_1_X01_Y00) {
| ^~~~~~~~~~~~~~~~
../Interconnect_driver.cpp:8299:7: error: ‘io2glb_1_X06_Y00’ was not declared in this scope
8299 | if (io2glb_1_X06_Y00) {
I wonder if the large number of errors is causing a segfault in the downstream tool?
I'll try using xcelium to see if there's any difference
Can you check conftest.py in garnet and make sure to set skip_compile=False? I might have pushed the code with it set to True, in which case no verilog is being produced.
skip_compile is False in conftest
I was looking at the test code and noticed:
1029 tester_if = tester._if(circuit.interface[cvalid])
I think it should be
1029 tester_if = tester._if(tester.peek(circuit.interface[cvalid]))
since you need to use the tester.peek function when referring to a circuit port (when not using the tester.circuit interface).
There must be some other mismatch in our envs. This worked for me when using old generated verilog and ran fine in xcelium.
Also, I don't think changing the simulator target (to xcelium) would affect the verilog code generation. If you can't generate code with python garnet.py -v (without using the test), then this suggests there's still some difference in our setup, since I can generate the verilog fine.
I can generate the verilog with python garnet.py -v; I'm just trying to figure out why it fails for me and Keyi when we use the test.
Ah, I see, I misread the original post. Let me investigate with the xcelium target then.
Changing the target doesn't seem to affect verilog code generation for me (I get a file in mek_dump, Interconnect.V), so I think there's still some difference in our environments
Okay this is somewhat great news then. The test ran and passed?
Here's my pip list
(aha) root@615a6684288f:/aha/garnet# pip list
Package Version Location
------------------- --------- ---------------------
aha 0.0.0 /aha
archipelago 0.0.8 /aha/archipelago
ast-tools 0.0.30 /aha/ast_tools
astor 0.8.1
attrs 20.3.0
buffer-mapping 0.0.5 /aha/BufferMapping
canal 0.0.0 /aha/canal
certifi 2020.12.5
chardet 4.0.0
colorlog 4.7.2
coreir 2.0.128 /aha/pycoreir
CoSA 0.4 /aha/cosa
dataclasses 0.6
DeCiDa 1.1.5
decorator 4.4.2
docker 4.4.1
fault 3.0.47 /aha/fault
gemstone 0.0.0 /aha/gemstone
genesis2 0.0.5
gitdb 4.0.5
GitPython 3.1.12
gmpy2 2.0.8
graphviz 0.16
hwtypes 1.4.4 /aha/hwtypes
idna 2.10
importlib-metadata 3.4.0
iniconfig 1.1.1
Jinja2 2.11.2
jmapper 0.2.0
kratos 0.0.32.3 /aha/kratos
lake-aha 0.0.4 /aha/lake
lassen 0.0.1 /aha/lassen
libcst 0.3.16
magma-lang 2.1.27 /aha/magma
Mako 1.1.4
mantle 2.0.16 /aha/mantle
MarkupSafe 1.1.1
mflowgen 0.3.0 /aha/mflowgen
mypy-extensions 0.4.3
networkx 2.5
numpy 1.19.5
ordered-set 4.0.2
packaging 20.9
peak 0.0.1 /aha/peak
pip 20.1.1
pluggy 0.13.1
ply 3.11
py 1.10.0
pycyclone 0.3.26 /aha/cgra_pnr/cyclone
pydot 1.4.1
pyparsing 2.4.7
PySMT 0.9.0
pysv 0.1.2
pytest 6.2.2
pythunder 0.3.26 /aha/cgra_pnr/thunder
pyverilog 1.3.0
PyYAML 5.4.1
requests 2.25.1
requirements-parser 0.2.0
scipy 1.6.0
setuptools 47.1.0
six 1.15.0
smmap 3.0.5
staticfg 0.9.5
tabulate 0.8.7
toml 0.10.2
typing-extensions 3.7.4.3
typing-inspect 0.6.0
urllib3 1.26.3
websocket-client 0.57.0
wheel 0.36.2
z3-solver 4.8.10.0
zipp 3.4.0
I'm able to generate verilog and the test runs xrun but then fails with some errors. Here are the relevant *E snippets from xrun.log
660 xmvlog: *E,DUPIDN (Interconnect.v,5493|18): identifier 'exp_bits' previously declared [12.5(IEEE)].
661 localparam frac_bits = 7;
662 |
663 xmvlog: *E,DUPIDN (Interconnect.v,5494|19): identifier 'frac_bits' previously declared [12.5(IEEE)].
664 module worklib.mul:v
665 errors: 2, warnings: 0
1765 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
1766 module worklib.global_buffer_int:sv
1767 errors: 1, warnings: 0
But it does not segfault at any point
You're having it use Cadence ChipWare (CW)? Those errors are in the PE, so I'm even more confused.
I haven't changed anything. Looking at the generated code, though, it looks out of date, so possibly a different coreir version is being used.
Ah yes, my version of python on kiwi is old (3.7), so it's installing an older version of coreir; going to upgrade it to 3.8.
Hmm, that wasn't the problem; it actually seemed to be the right version of coreir, and I'm still getting the same output.
Thanks for looking into this, I can help later today if this is still not resolved.
Hmmm not sure what to do then.
When looking at the generated mek_dump/Interconnect.v on my local machine, I'm getting a different output (without the localparam error), so it seems that something on kiwi is causing me to generate different verilog.
Ok, figured it out. There was a leftover old version of coreir in my LD_LIBRARY_PATH; you may want to check that out (maybe there's an old version of the library being used). This was causing the old float code library to be loaded and affecting the verilog output. Now I just get this error from the global buffer:
1118 xmvlog: *E,DUPIDN (global_buffer_int.sv,129|45): identifier 'glb_config_rd_data' previously declared [12.5(IEEE)].
1119 module worklib.global_buffer_int:sv
I'm going to try patching it locally to see if the test will run
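To catch the kind of stale-library problem described above, you can scan LD_LIBRARY_PATH for copies of the shared library the loader might pick up. A hedged sketch (find_libs is a hypothetical helper, and the libcoreir*.so* name pattern is an assumption about what the coreir install drops):

```python
import glob
import os

def find_libs(pattern, env_var="LD_LIBRARY_PATH"):
    """Return every shared library matching `pattern` in the given path variable."""
    hits = []
    for d in os.environ.get(env_var, "").split(os.pathsep):
        if d:
            hits.extend(sorted(glob.glob(os.path.join(d, pattern))))
    return hits

# Flag every libcoreir copy the dynamic loader could resolve; more than one
# (or one you didn't expect) is a red flag like the leftover version above.
for lib in find_libs("libcoreir*.so*"):
    print(lib)
```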
Okay, I resolved the global_buffer_int problem. It looks like the test bench was copying the entire contents of the genesis_verif directory, and that directory had some old genesis files from an older version of garnet that were being copied in and causing the error. Purging the directory resolved that issue. (Now I'm getting the xcelium license issue, so I'm trying again with the older version that works.)
Ok so the simulation completes but then fails during the results parsing with:
xcelium> run 10000ns
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
COORD: 0, VAL: x
Simulation complete via $finish(1) at time 3541 NS + 0
./Interconnect_tb.sv:3668 #20 $finish;
xcelium> assertion -summary -final
Summary report deferred until the end of simulation.
xcelium> quit
No assertions found.
xmsim: *N,PRASRT: Protected assertions are not shown.
TOOL: xrun(64) 19.03-s003: Exiting on Feb 02, 2021 at 13:33:17 PST (total: 00:00:24)
</STDOUT>
Traceback (most recent call last):
File "tests/test_memory_core/test_memory_core.py", line 1162, in <module>
spVspV_regress(dump_dir="mek_dump",
File "tests/test_memory_core/test_memory_core.py", line 1133, in spVspV_regress
success = run_test(len1, len2, num_match, value_limit, dump_dir=dump_dir, log_name=log_name, trace=trace)
File "tests/test_memory_core/test_memory_core.py", line 1089, in run_test
data_sim = [int(x[3]) for x in split_lines]
File "tests/test_memory_core/test_memory_core.py", line 1089, in <listcomp>
data_sim = [int(x[3]) for x in split_lines]
ValueError: invalid literal for int() with base 10: 'x'
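The ValueError comes from the simulator printing Verilog 'x' (unknown) for undriven outputs, which int() can't parse. A tolerant parser would surface the real problem more clearly; a sketch (parse_sim_value is a hypothetical helper, not the test's actual code):

```python
# Map Verilog's unknown ('x') and high-impedance ('z') values to None instead
# of crashing in int(), so the caller can report which outputs were undriven.
def parse_sim_value(token):
    t = token.strip().rstrip(",").lower()
    return None if t in ("x", "z") else int(t)

fields = "COORD: 0, VAL: x".split()  # ['COORD:', '0,', 'VAL:', 'x']
assert parse_sim_value(fields[3]) is None   # the 'x' that broke int(x[3])
assert parse_sim_value(fields[1]) == 0      # '0,' parses after stripping ','
assert parse_sim_value("42") == 42
```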
But I think I'm much further than necessary. It looks like I'm able to generate the verilog and run the test totally fine without a segfault, so let's see what's different about your environment. Can you post the output of your $PATH and $LD_LIBRARY_PATH? Let's make sure there are no old versions of coreir lying around there. Also, is your coreir version installed via pip? Or do you have a local installation from a checkout of the pycoreir repo?
Ah, I see that you have coreir installed from a local location: coreir 2.0.128 /aha/pycoreir
Can we double check this setup by either recompiling it to ensure it's up to date or uninstalling this version and using the pip distribution?
This is in the aha docker. If you want to attach to it (mstrange-gracious_visvesvaraya) and check it out, that might be easier? Or start up another docker?
docker attach mstrange-gracious_visvesvaraya
hangs for me. I wonder if only one person can be attached at a time, or if there's a user permissions issue?
Hmm wait, nevermind, hitting ctrl-c dropped me into the shell, maybe it was just waiting for a command
You just need to hit enter - it doesn't automatically show the prompt for some reason lol
Hm, the tests seem to be running for me; it seems to be running more than one, though, so I haven't finished all of them yet.
Have you tried simply reattaching to the container? Perhaps there's some leftover config in your env causing the problem? How many tests is this supposed to run? I'm still waiting for it to finish but it seems to be running xcelium multiple times so it doesn't seem to be having any problems generating the verilog.
Oh, I'm sorry, one second, I have skip_compile=True in there.
Okay if you run it again in the docker it will segfault
Seemed to have "worked around" the issue by uninstalling coreir, and installing the pypi distribution. So something about the local docker setup is likely at fault
cd /aha/coreir/build
make uninstall
pip uninstall coreir
pip install coreir
I reinstalled coreir and it causes the segfault so something about the local build is causing the problem
Hmm, I tried reverting coreir to an older commit to match the pycoreir release (which is a few commits behind coreir master), but I still hit the same problem, which suggests it's not an issue with any of the recent changes (also, reviewing the commits shows nothing that would suggest a segfault; they are minor).
@mbstrange2 does that workaround work for unblocking you for now? We'll need to investigate the docker environment more closely to see what would be causing this issue with the local build versus using the pip wheel distribution
Where is the docker environment specified?
@leonardt This workaround is good for me at present
@rdaly525 https://hub.docker.com/r/stanfordaha/garnet it's this docker - it should be created from https://github.com/StanfordAHA/aha
I am experiencing an issue where any attempt to generate Verilog for my test bench is segfaulting.
This error can be reproduced by checking out lake:sparse_strawman and garnet:spVspV, and running python tests/test_memory_core/test_memory_core.py in garnet.
This is the output I see when trying to generate the verilog in this context.