bespoke-silicon-group / bsg_manycore

Tile based architecture designed for computing efficiency, scalability and generality

Difficulty setting up and running tests #722

Open Subashkatel opened 3 months ago

Subashkatel commented 3 months ago

Hello,

I'm new to this project and am having a difficult time setting up and running the testbenches. I'm stuck at the make machines step and can't get far enough to actually run any tests. I saw that issue #650 describes the same problem, but it didn't seem to reach a conclusion on what the issue was.

Current Status:

Specific Issues:

  1. I don't have access to VCS (and I also want to be able to run everything with open-source tools), so I'm trying to use Verilator. It fails on the make machines command: first there were some Python version issues, and once I fixed those, timescale-related issues.
  2. I don't know how to correctly set up the environment required to run the tests successfully.

Error Output:

**bsg_manycore/machines'
Makefile.vcs:8: VCS must be set; probably need VCS_HOME, SNPSLMD_LICENSE_FILE too, maybe LM_LICENSE file
Makefile.xcelium:8: XRUN must be set; probably need XRUN_HOME, SNPSLMD_LICENSE_FILE too, maybe LM_LICENSE file
# specify where the host module is instantiated for profiler trigger (print_stat).
# relative to oot
# These define are required by mobile_ddr.v.
# density = 2048 Mbit
# speed grade = 5
# organization = x16
# allocation = FULL_MEM
python /bsg_manycore/testbenches/py/pod_trace_gen.py 1 1 > pod_1x1_4X2Y/pod_trace.tr
python /basejump_stl/bsg_mem/bsg_ascii_to_rom.py pod_1x1_4X2Y/pod_trace.tr bsg_tag_boot_rom > pod_1x1_4X2Y/bsg_tag_boot_rom.v
Traceback (most recent call last):
  File "/basejump_stl/bsg_mem/bsg_ascii_to_rom.py", line 54, in <module>
    hstr = '%0*X' % ((len(digits_only) + 3) // 4, int(digits_only, 2))
                      ^^^^^^^^^^^^^^^^
TypeError: object of type 'filter' has no len()
make: *** [Makefile.verilator:91: pod_1x1_4X2Y/simsc] Error 1
make: Leaving directory '/bsg_manycore/machines'
(base)

----------------------------------------------------------------------------

%Error-TIMESCALEMOD: /basejump_stl/bsg_noc/bsg_noc_pkg.sv:3:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
        /basejump_stl/bsg_test/bsg_nonsynth_clock_gen.sv:11:8: ... Location of module with timescale
   11 | module bsg_nonsynth_clock_gen
      |        ^~~~~~~~~~~~~~~~~~~~~~
%Error-TIMESCALEMOD: /basejump_stl/bsg_noc/bsg_mesh_router_pkg.sv:8:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
        /basejump_stl/bsg_test/bsg_nonsynth_clock_gen.sv:11:8: ... Location of module with timescale
   11 | module bsg_nonsynth_clock_gen
      |        ^~~~~~~~~~~~~~~~~~~~~~
%Error-TIMESCALEMOD: /basejump_stl/bsg_noc/bsg_wormhole_router_pkg.sv:3:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
        /basejump_stl/bsg_test/bsg_nonsynth_clock_gen.sv:11:8: ... Location of module with timescale
   11 | module bsg_nonsynth_clock_gen
      |        ^~~~~~~~~~~~~~~~~~~~~~
%Error-TIMESCALEMOD: /basejump_stl/bsg_tag/bsg_tag_pkg.sv:1:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
        /basejump_stl/bsg_test/bsg_nonsynth_clock_gen.sv:11:8: ... Location of module with timescale
   11 | module bsg_nonsynth_clock_gen
      |        ^~~~~~~~~~~~~~~~~~~~~~
%Error-TIMESCALEMOD: /basejump_stl/bsg_cache/bsg_cache_pkg.sv:9:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
        /basejump_stl/bsg_test/bsg_nonsynth_clock_gen.sv:11:8: ... Location of module with timescale
   11 | module bsg_nonsynth_clock_gen
      |        ^~~~~~~~~~~~~~~~~~~~~~
%Error-TIMESCALEMOD: /basejump_stl/bsg_cache/bsg_cache_non_blocking_pkg.sv:9:9: Timescale missing on this module as other modules have it (IEEE 1800-2017 3.14.2.2)
...

What I've Tried:

Questions:

  1. Is there a contributor's guide for setting up the environment for running the testbenches?
  2. How can I at least run the basic tests and verify that they all pass?

I would really appreciate any help on how to go about this and set up the environment and run the testbenches.

dpetrisko commented 3 months ago

Hi, can you post: 1) OS 2) Python version 3) Verilator version 4) commit # (not branch name) for basejump_stl 5) commit # (not branch name) for bsg_manycore

The good news is that this flow usually "works on my machine". The bad news is that we don't maintain the Verilator build as tightly as VCS (which is our main flow), so sometimes dependencies lag.
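For context on the first traceback: `TypeError: object of type 'filter' has no len()` is the usual Python 2 versus Python 3 difference, since `filter()` returns a list (or str) in Python 2 but a lazy iterator in Python 3. The snippet below is a minimal sketch of that failure and one way around it. It is not the actual code in `bsg_ascii_to_rom.py`; only the `hstr` expression is taken from the traceback, and the function names `bits_to_hex` and `python3_filter_has_no_len` and the way `digits_only` is built are assumptions made for illustration.

```python
def bits_to_hex(line: str) -> str:
    """Convert the binary digits in `line` (other characters ignored) to zero-padded hex.

    Hypothetical helper: mirrors the expression shown in the traceback, but
    materialises the filter result so that len() and int() both work on Python 3.
    """
    # filter() is a lazy iterator in Python 3, so join it into a str first.
    digits_only = "".join(filter(str.isdigit, line))
    # One hex digit per 4 bits, rounded up.
    return "%0*X" % ((len(digits_only) + 3) // 4, int(digits_only, 2))


def python3_filter_has_no_len() -> bool:
    """Return True if len() on a bare filter object raises TypeError (Python 3 behaviour)."""
    try:
        len(filter(str.isdigit, "1010"))
    except TypeError:
        return True
    return False


if __name__ == "__main__":
    print(bits_to_hex("1010_1111"))
    print(python3_filter_has_no_len())
```

If the script in your checkout still builds `digits_only` from a bare `filter(...)` call, either running it with the Python version it was written for or updating basejump_stl to a commit where this is fixed would be the things to check first.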

Subashkatel commented 3 months ago

@dpetrisko Here is my current setup:

If there were a guide on how to properly set up the environment, run individual tests, and get them to pass, that would be immensely helpful.

Thank you for your help!

Subashkatel commented 3 months ago

@dpetrisko I even updated Verilator, and now there are different errors. I really want to get this working and would really appreciate your help.


%Error: /bsg_manycore/v/bsg_manycore_hetero_socket.sv:117:29: Instance attempts to override 'rev_fifo_els_p' as a parameter, but it is a local parameter
  117 |                           ,.rev_fifo_els_p(rev_fifo_els_p)
      |                             ^~~~~~~~~~~~~~
%Error: /bsg_manycore/v/bsg_manycore_hetero_socket.sv:118:29: Instance attempts to override 'fwd_fifo_els_p' as a parameter, but it is a local parameter
  118 |                           ,.fwd_fifo_els_p(fwd_fifo_els_p)
      |                             ^~~~~~~~~~~~~~
%Error: /bsg_manycore/v/bsg_manycore_hetero_socket.sv:118:29: Instance attempts to override 'rev_fifo_els_p' as a parameter, but it is a local parameter
  118 |                           ,.rev_fifo_els_p(rev_fifo_els_p)
      |                             ^~~~~~~~~~~~~~
%Error: Exiting due to 14 error(s)
%Error: Exiting due to 14 error(s)
make[1]: *** [Makefile.verilator:104: pod_1x1_4X2Y/simsc-debug] Error 1
make[1]: *** Waiting for unfinished jobs....
make[1]: *** [Makefile.verilator:92: pod_1x1_4X2Y/simsc] Error 1

dpetrisko commented 3 months ago

Thanks, this patch should fix that specific issue: https://github.com/bespoke-silicon-group/bsg_manycore/pull/723

dpetrisko commented 3 months ago

For Verilator 5+, looks like you also need this: https://github.com/bespoke-silicon-group/bsg_manycore/pull/724

Subashkatel commented 3 months ago

@dpetrisko I don't know if it worked for you after making those changes, but it's still not working for me:

make[2]: *** Waiting for unfinished jobs....
make[2]: *** write jobserver: Bad file descriptor.  Stop.
%Error: make -C pod_1x1_4X2Y/obj_dir -f Vspmd_testbench.mk exited with 2
%Error: Command Failed ulimit -s unlimited 2>/dev/null; exec /usr/local/share/verilator/bin/verilator_bin --top-module spmd_testbench --cc -Wno-fatal -O2 --build --exe -j 4 -CFLAGS -std=c++14\ -g\ -Wall -CFLAGS -O2 -CFLAGS -fPIC -CFLAGS -I/basejump_stl/imports/DRAMSim3/src -CFLAGS -I/basejump_stl/imports/DRAMSim3/ext/headers -CFLAGS -I/basejump_stl/imports/DRAMSim3/ext/fmt/include -CFLAGS -I/basejump_stl/bsg_test -CFLAGS -I../../verilator/include -CFLAGS -DFMT_HEADER_ONLY=1 -CFLAGS -DBASEJUMP_STL_DIR=/basejump_stl -Mdir pod_1x1_4X2Y/obj_dir -o simsc +incdir+ ...........
make[1]: *** [Makefile.verilator:104: pod_1x1_4X2Y/simsc-debug] Error 2
make[1]: Leaving directory '/bsg_manycore/machines'
make: *** [Makefile:12: machines] Error 2

Am I doing something wrong?

dpetrisko commented 3 months ago

Seems like a problem with your docker setup. Often this is due to threading. I would try adding VERILATOR_THREADS=1 to your command to force a single thread build.

Subashkatel commented 3 months ago

@dpetrisko Thank you for the help with the environment setup! I have some questions about testing:

  1. Are there unit tests available for individual modules or components in addition to the full system tests? If so, how would I be able to run those individually?
  2. Is there a way for me to check the test coverage for the existing tests?

dpetrisko commented 3 months ago

1) The majority of the modules comprising bsg_manycore come from BaseJump STL, which is unit tested and has been validated in silicon multiple times.
2) AFAIK not presently for Verilator. We would love to accept a PR exposing aggregate coverage of the regression through verilator_coverage.

Subashkatel commented 3 months ago

@dpetrisko Thank you again for all the help! I have a couple more questions about the testing process:

From what you mentioned and from looking at the errors, it looks like most of the modules used in bsg_manycore, such as bsg_noc, bsg_tag, bsg_cache, etc., come from the basejump_stl library. You mentioned that those basejump_stl modules already have extensive unit tests.

Is there a way for me to run those existing unit tests on the specific basejump_stl modules that are instantiated within bsg_manycore? I'd like to be able to use them alongside the full system tests if possible.

I would like to test the design by:

  1. Running relevant unit tests for the basejump_stl modules used in bsg_manycore
  2. Running targeted system tests in bsg_manycore that exercise a particular module
  3. Running the full regression test suite in bsg_manycore

Since you are already familiar with this, I would really appreciate it if you could provide some concrete examples of the exact commands I would need to run for each of those scenarios. I'm not quite sure how to go about running an individual unit test or a system test that targets a specific module.

Also, is there documentation somewhere that outlines the overall structure and hierarchy of the test suites? Something that maps the various tests to the hdl modules they are intended to test would be really helpful as I try to debug and run things individually.

I really appreciate all the help so far. Thank you so much!

Subashkatel commented 3 months ago

@dpetrisko Sorry to keep bothering you, but any help would be much appreciated.

dpetrisko commented 3 months ago

> You mentioned that those basejump_stl modules already have extensive unit tests.

You can find the basejump_stl unit tests here: https://github.com/bespoke-silicon-group/basejump_stl/tree/master/testing

The default testing infrastructure uses Vivado or VCS, but it would be trivial to add Verilator support as well (would accept a PR!).

> Is there a way for me to run those existing unit tests on the specific basejump_stl modules that are instantiated within bsg_manycore?

You can find the filelist of bsg_manycore here: https://github.com/bespoke-silicon-group/bsg_manycore/blob/master/machines/arch_filelist.mk
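One way to use that filelist is to group the basejump_stl sources it references by subdirectory (bsg_noc, bsg_cache, and so on) and then look for matching directories under basejump_stl/testing. The script below is a rough sketch of that idea, not an existing tool in either repository. The sample text, the `VSOURCES` and `BSG_MANYCORE_DIR` names, the `bsg_cache.sv` entry, and the assumption that paths are written as `$(BASEJUMP_STL_DIR)/<subdir>/<file>` are all hypothetical; the real arch_filelist.mk may be laid out differently, and the testing tree may not mirror these subdirectory names exactly.

```python
import re
from collections import defaultdict
from typing import Dict, List

# Hypothetical excerpt standing in for machines/arch_filelist.mk.
SAMPLE_FILELIST = """\
VSOURCES += $(BASEJUMP_STL_DIR)/bsg_noc/bsg_noc_pkg.sv
VSOURCES += $(BASEJUMP_STL_DIR)/bsg_cache/bsg_cache_pkg.sv
VSOURCES += $(BASEJUMP_STL_DIR)/bsg_cache/bsg_cache.sv  # illustrative entry
VSOURCES += $(BSG_MANYCORE_DIR)/v/bsg_manycore_hetero_socket.sv
"""

# Matches $(BASEJUMP_STL_DIR)/<subdir>/<path>.v or .sv (assumed path style).
STL_PATH = re.compile(r"\$\(BASEJUMP_STL_DIR\)/(\w+)/([\w./-]+\.s?v)\b")


def group_stl_sources(filelist_text: str) -> Dict[str, List[str]]:
    """Map each basejump_stl subdirectory to the source files listed under it."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for line in filelist_text.splitlines():
        line = line.split("#", 1)[0]  # drop trailing make comments
        for subdir, filename in STL_PATH.findall(line):
            groups[subdir].append(filename)
    return dict(groups)


if __name__ == "__main__":
    for subdir, files in sorted(group_stl_sources(SAMPLE_FILELIST).items()):
        print(subdir, files)
```

To try it on the real file, read arch_filelist.mk from your checkout and pass its contents to `group_stl_sources` in place of `SAMPLE_FILELIST`; the resulting subdirectory names are a starting point for choosing which unit tests to look at.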