dpetrisko closed this pull request 2 years ago
I would say launch a CI branch to make sure that all existing configurations pass.
@tommydcjung I launched a CI run, but it failed because the basejump PR has not merged yet. What would you suggest for such a situation?
I see. In this case, the regression needs to run locally, and both PRs need to be approved and merged together. I can help with that. @drichmond, can you run some regression on cuda-lite to validate that the HBM2 DMA mapping is not affected?
Yeah, which PRs need to be run?
Found a bug with pod_1x1_hbm2:
Error-[EEST] $error elaboration system task
msg: WH len width 4 must be large enough to hold the dma transfer size 5
location: file //basejump_stl/bsg_cache/bsg_wormhole_to_cache_dma_fanout.v, line 363
path: spmd_testbench.tb.hbm2.hs[1].py[0].py[3].row[1].rf[0].wh_to_dma
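For reference, VCS's Error-[EEST] comes from a $error call in a generate context, which fires at elaboration rather than simulation. Below is a minimal sketch of that kind of parameter guard; the parameter names and default values are assumptions for illustration, not the actual code in bsg_wormhole_to_cache_dma_fanout.v:

  // Sketch only: a $error inside a generate-if is an elaboration
  // system task, which is what produces VCS's Error-[EEST].
  module wh_len_guard_sketch
   #(parameter wh_len_width_p = 2   // wormhole len field width (assumed name)
    , parameter dma_data_len_p = 5  // DMA transfer size in flits (assumed name)
    );

    // defaults above are chosen so the guard trips when elaborated:
    // the len field must have enough bits to encode the flit count
    if (wh_len_width_p < $clog2(dma_data_len_p+1)) begin
      $error("WH len width %d must be large enough to hold the dma transfer size %d",
             wh_len_width_p, dma_data_len_p);
    end

  endmodule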
Thanks, fixed the bug. There was an incorrectly calculated DMA length, which was exposed by the wider HBM channels.
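For the record, the sizing relationship at play is roughly this: the wormhole header's len field has to count every flit of a cache-block DMA transfer, and a wider HBM channel changes that flit count, exposing a len field derived from a too-small length. A hedged sketch of the $clog2 arithmetic; all names and values here are illustrative, not the actual fix:

  // Illustrative arithmetic only; names and values are assumptions.
  module dma_len_sketch;
    localparam block_size_in_words_lp = 16;  // cache block size in words (assumed)
    localparam word_width_lp          = 32;  // data word width in bits (assumed)
    localparam wh_flit_width_lp       = 64;  // wormhole flit width in bits (assumed)

    // flits needed to move one cache block across the wormhole link
    localparam dma_data_len_lp =
      (block_size_in_words_lp * word_width_lp) / wh_flit_width_lp;

    // the len field must encode counts 0..dma_data_len_lp, so it needs
    // $clog2(dma_data_len_lp+1) bits; deriving it from anything narrower
    // is the kind of miscalculation a wider channel exposes
    localparam wh_len_width_lp = $clog2(dma_data_len_lp+1);

    initial $display("flits=%0d len_width=%0d", dma_data_len_lp, wh_len_width_lp);
  endmodule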
@drichmond, can you run some regression with this PR and https://github.com/bespoke-silicon-group/basejump_stl/pull/410?
Yeah, seems to run fine once https://github.com/bespoke-silicon-group/bsg_replicant/pull/764 is applied.
This is just a refactor, not a functionality change, right? No memory mapping changes?
I need to look into why interpod_memory_test is hanging for pod_4x4_hbm2. It's not caused by this PR; it was already hanging before.
Made a fix here https://github.com/bespoke-silicon-group/bsg_manycore/pull/626
Corresponds to https://github.com/bespoke-silicon-group/basejump_stl/pull/410