Open natewise opened 2 years ago
Hi @natewise, can you try this PR? https://github.com/bespoke-silicon-group/bsg_replicant/pull/787
Do you mean re-trying the command while in the vcs-verilator-svdpi-fix branch? I gave that a go but make regression still failed.
/mnt/c/Projects/bsg_bladerunner/bsg_replicant/libraries/bsg_manycore_features.h:34:2: error: #error "_BSD_SOURCE not defined: required for bsg_manycore_runtime" 34 | #error "_BSD_SOURCE not defined: required for bsg_manycore_runtime"
Ah, you're on Ubuntu
Well WSL 2 technically, but yes Ubuntu xD
We haven't run through the steps on Ubuntu in a while; it's on our todo list. @dpetrisko was going to take a look last week but I know he's busy.
To solve the error above, the following diff will work:
diff --git a/libraries/bsg_manycore_features.h b/libraries/bsg_manycore_features.h
index 271a4b66..f7b21730 100644
--- a/libraries/bsg_manycore_features.h
+++ b/libraries/bsg_manycore_features.h
@@ -29,10 +29,12 @@
#define BSG_MANYCORE_FEATURES_H
// <features.h> sorts out many of these defines based on compile time flags (e.g. -std=c++11)
#include <features.h>
-// check _BSG_SOURCE
+// check _BSD_SOURCE
#ifndef _BSD_SOURCE
+#ifndef _DEFAULT_SOURCE
#error "_BSD_SOURCE not defined: required for bsg_manycore_runtime"
#endif
+#endif
The next issue you will probably run into is a linking error with zlib1g-dev, which I haven't solved yet. I think the link flags are in the wrong order. This is a workaround, for now:
diff --git a/libraries/platforms/bigblade-verilator/link.mk b/libraries/platforms/bigblade-verilator/link.mk
index 4b16d1c0..8db5b8d6 100644
--- a/libraries/platforms/bigblade-verilator/link.mk
+++ b/libraries/platforms/bigblade-verilator/link.mk
@@ -207,7 +207,6 @@ $(SIMSCS): %/simsc : %/bsg_manycore_simulator.o %/V$(BSG_DESIGN_TOP)__ALL.a
# regression tests can build them before launching parallel
# compilation and execution
REGRESSION_PREBUILD += $(BSG_MACHINExPLATFORM_PATH)/exec/simsc
-REGRESSION_PREBUILD += $(BSG_MACHINExPLATFORM_PATH)/debug/simsc
REGRESSION_PREBUILD += $(BSG_MACHINExPLATFORM_PATH)/profile/simsc
REGRESSION_PREBUILD += $(BSG_PLATFORM_PATH)/libbsgmc_cuda_legacy_pod_repl.so
REGRESSION_PREBUILD += $(BSG_PLATFORM_PATH)/libbsg_manycore_runtime.so
After that, the notation |& isn't supported by Ubuntu's default shell (dash), but is used in our scripts. It is bash shorthand, so you can rewrite "cmd |& other" as "cmd 2>&1 | other".
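For anyone making that substitution by hand, here is a minimal sketch of the portable form. The emit_both helper is made up for illustration and is not part of our scripts.

```shell
# Hypothetical helper: writes one line to stdout and one line to stderr.
emit_both() {
    echo "to stdout"
    echo "to stderr" >&2
}

# Bash-only form, which does not parse under dash:
#   emit_both |& grep -c "to"
# Portable form: send stderr to the same place as stdout, then pipe.
count=$(emit_both 2>&1 | grep -c "to")

# Both lines reach grep, so the count covers stdout and stderr.
echo "$count"
```

Both spellings send stderr through the pipe together with stdout, so the rewrite should not change what the downstream command sees.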
Last week when we went down this path the executable segfaulted. We haven't had time to get that far yet, so your mileage may vary.
(We use CentOS 7 for our development, but I've been meaning to try on Ubuntu for a while. Haven't had time.)
I made all those changes and am running the command again. If all goes well, I'll close out this issue, otherwise I will send another comment. Thank you for your prompt responses, they were very much appreciated and helpful!
If they work, keep this issue open, but retitle to Ubuntu support
So I ran the command overnight, and it looks like my computer crashed sometime last night. But I just reran the command again and it looks like those steps got me along much further but still ended in an error:
It's likely because of how our regression works. It searches for a success message using grep and if it doesn't find it, it fails.
In other words, it is probably searching the stale log from when your computer crashed in test_vcache_flush. Run make clean in that directory and try again.
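A rough sketch of why a stale log produces a failure. The log file name and the "REGRESSION PASSED" marker are placeholders I made up; the real regression uses its own file names and success message.

```shell
# Assumed names: run.log and "REGRESSION PASSED" are illustrative only.
workdir=$(mktemp -d)
log="$workdir/run.log"

# Report pass only if the success marker appears in the log.
check_log() {
    if grep -q "REGRESSION PASSED" "$1" 2>/dev/null; then
        echo pass
    else
        echo fail
    fi
}

# A run that crashed midway leaves a truncated log without the marker.
echo "starting test..." > "$log"
stale_result=$(check_log "$log")

# Removing the old log (the role make clean plays here) and rerunning
# to completion gives a log that contains the marker.
rm -f "$log"
printf 'starting test...\nREGRESSION PASSED\n' > "$log"
fresh_result=$(check_log "$log")

echo "stale: $stale_result, fresh: $fresh_result"
rm -rf "$workdir"
```

The point is only that the check looks at whatever log is on disk, so a leftover log from a crashed run is reported as a failure until it is cleaned out.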
You might also consider running the pod_X1Y1_ruche_X8Y4_hbm machine. It is faster to compile and simulate.
I went ahead and switched to the pod_X1Y1_ruche_X8Y4_hbm machine, did make clean and then reran make regression (it takes forever!), but unfortunately:
You can run make regression -jN where N is some reasonable number.
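One way to pick N, as a rough sketch: use the CPU count the system reports. nproc comes from GNU coreutils; the fallback to 1 is my own assumption for systems without it. The script only prints the command instead of running it.

```shell
# Ask the system how many CPUs are available; fall back to 1 if nproc is missing.
jobs=$(nproc 2>/dev/null || echo 1)

# Print the command rather than running it, so this is safe to try anywhere.
echo "make regression -j$jobs"
```

Simulation jobs can be memory-hungry, so a value below the CPU count may be the more reasonable choice on a small machine.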
Verilator is a lot slower than VCS at the moment. There are optimizations, but we haven't explored them due to lack of resources.
Ignore this failure for now. I need to look into this, and it should not be related to Verilator. I would say that your system is working.
If you want more assurance, try running make regression from inside of the examples/cuda directory.
it should not be related to Verilator.
Presumably it's related to 8x4? It is concerning that it is a hardware assertion and not a test failure though.
I don't think it's related to 8x4. I think it's related to the icache, which now has to be written in series (and in blocks of 4 words). This test was likely not updated and we somehow didn't catch it.
Oh I see, this test may be manually writing I$ incorrectly and that would trigger the assertion. I was concerned because I thought it was a demand fill coming back out of order
Yeah, it's failing in VCS but not causing simulation to terminate. The test still passes. Will look into it.
Unfortunately neither command worked for me.
Here is the output of make regression -j4:
And here is the output of make regression in examples/cuda:
I bet the latter triggered the former, or there is probable cause.
In the end, I would call your installation working. My impression is there are some issues in our code that don't show up on our system so we'll have to spin up Ubuntu and iron them out there.
But you should be safe to develop.
However, if you're willing to poke a bit, can you comment out this line and re-run test_binary_load_buffer?
This issue is related to padding in the final RISC-V executable. I have diagnosed the issue and it should not show up in normal development. It does not cause VCS to fail.
(Breadcrumb for anyone following along: look at the output of nm in bsg_manycore/software/spmd/bsg_loader_suite/loopback_big_text. The text section is not aligned to 4 words/16 bytes. In part this is because of the asm(".zero 4192"). With asm(".zero 4188") it passes.)
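If you want to check a value from the nm output yourself, the 16-byte test is just a remainder. A small sketch follows; the two hex values are made-up examples, not taken from the actual binary.

```shell
# 4 words x 4 bytes per word = 16-byte blocks.
is_aligned16() {
    rem=$(( $1 % 16 ))
    if [ "$rem" -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned by $rem bytes"
    fi
}

# Hypothetical values; substitute an address or size read from nm.
a=$(is_aligned16 0x1070)
b=$(is_aligned16 0x1064)

echo "0x1070: $a"
echo "0x1064: $b"
```

Any value that leaves a nonzero remainder falls partway into a 4-word block, which is the condition described above.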
For reference I'm using the pod_X1Y1_ruche_X16Y8_hbm machine and the bigblade-verilator platform
Here is a screenshot of the error I'm getting:
Previous commands had the include -I/mnt/c/Projects/bsg_bladerunner/verilator/include/vltstd, which is where this svdpi.h is located:
But for some reason the command causing the issue doesn't have this include, which would seem to be the cause for the error that's happening.
I'm not entirely sure how to move forward in solving this issue since it seems to be caused by something internally, so I would appreciate any help. Thanks!