sstsimulator / sst-elements

SST Architectural Simulation Components and Libraries
http://www.sst-simulator.org

run .py with balar #2160

Open chaojibendan opened 1 year ago

chaojibendan commented 1 year ago

When I try to run

sst testBalar-testcpu.py --model-options='-c gpu-v100-mem.cfg -v -x vectorAdd/vectorAdd -t cuda_calls.trace'

I hit the same error reported in SST-GPGPUSim Integration #2090:

sst: symbol lookup error: /wafer/wyy/software/sst-elements-library-13.0.0/build/lib/sst-elements-library/libbalar.so: undefined symbol: SST_gpgpusim_numcores_equal_check

I followed the instructions in https://github.com/sstsimulator/sst-elements/blob/master/src/sst/elements/balar/README.md to build Balar, but I don't know how to run the setup_environment script correctly. The problem seems to be in source setup_environment sst.

First, here are the modifications I made to build Balar. Versions: GCC 11.3.0, CUDA 11.8, SST 13.0.0.

When installing GPGPU-Sim, I replaced line 47 in setup_environment with CC_VERSION=11.3.0, because the regex did not recognize CC_VERSION correctly. I also replaced line 46 in version_detection.mk with CC_VERSION := 11.3.0 and commented out line 49. Then I ran source setup_environment sst.

Next, I added the following line below line 170 in the Makefile, to work around an issue with the check in Balar:

if [ ! -f $(SIM_LIB_DIR)/libcudart_mod.so ]; then ln -s libcudart.so $(SIM_LIB_DIR)/libcudart_mod.so; fi

Finally, in src/cuda-sim/Makefile, I applied the changes from https://github.com/accel-sim/gpgpu-sim_distribution/pull/42/files. They address a compatibility issue with newer versions of Bison, such as Bison 3.8.2, which can cause compilation errors.
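The symlink step can also be done by hand instead of patching the Makefile. This is a minimal sketch, not part of GPGPU-Sim: the function name ensure_cudart_mod_link is hypothetical, and the directory to pass is whatever $(SIM_LIB_DIR) expands to in your build.

```shell
# Create libcudart_mod.so as a relative symlink to libcudart.so inside the
# given directory, but only if no file by that name is there yet.
# Mirrors the Makefile line quoted above; the function name is hypothetical.
ensure_cudart_mod_link() {
    lib_dir=$1
    if [ ! -f "$lib_dir/libcudart_mod.so" ]; then
        ln -s libcudart.so "$lib_dir/libcudart_mod.so"
    fi
}

# Usage: pass the directory that holds libcudart.so, for example
#   ensure_cudart_mod_link "$HOME/gpgpu-sim_distribution/lib/<your-build-dir>"
```

Calling the function a second time does nothing, so it is safe to rerun after each rebuild.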

hughes-c commented 1 year ago

The "undefined symbol: SST_gpgpusim_numcores_equal_check" error happens if the gpgpusim environment is not properly initialized or if balar is not built correctly. During configure, did it say that balar was being built? If configure is successful, you should see:

cuda : YES
balar : YES

chaojibendan commented 1 year ago

Yes, I'm sure.

Configuration Information (Found Dependencies):

                             boost : No
                              chdl : No
                              cuda : YES
                           dramsim : No
                          dramsim3 : No
                             dumpi : No
                             fdsim : No
                              glpk : No
                     goblin_hmcsim : No
                          gpgpusim : YES
                        hbmdramsim : No
                         hybridsim : No
                              libz : YES
                              llvm : No
                             metis : No
                         nvdimmsim : No
                               otf : No
                              otf2 : No
                           pintool : No
                 ptrace_set_tracer : YES
                              qsim : No
                         ramulator : No
                               shm : YES
                          spinlock : No
                             stake : No

Configuration Information (Make will build the following elements):

                             ariel : YES
                             balar : YES
                       cacheTracer : YES
                           cassini : YES
                           CramSim : YES
                             ember : YES
                           firefly : YES
                            hermes : YES
                          kingsley : YES
                      memHierarchy : YES
                           mercury : YES
                            merlin : YES
                           Messier : YES
                           miranda : YES
                               mmu : YES
                          prospero : YES
                           rdmaNic : YES
                             Samba : YES
                            shogun : YES
              simpleElementExample : YES
                         thornhill : YES
                           vanadis : YES
                         VaultSimC : YES
                            zodiac : YES

hughes-c commented 1 year ago

I can't replicate your problem with setup_environment. The symbol error is usually caused by not being able to find the gpgpu-sim libraries (a problem with the environment).
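One way to narrow this down is to ask a candidate library directly whether it defines the missing symbol. The sketch below assumes GNU nm from binutils is installed; the function name exports_symbol is hypothetical, and which GPGPU-Sim library is expected to provide the symbol is something to verify in your own build (libcudart_mod.so is only a guess based on the symlink workaround described earlier in this thread).

```shell
# Succeed if shared library $1 defines dynamic symbol $2.
# Uses a whole-word match so versioned names (sym@@VERSION) still count.
# Returns non-zero if the file is missing or nm cannot read it.
exports_symbol() {
    nm -D --defined-only "$1" 2>/dev/null | grep -qw -- "$2"
}

# Usage (the library path is a placeholder you must fill in):
#   if exports_symbol /path/to/libcudart_mod.so SST_gpgpusim_numcores_equal_check; then
#       echo "symbol is defined"
#   else
#       echo "symbol is NOT defined in this library"
#   fi
```

If no library on LD_LIBRARY_PATH defines the symbol, the GPGPU-Sim build in use probably does not contain the SST integration code at all.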

chaojibendan commented 1 year ago

The problem might be here?

source setup_environment sst
echo $LD_LIBRARY_PATH

The output is

/wafer/wyy/software/gpgpu-sim_distribution/lib/gcc-11.3.0/cuda-11080/sst:

But there is no such directory, only /wafer/wyy/software/gpgpu-sim_distribution/lib/gcc-11.3.0/cuda-11080/release.
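A check like this one can be automated. The following is a small sketch in POSIX shell that prints every entry of a colon-separated search path that is not an existing directory; the function name missing_path_dirs is hypothetical.

```shell
# Print each entry of the colon-separated path in $1 that is not a directory.
# The body runs in a subshell, so the IFS and globbing changes do not leak.
missing_path_dirs() (
    IFS=:
    set -f                      # keep entries with wildcard characters literal
    for entry in $1; do
        [ -n "$entry" ] || continue          # skip empty entries ("a::b")
        [ -d "$entry" ] || printf '%s\n' "$entry"
    done
)

# Report stale entries in the current environment, if any.
missing_path_dirs "${LD_LIBRARY_PATH:-}"
```

Run right after source setup_environment sst, this would list the .../cuda-11080/sst entry if that directory was never created by the build.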

chaojibendan commented 1 year ago

I think I know where the problem is. SST 13.0.0 still uses sst-gpgpusim, not gpgpusim. With gpgpusim I can build balar, but I can't trace the CUDA API, because cuda_calls.trace points to sst-gpgpusim instead of gpgpusim. gpgpusim now supports CUDA 11, while sst-gpgpusim only supports CUDA 10. In addition, newer versions of Bison and GCC cause compilation issues, which is why I had to make the modifications described in my first comment. Perhaps these problems can be addressed in the future.

hughes-c commented 1 year ago

@William-An Ah. Yeah, sst-gpgpusim is supposed to be merged with gpgpusim at some point. I will check on that status.