accel-sim / accel-sim-framework

This is the top-level repository for the Accel-Sim framework.
https://accel-sim.github.io

gpgpusim.config file generated from tuner containing improper values #293

Closed shkailas1 closed 2 months ago

shkailas1 commented 3 months ago

I wanted to create a config file for my GPU (GTX1080ti), and I was able to generate the attached file using the tuner. However, some values were left as 'X'. I assume this is because they cannot be inferred or do not apply to my GPU (for example, gpgpu_tensor_cores_avail). Those that did not apply I either changed to 0 (if the field is also 0 in the TITANX gpgpu.config file) or commented out (if I could not find the same field in the TITANX gpgpu.config file).

Rerunning the simulator, I kept receiving an error saying that 48 is an invalid value (perhaps from the -gpgpu_unified_l1d_size value):

[image]

When I comment out the "L1/shared memory configuration" section of parameters, I get a different error:

[image]

This makes it seem like the simulator is running but perhaps has a memory issue (maybe because I am simulating my GPU on that same GPU).

Could you please advise on how I should proceed? Thank you for your time!

Attachment: gpgpusim.config

JRPan commented 3 months ago

48 is an error because gpgpu_shmem_option is empty. Remove gpgpu_shmem_option but keep gpgpu_unified_l1d_size.

I don't see the errors in your second file. Inspect the *.e files to see what the error message is.
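For anyone hitting the same problem with a tuner-generated file, here is a minimal Python sketch that lists options left empty or still set to the 'X' placeholder. It assumes the usual `-option value` line layout with `#` comments; the function name `find_unset_options` and the sample values are hypothetical and are not taken from the attached file.

```python
def find_unset_options(config_text, placeholder="X"):
    """Return the names of options whose value is missing or equals the placeholder."""
    unset = []
    for raw in config_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop trailing comments
        if not line.startswith("-"):
            continue  # skip blank lines and comment-only lines
        parts = line.split(None, 1)
        name = parts[0].lstrip("-")
        value = parts[1].strip() if len(parts) > 1 else ""
        if value == "" or value == placeholder:
            unset.append(name)
    return unset


# Illustrative sample only; the real values depend on the tuner output.
sample = """\
# L1/shared memory configuration
-gpgpu_unified_l1d_size 48
-gpgpu_shmem_option
-gpgpu_tensor_cores_avail X
"""

print(find_unset_options(sample))
```

Each reported option then needs either a real value or removal from the file, as with gpgpu_shmem_option above.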

shkailas1 commented 3 months ago

Hi, thank you for the help. After making this change, 8 of the 10 tests reported "no error" instead of passing. These two tests failed:

nn-rodinia-2.0-ft
streamcluster-rodinia-2.0-ft

This is what the error looked like in their *.e files:

home/runner/accel-sim/accel-sim-framework/util/job_launching/../../sim_run_11.0/nn-rodinia-2.0-ft/data_filelist_4_3_3090data_filelist_4_3_30_90_result_txt/GTX1080ti/slurm.sim: line 52: 6629 Killed /home/runner/accel-sim/accel-sim-framework/util/job_launching/../../sim_run_11.0/gpgpu-sim-builds/accelsim-commit-2260456ea5e6a1420f5734f145a4b7d8ab1d4737_modified_2.0/accel-sim.out -config ./gpgpusim.config -trace ./traces/kernelslist.g

JRPan commented 3 months ago

looks like you are out of memory?
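A shell message of the form `line N: PID Killed ...` means the process received SIGKILL, and on Linux the kernel's out-of-memory killer is a common cause. As a rough sketch (the helper name `find_killed_pids` and the sample path are hypothetical), the *.e files could be scanned for such lines like this:

```python
import re

# Matches the shell's report of a killed job, e.g. "line 52: 6629 Killed ..."
KILLED = re.compile(r"line \d+:\s+(\d+)\s+Killed\b")


def find_killed_pids(log_text):
    """Return the PIDs that the shell reported as killed in the given log text."""
    return [int(m.group(1)) for m in KILLED.finditer(log_text)]


# Illustrative log line with a made-up path.
sample_log = "/tmp/example/slurm.sim: line 52: 6629 Killed /tmp/example/accel-sim.out\n"

print(find_killed_pids(sample_log))
```

If a PID shows up here, checking the kernel log (for example with `dmesg`) around the time of the run should show whether the OOM killer was responsible.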