Open SoroushHeidari opened 4 months ago
Add `-gpgpu_concurrent_kernel_sm 1` to your config file. Your kernels must be small enough, though: if one kernel is big enough to fill all SMs, only that kernel will run. You can change this behavior by modifying the `select_kernel` function to issue a kernel to only a subset of SMs.

You also probably want to check out the dev-stream-stats branch: https://github.com/accel-sim/accel-sim-framework/tree/dev-stream-stats
By default, all stats are aggregated, which does not make sense if you have concurrency. That branch changes this so that stats are collected per stream. It needs to be paired with this branch of gpgpu-sim: https://github.com/accel-sim/gpgpu-sim_distribution/tree/stream-stats
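To make the "subset of SMs" idea concrete, here is a rough sketch of one possible partitioning policy. It is only an illustration: `kernel_for_sm` is a made-up helper, not GPGPU-Sim's `select_kernel`, and the simulator's real signature and bookkeeping are not reproduced here. The assumption is that each SM is statically tied to one running kernel, so a large kernel cannot occupy every SM.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical helper (NOT part of GPGPU-Sim): returns the index of the
// running kernel that SM `sm_id` may take thread blocks from, or -1 if no
// kernel is running.
//
// SM s is mapped to kernel floor(s * K / N), where N is the number of SMs
// and K the number of running kernels. This yields contiguous, near-equal
// groups of SMs, one group per kernel.
int kernel_for_sm(std::size_t sm_id, std::size_t num_sms,
                  std::size_t num_kernels) {
    assert(sm_id < num_sms);  // caller must pass a valid SM index
    if (num_kernels == 0) {
        return -1;  // nothing to schedule
    }
    if (num_kernels > num_sms) {
        // More kernels than SMs: only the first num_sms kernels get an SM
        // in this sketch; the rest wait until a slot frees up.
        num_kernels = num_sms;
    }
    return static_cast<int>(sm_id * num_kernels / num_sms);
}
```

With 80 SMs and 2 running kernels, SMs 0 to 39 would serve the first kernel and SMs 40 to 79 the second. A modified `select_kernel` could apply a check like this before handing a kernel to a given SM, but how that fits into the existing code is something you would need to work out against the source.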
Thank you for getting back to me quickly! I have a question about simulating DNN inference and training with Accel-sim. I would like to exclude the first few iterations, commonly referred to as "warm-up" iterations, from the final stats report. Is there a way to do this?
Thank you for your contribution. I wanted to ask whether you have any method for simulating concurrent execution on a GPU. NVIDIA provides three mechanisms to support concurrent applications: priority streams, time-slicing, and Multi-Process Service (MPS). Is there a way to emulate this concurrency behavior in the simulator? If not, do you have any suggestions for approaching this problem?