GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.
-gpgpu_perf_sim_memcpy 1 # Fill the L2 cache on memcpy
I found that this option is enabled by default in every configuration file, but I don't understand why.
When it is on, copying data from host to device with cudaMemcpy makes GPGPU-Sim fill the L2 cache automatically.
As a result, some memory accesses hit in the L2 cache even though those addresses have never been accessed before.
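
To make the effect I am describing concrete, here is a toy sketch in Python. It is not GPGPU-Sim code: the `ToyL2` class, the `first_touch_hits` helper, and the 128-byte line size are all assumptions made up for illustration, and the model ignores capacity, eviction, and banking. It only shows how filling the cache at copy time turns would-be cold misses into hits.

```python
# Toy model only: not GPGPU-Sim code. Names and line size are assumptions.
LINE_SIZE = 128  # bytes per cache line (assumed for illustration)


class ToyL2:
    """Unbounded set of resident line indices; no eviction, no banking."""

    def __init__(self, fill_on_memcpy):
        self.fill_on_memcpy = fill_on_memcpy
        self.lines = set()

    def memcpy_h2d(self, dst, nbytes):
        """Model a host-to-device copy of nbytes to device address dst."""
        if not self.fill_on_memcpy or nbytes <= 0:
            return
        first = dst // LINE_SIZE
        last = (dst + nbytes - 1) // LINE_SIZE
        # Mark every line touched by the copy as resident in the cache.
        self.lines.update(range(first, last + 1))

    def read(self, addr):
        """Return True on a hit; on a miss, allocate the line and return False."""
        line = addr // LINE_SIZE
        hit = line in self.lines
        self.lines.add(line)
        return hit


def first_touch_hits(fill_on_memcpy):
    """Copy 1024 bytes, then read each line once and count the hits."""
    l2 = ToyL2(fill_on_memcpy)
    l2.memcpy_h2d(dst=0, nbytes=1024)
    return sum(l2.read(addr) for addr in range(0, 1024, LINE_SIZE))


if __name__ == "__main__":
    print("fill on memcpy:", first_touch_hits(True))
    print("no fill on memcpy:", first_touch_hits(False))
```

With the fill enabled, every first read in this toy model is a hit. With it disabled, every first read is a cold miss, which is what I would have expected by default.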