gpgpu-sim / gpgpu-sim_distribution

GPGPU-Sim provides a detailed simulation model of contemporary NVIDIA GPUs running CUDA and/or OpenCL workloads. It includes support for features such as TensorCores and CUDA Dynamic Parallelism, as well as a performance visualization tool, AerialVision, and an integrated energy model, GPUWattch.

libcudnn seems not to call cudaLaunchKernel in GPGPU-Sim. #113

Open RedCarrottt opened 5 years ago

RedCarrottt commented 5 years ago

I've tried to run cudnn_samples_v7 with GPGPU-Sim, but its cuDNN kernels do not run on GPGPU-Sim. It produces the following messages when "g_debug_execution = 3":

GPGPU-Sim PTX: CUDA API function "cudaError_t cudaMemcpy(void*, const void*, size_t, cudaMemcpyKind)" has been called.
GPGPU-Sim PTX: cudaMemcpy(): devPtr = 0xc01a5300
GPGPU-Sim API: Stream Manager State
GPGPU-Sim API:    stream 0 has 1 operations
GPGPU-Sim API:       0 :  stream operation memcpy host-to-device
GPGPU-Sim: ** START simulation thread (detected work) **
GPGPU-Sim API: Stream Manager State
GPGPU-Sim API:    stream 0 has 1 operations
GPGPU-Sim API:       0 :  stream operation memcpy host-to-device
GPGPU-Sim API: stream 0 performing memcpy host-to-device
GPGPU-Sim PTX: copying 3136 bytes from CPU[0x7fffdc43aa00] to GPU[0xc01a5300] ...  done.
GPGPU-Sim: ** STOP simulation thread (no work) **
GPGPU-Sim: *** simulation thread starting and spinning waiting for work ***
Testing cudnnGetConvolutionForwardAlgorithm ...
Fastest algorithm is Algo 1
Testing cudnnFindConvolutionForwardAlgorithm ...
^^^^ CUDNN_STATUS_SUCCESS for Algo 0: 0.020256 time requiring 0 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 1: 0.029696 time requiring 3464 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 2: 0.037888 time requiring 57600 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 4: 0.070240 time requiring 207360 memory
^^^^ CUDNN_STATUS_SUCCESS for Algo 7: 0.072352 time requiring 2057744 memory

GPGPU-Sim PTX: CUDA API function "cudaError_t cudaMalloc(void**, size_t)" has been called.
GPGPU-Sim PTX: allocating 46080 bytes on GPU starting at address 0xc01a6000
GPGPU-Sim PTX: cudaMallocing 46080 bytes starting at 0xc01a6000..

The cudaLaunchKernel function should be called after "Testing cudnnFindConvolutionForwardAlgorithm", but it is never called.

On the other hand, if I run a plain CUDA sample (such as vectorAdd), it works well.

I suspect that my cuDNN library does not call the cudaLaunchKernel function in GPGPU-Sim's 'libcudart.so'; it seems to call cudaLaunchKernel in the original 'libcudart.so' instead.

RedCarrottt commented 5 years ago

As @bigwater advised, I used https://github.com/gpgpu-sim/gpgpu-sim_simulations. Before building mnistCUDNN in that repository, I placed the original libcudart.so and libcudart_static.a in /usr/local/cuda/lib64 (without the original libcudart_static.a, the build fails). When I run it, libcudnn successfully calls cudaLaunchKernel.

However, I then ran into a deadlock and the simulator terminated. The full log file is attached:

mnistCUDNN.log

Even though mnistCUDNN now successfully calls the cudaLaunchKernel() function, PyTorch still fails to call it.