echoGee opened this issue 6 years ago (status: Open)
@echoGee did you check the "Enable concurrent kernel profiling" checkbox in the Settings tab of the NVIDIA Visual Profiler?
@tarkook: I see it referenced in http://docs.nvidia.com/cuda/profiler-users-guide/index.html, but I could not find that option in the Visual Profiler. Any pointers?
System information (version)
Detailed description
Multiple different convolutions launched on separate streams do not run concurrently. I have not used any compiler flags such as
--default-stream per-thread
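For readers trying to reproduce this, here is a minimal sketch of the pattern described above, written against the plain CUDA runtime API. It is not the reporter's code: `busyKernel`, the stream count, and the buffer sizes are hypothetical stand-ins for the convolutions. Because every launch names an explicitly created stream and nothing is issued to the default stream in between, this pattern should not need `--default-stream per-thread` to be eligible for overlap. Whether the kernels actually overlap still depends on the GPU having spare resources and on the profiler being set up to record concurrent kernels.

```cuda
#include <cstdio>
#include <cuda_runtime.h>

// Hypothetical stand-in for one convolution: burns time on each element.
__global__ void busyKernel(float* data, int n, int iters) {
    int i = blockIdx.x * blockDim.x + threadIdx.x;
    if (i < n) {
        float v = data[i];
        for (int k = 0; k < iters; ++k) {
            v = v * 1.0001f + 0.5f;
        }
        data[i] = v;
    }
}

int main() {
    const int kStreams = 4;   // assumed number of independent operations
    const int n = 1 << 12;    // kept small so one launch does not fill the GPU
    const int threads = 256;
    const int blocks = (n + threads - 1) / threads;

    cudaStream_t streams[kStreams];
    float* buffers[kStreams];

    // Setup: one stream and one device buffer per operation.
    for (int s = 0; s < kStreams; ++s) {
        cudaStreamCreate(&streams[s]);
        cudaMalloc(&buffers[s], n * sizeof(float));
        cudaMemsetAsync(buffers[s], 0, n * sizeof(float), streams[s]);
    }

    // Each launch goes to its own stream, so the launches are independent
    // and the runtime is free to overlap them if resources allow.
    for (int s = 0; s < kStreams; ++s) {
        busyKernel<<<blocks, threads, 0, streams[s]>>>(buffers[s], n, 100000);
    }

    cudaError_t launchErr = cudaGetLastError();    // launch-time errors
    cudaError_t syncErr = cudaDeviceSynchronize(); // execution-time errors
    printf("launch: %s, sync: %s\n",
           cudaGetErrorString(launchErr), cudaGetErrorString(syncErr));

    for (int s = 0; s < kStreams; ++s) {
        cudaFree(buffers[s]);
        cudaStreamDestroy(streams[s]);
    }
    return (launchErr == cudaSuccess && syncErr == cudaSuccess) ? 0 : 1;
}
```

Assuming the sketch is saved as `streams_sketch.cu` (a made-up file name), it can be built with `nvcc streams_sketch.cu -o streams_sketch` and the resulting timeline inspected in the profiler. If these kernels overlap there but the convolutions do not, the difference is more likely in how the convolution calls use streams or device resources than in the default-stream compiler flag.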
Steps to reproduce
The code is in a .cpp file.