Closed udaythummalapalli closed 5 years ago
Hi, thanks for the questions.
For profiling (Q1 & Q3), we recommend using PIX. I'm not sure what it'll make of CUDA, but ultimately that all boils down to WDDM work submission, so at least some of it should show up in timing captures. In addition to PIX there is also GPUView (which ships as part of the Windows Performance Toolkit), which is good at giving an all-up system view of how the GPU is being used and shared across all processes, so you can see exactly how the various workloads are effectively sharing the GPU.
Q2: In general, CPU thread priority doesn’t impact GPU scheduling priorities.
There are ways to impact GPU scheduling priorities, but that's typically best avoided by regular applications. Windows will automatically boost any process that is in focus. In specific instances (like VR) there are mechanisms for a process to ask for a global "realtime" priority on its work submission… but applications that do this are required to have admin privileges and have to be incredibly careful to avoid totally screwing up the user experience. For regular compute apps, there should be no reason to touch our default policies.
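For reference, one such mechanism is D3D12's command queue priority. This is a hedged, Windows-only sketch (not tied to the thread's WinML internals): it assumes you already have an `ID3D12Device` and shows why global realtime is gated — the driver is queried first, and creation fails without the required privilege.

```cpp
// Sketch: requesting GLOBAL_REALTIME priority on a D3D12 compute queue.
// Windows-only; requires d3d12.h and linking against d3d12.lib.
// `device` is assumed to be an already-created ID3D12Device; without
// elevated privileges the realtime request is expected to fail.
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

HRESULT CreateRealtimeComputeQueue(ID3D12Device* device,
                                   ComPtr<ID3D12CommandQueue>& queue)
{
    // Ask the driver whether GLOBAL_REALTIME is supported for this
    // queue type from this process (privilege checks happen here too).
    D3D12_FEATURE_DATA_COMMAND_QUEUE_PRIORITY query = {};
    query.CommandListType = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    query.Priority = D3D12_COMMAND_QUEUE_PRIORITY_GLOBAL_REALTIME;
    HRESULT hr = device->CheckFeatureSupport(
        D3D12_FEATURE_COMMAND_QUEUE_PRIORITY, &query, sizeof(query));
    if (FAILED(hr) || !query.PriorityForTypeIsSupported)
        return E_FAIL;  // unsupported, or insufficient privilege

    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;
    desc.Priority = D3D12_COMMAND_QUEUE_PRIORITY_GLOBAL_REALTIME;
    return device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue));
}
```

An ordinary app would leave `Priority` at `D3D12_COMMAND_QUEUE_PRIORITY_NORMAL` (or at most `HIGH`), which matches the advice above that regular compute apps shouldn't touch the default policies.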
I'm new to ONNX runtime.
We have several CUDA projects and OpenGL usage in our product. How would they interact with WinML's (ONNX) GPU-based inferencing in terms of process prioritization? For example:
Process A (Above Normal priority): OpenGL rendering + real-time CUDA processing + WinML (ONNX) inference for model 1
Process B (Below Normal priority): WinML (ONNX) inference for models 2 and 3
Q1. In Process A, how can I profile CUDA and ONNX usage together? nvprof? Visual Studio profiler?
Q2. Would Process A's GPU usage take priority over Process B's ML usage?
Q3. How can I profile Process B's ML usage?