Closed cameron-martin closed 4 months ago
@cameron-martin regarding "but changing the device to CPU": dyno gputrace only supports tracing PyTorch executing on GPU.
cc: @briancoutinho to confirm
Why does it require a GPU if both dynolog and kineto claim to support CPU profiling?
It looks like libkineto_init handles cupti not being available gracefully. I'm failing to see what is causing this to fail, but I'll keep digging.
@cameron-martin Yes, actually dynolog and kineto support CPU-only profiling too. Tbh we didn't test the on-demand tracing flow on a pure CPU version of PyTorch. I'll give it a try with the latest release of CPU torch, but please do share the versions you used as well.
You are right that libkineto_init() is where the actual registration with dynolog happens. libkineto_init() is invoked in multiple places.
Actually we wanted (1) to always be invoked, but I think some define, like this ENABLE_GLOBAL_OBSERVER thing, is probably compiling it out.
Any chance you are using an Apple system or PyTorch edge?
#if defined(__APPLE__) || defined(EDGE_PROFILER_USE_KINETO)
#define ENABLE_GLOBAL_OBSERVER (0)
#else
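To illustrate how a define like the one above could silently compile out an initialization path, here is a minimal, self-contained sketch. It is not PyTorch or kineto code: every name in it (SKETCH_ENABLE_GLOBAL_OBSERVER, SKETCH_EDGE_BUILD, sketch_init, RegisterAtLoad, start_profiler) is hypothetical, and it assumes the "always invoked" path is a static object whose constructor runs at library load time.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for the platform check quoted in the thread.
#if defined(__APPLE__) || defined(SKETCH_EDGE_BUILD)
#define SKETCH_ENABLE_GLOBAL_OBSERVER (0)
#else
#define SKETCH_ENABLE_GLOBAL_OBSERVER (1)
#endif

// Log of every call site that reached the init function, in call order.
// A function-local static avoids static-initialization-order problems.
std::vector<std::string>& init_log() {
  static std::vector<std::string> log;
  return log;
}

// Stand-in for a libkineto_init()-style entry point (assumption: idempotent).
// Returns true only for the call that actually performs the initialization.
bool sketch_init(const std::string& caller) {
  assert(!caller.empty());
  static bool initialized = false;
  init_log().push_back(caller);
  if (initialized) return false;
  initialized = true;
  return true;
}

#if SKETCH_ENABLE_GLOBAL_OBSERVER
namespace {
// Load-time path: the constructor runs before main(), so no user code
// change is needed. When the macro is 0, this whole block disappears.
struct RegisterAtLoad {
  RegisterAtLoad() { sketch_init("static-initializer"); }
};
RegisterAtLoad register_at_load;
}  // namespace
#endif

// Explicit path: only reached when user code starts a profiler itself.
bool start_profiler() { return sketch_init("explicit-profiler"); }

// True if the load-time path was the first caller of sketch_init().
bool initialized_at_load() {
  return !init_log().empty() && init_log().front() == "static-initializer";
}
```

In this sketch, when the macro evaluates to 0 the process only registers once start_profiler() is called explicitly, which would be consistent with the reported behaviour of seeing matched processes only after wrapping the example in a profiler.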
I'm using:
I also compiled HEAD of dynolog and tested but still get the same results.
@cameron-martin thanks for the info. I think the issue is on the PyTorch side, in that libkineto is not getting initialized. I'll try to repro the CPU-only setup and get back to you.
Working on two WIP fixes: https://github.com/pytorch/kineto/pull/861 (have to land this first) and https://github.com/pytorch/pytorch/pull/118320
Ok, all fixes are in https://github.com/pytorch/pytorch/pull/118320 :) Can you try out the PyTorch nightly build to see if this works? I tried it out while developing the PR. Let me know.
Just tested this with nightly torch and it works great, thanks!
After setting up dynolog with --enable_ipc_monitor, I tried running the example (but changing the device to CPU) like so:
Then if I run dyno gputrace, I get the following:
If I wrap the example in a profiler, then I do get matched processes. However, I thought the point was that it required no code modifications?
This does create a trace, but it doesn't contain much useful info. See attached file.
dynolog.json
What am I doing wrong?