facebookincubator / dynolog

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system like the linux kernel, CPU, disks, Intel PT, GPUs etc. Dynolog also integrates with pytorch and can trigger traces for distributed training applications.
MIT License

Can I run dynolog in a container? #160

Closed yaning223 closed 8 months ago

yaning223 commented 1 year ago

I can build the packages in a container, copy them out of the container, and run them on my host. Can I also run dynolog directly inside a container? The command "sudo systemctl start dynolog" is hard to run in the container.

briancoutinho commented 12 months ago

@yaning223 sorry about the late response. Within Docker containers you do not have systemctl, so that doesn't work. You should be able to run dynolog directly in the container for sure.

You could do something like this in your Dockerfile:

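FROM --platform=linux/amd64 amd64/ubuntu:20.04

# wget is not in the base image, so install it first
RUN apt-get update && apt-get install -y wget

# install dynolog
RUN wget https://github.com/facebookincubator/dynolog/releases/download/v0.2.2/dynolog_0.2.2-0-amd64.deb
RUN dpkg -i dynolog_0.2.2-0-amd64.deb

# start dynolog when the container runs (a RUN step here would block the image build)
CMD dynolog <add flags or you can COPY a flagsfile as well>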

The flags reference is in the README.md. Could you share how you want to use dynolog? We can help with more details.

Here is one example useful for running CUDA applications: https://gist.github.com/anupambhatnagar/10b64cfab72145cbad33696332f5a1c7

yaning223 commented 11 months ago

Thanks for the reply! I want to run a PyTorch program and have dynolog trace it, with both dynolog and the program running in the same container. How can I do that? For example, can I use the pid that is only valid inside the container as the pid flag?

briancoutinho commented 11 months ago

@yaning223 sorry about the delay. You can run dynolog as a background process in your container. Just install the package in the Dockerfile (shown in the link above). Then you can either run dynolog in a tmux session:

dynolog --enable-ipc-monitor

or put it in the background and send its output to a log file:

dynolog --enable-ipc-monitor > /tmp/dynolog.log 2>&1 &

Then you can run your PyTorch program and use the dyno gputrace command. If you run all 3 (dynolog, PyTorch, dyno gputrace) in the Docker container, it should work. Could you give that a try? cc @anupambhatnagar who has used this flow
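For reference, the three steps above could be scripted roughly as follows. This is only a sketch under some assumptions that are not stated in this thread: the KINETO_USE_DAEMON=1 environment variable and the --log-file option of dyno gputrace are taken from the dynolog documentation as I understand it, the log paths and the 10 second wait are arbitrary, and xor.py stands in for your own training script. The run, run_bg, and trace_flow helpers are hypothetical names. By default the script only prints the commands (DRY_RUN=1) so you can check them first.

```shell
#!/usr/bin/env bash
# Sketch of the in-container flow: dynolog daemon, PyTorch job, then dyno gputrace.
# DRY_RUN=1 (the default) only prints each command; set DRY_RUN=0 to execute them.
DRY_RUN="${DRY_RUN:-1}"

# Run a command in the foreground, or print it in dry-run mode.
run() {
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Run a command in the background with stdout and stderr sent to a log file.
run_bg() {
  local log="$1"
  shift
  if [ "$DRY_RUN" = "1" ]; then
    echo "+ $* > $log 2>&1 &"
  else
    "$@" > "$log" 2>&1 &
  fi
}

trace_flow() {
  local script="$1"
  # 1. Start the daemon with the IPC monitor so the PyTorch process can talk to it.
  run_bg /tmp/dynolog.log dynolog --enable-ipc-monitor
  # 2. Start the training job. KINETO_USE_DAEMON=1 is an assumption taken from
  #    the dynolog docs, not from this thread.
  run_bg /tmp/train.log env KINETO_USE_DAEMON=1 python3 "$script"
  # 3. Give the job some time to start up (arbitrary delay).
  run sleep 10
  # 4. Ask dynolog to trigger a GPU trace; the output path is an example value.
  run dyno gputrace --log-file /tmp/libkineto_trace.json
}
```

Calling trace_flow xor.py prints the four commands in order. Running the same call with DRY_RUN=0 inside the container executes them, provided dynolog, dyno, and PyTorch are installed there.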

anupambhatnagar commented 11 months ago

@yaning223 here's the Dockerfile you can use to build the Docker container. We have tested xor.py in this container using the flow @briancoutinho mentioned. https://github.com/facebookincubator/dynolog/blob/main/dynolog_hta.dockerfile

Here's a small cheatsheet with commands to help you build and run the Docker image. https://gist.github.com/anupambhatnagar/07ebff374bc45e4b63eb42893cca7e87

yaning223 commented 11 months ago

It works now! Thank you so much for your patient guidance!