Closed stricklandye closed 7 months ago
Hi @stricklandye, for the Docker/HTA demo, dynolog is not configured to emit the metrics. Try adding the flags below; they also let you configure the logging/measurement interval:
dynolog --enable-perf-monitor --use_JSON --kernel_monitor_reporting_interval_s 10 --perf_monitor_reporting_interval_s 10
Let us know if this works. Flag reference: https://github.com/facebookincubator/dynolog/blob/main/dynolog/src/Main.cpp#L44
PS: There is early-stage support for Prometheus too. See the test plan in the PR below for instructions; you may need to merge the Dockerfile commands from that test plan with yours above. https://github.com/facebookincubator/dynolog/pull/181
PS: OpenSSL is probably being pulled in by one of our dependencies (cpr). We will get back to you on it.
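For anyone scripting the container entrypoint, the flags suggested above can be assembled programmatically. This is only a minimal sketch: the flag spellings are copied verbatim from the reply (check them against the Main.cpp flag reference), while the helper name `build_dynolog_cmd` and the `interval_s` parameter are made up for illustration and are not part of dynolog.

```python
import shutil
import subprocess
from typing import List


def build_dynolog_cmd(interval_s: int = 10) -> List[str]:
    """Assemble the dynolog command line suggested in this thread.

    Hypothetical helper: only the flag names come from the reply above.
    """
    return [
        "dynolog",
        "--enable-perf-monitor",  # spelling copied verbatim from the reply
        "--use_JSON",
        "--kernel_monitor_reporting_interval_s", str(interval_s),
        "--perf_monitor_reporting_interval_s", str(interval_s),
    ]


if __name__ == "__main__":
    cmd = build_dynolog_cmd(interval_s=10)
    if shutil.which("dynolog") is None:
        # dynolog is not on PATH: just show the command that would be run.
        print(" ".join(cmd))
    else:
        # Runs in the foreground until dynolog exits.
        subprocess.run(cmd, check=True)
```

Using one value for both reporting intervals mirrors the command in the reply; the two intervals are separate flags and can be set independently.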
Sorry for not replying in time. Using --use_JSON works well. I will close this issue.
@briancoutinho Hi, I also have some questions: the doc says that KINETO_USE_DAEMON=1 and dynolog --enable_ipc_monitor are required to trace GPU. What is the right way to trace GPU inside a Kubernetes cluster? If dynolog runs as a daemon service on the host, it seems it cannot trace an AI program running inside a container, and it is not feasible to bundle dynolog with every container image either.
Hi,
We haven’t actually tried this on Docker containers. It does work if dynolog runs as root and the containers are based on Linux cgroups. Inside Meta we have a Docker-like system (twine). Docker may use cgroups too but need some special permission settings. How about filing an issue for “support dynolog tracing on docker”? We will do some research on it.
Actually, I am out on vacation for a few weeks; someone from my team at Meta could help out :)
Best, Brian
On Tue, Nov 21, 2023 at 1:26 AM strickland @.***> wrote:
@briancoutinho Hi, I also have several questions:
- The doc says that KINETO_USE_DAEMON=1 and dynolog --enable_ipc_monitor are required to trace GPU. What is the right way to trace GPU inside a Kubernetes cluster? If dynolog runs as a daemon service, it seems it cannot trace an AI program running inside a container, and it is not feasible to bundle dynolog with every container image.
- Is the performance overhead of dynolog GPU trace high?
I see. Have a great time :D
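As a reference for the GPU tracing setup discussed above, here is a rough sketch of the non-containerized case only, where the training process and dynolog run on the same host. It assumes dynolog has already been started separately with --enable_ipc_monitor. The helper `kineto_daemon_env` and the script name `train.py` are hypothetical. This does not answer the Kubernetes question, which the reply above leaves open.

```python
import os
import subprocess
import sys

TRAIN_SCRIPT = "train.py"  # hypothetical training script, not from the thread


def kineto_daemon_env(base_env=None):
    """Return a copy of the environment with KINETO_USE_DAEMON=1 set.

    Hypothetical helper: only the variable name comes from the thread.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["KINETO_USE_DAEMON"] = "1"
    return env


if __name__ == "__main__":
    if os.path.exists(TRAIN_SCRIPT):
        # Launch the training job with the variable set; dynolog must
        # already be running with --enable_ipc_monitor on the same host.
        subprocess.run(
            [sys.executable, TRAIN_SCRIPT],
            env=kineto_daemon_env(),
            check=True,
        )
    else:
        print(f"{TRAIN_SCRIPT} not found; it would be run with KINETO_USE_DAEMON=1")
```

Setting the variable on a copy of the environment keeps it scoped to the training process rather than the whole shell session.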
Hi there. Why can't dynolog capture the CPU metrics, and why is there no log in /var/log/dynolog.log? Is there something wrong with the way I use it?
How to Reproduce
By the way, I have also tried to run dynolog on the host, but dynolog is not compatible with the OpenSSL that is already installed (OpenSSL 3.0.2). It seems that dynolog requires OpenSSL 1.1.1, which will soon no longer be maintained (according to the OpenSSL documentation). So I think it would be better to use the newer OpenSSL.