davidmarkclements / 0x

🔥 single-command flamegraph profiling 🔥

0x still generating tick output despite --quiet and --silent, in a container #274

Closed a-roberts closed 1 year ago

a-roberts commented 1 year ago

Hi there, I'm looking to do some profiling for an enormous Node.js app with many different libraries and 0x looks to be an awesome tool to help me easily do exactly that*.

To cut a long story short, I am running 0x inside a container on OpenShift, with a custom lifecycle/preExit hook that should let me retrieve the output file for later analysis.

I'm quite certain this is down to my own user error, but I can't figure it out and hence this ask for help...

In my image I have installed top and ps through microdnf, and 0x globally.

If I get some extra wide ps output, I can see this is happening in my container:

sh-4.4$ ps -aef ww
UID          PID    PPID  C STIME TTY      STAT   TIME CMD
1000690+       1       0 48 14:27 ?        Rsl    0:34 node /opt/app-root/src/.npm-global/bin//0x --quiet --silent --collect-only --output-dir /tmp/profiling -- node --dns-result-order=ipv4first server.js
1000690+      14       1 50 14:27 ?        Sl     0:35 /usr/bin/node --prof --logfile=/opt/ibm/app/%p-v8.log --print-opt-source -r /opt/app-root/src/.npm-global/lib/node_modules/0x/lib/preload/no-cluster -r /opt/app-root/src/.npm-global/lib/node_modules/0x/lib/preload/redir-stdout -r /opt/app-root/src/.npm-global/lib/node_modules/0x/lib/preload/soft-exit --dns-result-order=ipv4first server.js
1000690+      23       0  0 14:28 pts/0    Ss     0:00 sh
1000690+      36      23  0 14:28 pts/0    R+     0:00 ps -aef ww

As you can see, there are two node processes. Judging by its parent process ID, PID 14 is the child, i.e. the node --prof process that 0x is supposed to spawn. You might also notice that --prof doesn't have any flags of its own; presumably 0x is supposed to "chew up" and consume the output from that process when --quiet and --silent are set.

The output, however, is still present, and never-ending (because why wouldn't it be, I've not sent a SIGHUP yet).

It's a fast way to get a pod evicted: I see ticks such as these until etcd eventually decides what I'm doing isn't right.

tick,0x7f23b0272a5f,586821,0,0x0,3,0x5631cd9824a0,0x39a54e5faa4d,0x39a54e5fa76a,0x39a54e5f9070,0x39a54e5f53a9
tick,0x7f23b0272a5f,586884,0,0x0,3,0x5631cd9824a0,0x39a54e5faa4d,0x39a54e5fa76a,0x39a54e5f9070,0x39a54e5f53a9
tick,0x7f23b0272a5f,586959,0,0x0,3,0x5631cd9824a0,0x39a54e5faa4d,0x39a54e5fa76a,0x39a54e5f9070,0x39a54e5f53a9
tick,0x7f23b0272a5f,587020,0,0x0,3,0x5631cd9824a0,0x39a54e5faa4d,0x39a54e5fa76a,0x39a54e5f9070,0x39a54e5f53a9

If I docker run the same image I'm using in my pod (through a simple docker run -it (the image) sh), my process starts as expected with profiling, which indicates 0x is working great: nice and quiet until I do a Ctrl-C.

I can't share the full image sadly since what I'm working on isn't open-source, but any thoughts would be appreciated.

The entrypoint to my application is as follows:

exec env 0x --quiet --silent --collect-only --output-dir /tmp/profiling -- node --dns-result-order=ipv4first server.js

What might I be missing, please?


*For a little background and rationale, I would use --prof based on the simple profiling blog I've seen, but AFAIK that has no customisable output dir - although --cpu-prof might do the same thing and does allow a directory override. It's not easy for me to write to the current directory, short of tweaking my Dockerfile a bit, hence why I've gone with using 0x.
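For anyone comparing the two approaches, here is a minimal sketch of what the --cpu-prof alternative mentioned above could look like as an entrypoint. It relies on Node's documented --cpu-prof and --cpu-prof-dir flags; the PROFILE_DIR variable and its /tmp/profiling default are assumptions carried over from my 0x command, and I have not run this in the pod. The sketch only prints the command it would run, so it can be tried without starting the app.

```shell
#!/bin/sh
# Sketch only: PROFILE_DIR is a hypothetical variable, defaulting to the
# directory used with 0x above.
PROFILE_DIR="${PROFILE_DIR:-/tmp/profiling}"

# Make sure the output directory exists before node starts.
mkdir -p "$PROFILE_DIR"

# Build the command as positional parameters so the arguments stay intact.
set -- node --cpu-prof --cpu-prof-dir="$PROFILE_DIR" \
  --dns-result-order=ipv4first server.js

# Dry run: show the command. In a real entrypoint, replace this with: exec "$@"
echo "$@"
```

As far as I understand it, the .cpuprofile file is only written when the process exits, so the app would still need to shut down cleanly for the preExit hook to have anything to copy out.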

Note: if I find the solution before anyone else, I shall reply to myself with it, and close, for anyone else who stumbles upon this. Thanks in advance to anyone who stumbles upon this!

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 5 days with no activity.