spuyet opened 5 years ago
Yep. I don't agree with the two lines per process though: I'm the one who asked for one line, because we have a lot of processes in production and it means you can see twice as much in one console screen.
Also @w3st3ry, you (still) didn't take into account half of the things I asked in my last feedback (which I can't see any more because I sent it via PM on Slack). I gave you the benefit of the doubt, but I guess by now I can consider you will never do them. This is not acceptable; you can't waste our time like that. You need to at least read what we say, and if you don't want to do something it must be written down somewhere, with arguments; you can't just ignore it.
So OK, I'll spend some time again to try to give you (mostly) the same feedback. Please do it this time:
/proc/stat

of course we should

Example output (made quickly by hand):
$ puma-helper status
Version: 1.0.0 | Date: Mon, 06 May 2019 17:19:42 +0200
38423 answers_integration Phase: 28 Active threads: 0/2 [░░]
└ 27771 CPU: 98.4% Mem: 1230MiB Uptime: 9d 3h Threads: 0/1 [░]
└ 27949 CPU: 0.2% Mem: 230MiB Uptime: 11m 29d 23h Threads: 0/1 [░] (Last checkin: 13s) (Phase: 27)
38424 answers_integration API Phase: 28 Active threads: 0/2 [░░]
└ 27892 CPU: 0.1% Mem: 172MiB Uptime: 11m 29d 23h Threads: 0/1 [░]
└ 27961 CPU: 0.1% Mem: 171MiB Uptime: 11m 29d 23h Threads: 0/1 [░]
WDYT @spuyet @ylecuyer?
This output looks better ;)
I would like to add something about errors too. Currently, errors are at the top of the output:
$ puma-helper status
WARN[May 6 17:19:42] [lds] configuration is invalid. Error: Get http://unix/stats?token=768935b7ed117cfac6c8c99da4db6bd: dial unix /tmp/puma-status-1545126439210-2886: connect: no such file or directory
---------- Global informations ----------
Version: 1.0.0 | Date: Mon, 06 May 2019 17:19:42 +0200
We should have them at the end, if any; otherwise it is difficult to notice them:
$ puma-helper status
Version: 1.0.0 | Date: Mon, 06 May 2019 17:19:42 +0200
38423 answers_integration Phase: 28 Active threads: 0/2 [░░]
└ 27771 CPU: 98.4% Mem: 1230MiB Uptime: 9d 3h Threads: 0/1 [░]
└ 27949 CPU: 0.2% Mem: 230MiB Uptime: 11m 29d 23h Threads: 0/1 [░] (Last checkin: 13s) (Phase: 27)
38424 answers_integration API Phase: 28 Active threads: 0/2 [░░]
└ 27892 CPU: 0.1% Mem: 172MiB Uptime: 11m 29d 23h Threads: 0/1 [░]
└ 27961 CPU: 0.1% Mem: 171MiB Uptime: 11m 29d 23h Threads: 0/1 [░]
[lds] configuration is invalid. Error: Get http://unix/stats?token=768935b7ed117cfac6c8c99da4db6bd: dial unix /tmp/puma-status-1545126439210-2886: connect: no such file or directory
$ puma-helper status
Version: 1.0.0 | Date: Mon, 06 May 2019 17:19:42 +0200
38423 answers_integration Load: 0/2 [░░]
└ 27949 CPU: 0.2% Mem: 230M Uptime: 11m 29d 23h Load: [░] (Last checkin: 13s) (Phase 27 != Phase app 28)
└ 27771 CPU: 98.4% Mem: 1230M Uptime: 9d 3h Load: [░]
38424 answers_integration_api Queuing: 42 Load: 2/2 [░░]
└ 27892 CPU: 99.8% Mem: 172M Uptime: 11m 29d 23h Load: [░]
└ 27961 CPU: 100.0% Mem: 171M Uptime: 11m 29d 23h Load: [░]
[lds] configuration is invalid. Error: Get http://unix/stats?token=768935b7ed117cfac6c8c99da4db6bd: dial unix /tmp/puma-status-1545126439210-2886: connect: no such file or directory
Here is a little update:
About the CPU percentage, I will try to do something with /proc/PID/stat to see if we can get something reliable.
Queuing is at worker level, no? Not at application level?
It's at worker level in puma stats, yes, but it seems more interesting to me to have a queuing overview, so aggregate the values and display them at application level. WDYT?
Yep ok
In order to summarize all these changes:
Questions before adding in todo-list:
About the CPU percentage, what's the relation with goroutines and locks? You can iterate over all apps to get the first number, then wait one second, then get the second number for all apps, and then print. You don't have to over-engineer this.
I just gave that kind of solution first because it's more compliant with the philosophy of the language. BTW, after some changes it should work the way you said too, and it's easier to maintain for someone who is not a full-time Gopher, I agree.
Otherwise, did you agree about the summary? @jarthod @spuyet
A couple of typos, but otherwise yes:
Last checkin disabled if there's an issue (> 10 sec)
It's the opposite: visible if there's an issue
MiB to B
→ to M
About the queue I think it's "backlog" in the status json but I let @spuyet confirm.
About "Removed threads number per process" I think @spuyet said we should remove the number but keep the load bar graph for worker (because it's almost always 1) but I'm not sure it's a good idea as we can have multiple threads in which case it'll not be 1. @spuyet WDYT?
"Added queuing" -> How should I get this information? I find nothing related to queuing in the worker JSON, could you explain?
I've explained it in the body of this issue: that's the backlog value that you have per process in the status JSON output, yeah.
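For illustration, summing that per-process backlog into a single application-level "Queuing" figure could look like the sketch below. The JSON field names (worker_status, last_status, backlog) follow puma's clustered-mode control-server stats payload as I understand it; double-check them against the puma version in use.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// pumaStats is a minimal subset of puma's clustered-mode stats
// payload; only the fields needed for queuing are declared.
type pumaStats struct {
	WorkerStatus []struct {
		PID        int `json:"pid"`
		LastStatus struct {
			Backlog int `json:"backlog"`
		} `json:"last_status"`
	} `json:"worker_status"`
}

// queuedRequests sums the per-worker backlog values so the helper
// can show one queuing figure per application.
func queuedRequests(raw []byte) (int, error) {
	var s pumaStats
	if err := json.Unmarshal(raw, &s); err != nil {
		return 0, err
	}
	total := 0
	for _, w := range s.WorkerStatus {
		total += w.LastStatus.Backlog
	}
	return total, nil
}

func main() {
	sample := []byte(`{"workers":2,"worker_status":[
		{"pid":27892,"last_status":{"backlog":40}},
		{"pid":27961,"last_status":{"backlog":2}}]}`)
	total, _ := queuedRequests(sample)
	fmt.Println("Queuing:", total) // prints "Queuing: 42"
}
```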
About "Removed threads number per process" I think @spuyet said we should remove the number but keep the load bar graph for worker (because it's almost always 1) but I'm not sure it's a good idea as we can have multiple threads in which case it'll not be 1. @spuyet WDYT?
Yep, I don't find the number really useful as we already have the global value, the load graph seems to be enough to me.
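If we keep only the bar and drop the number, rendering it from busy/max counts is a few lines. A hypothetical sketch: the glyphs below are assumptions, not necessarily puma-helper's actual characters.

```go
package main

import (
	"fmt"
	"strings"
)

// loadBar renders the per-worker load graph discussed above:
// one filled cell per busy thread, one empty cell per idle one.
func loadBar(busy, max int) string {
	if busy > max {
		busy = max
	}
	if busy < 0 {
		busy = 0
	}
	return "[" + strings.Repeat("█", busy) + strings.Repeat("░", max-busy) + "]"
}

func main() {
	fmt.Println(loadBar(0, 2)) // [░░]
	fmt.Println(loadBar(2, 2)) // [██]
}
```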
Ok!
Please checkout https://github.com/dimelo/puma-helper/pull/21 about the UI. Available on staging env.
I think that we should rework the puma-helper output; the current output is really hard to read and we should use passenger-status as an example:

[screenshot comparison: current puma-helper output VS passenger-status output]
Here is a first draft of something a bit readable and largely inspired from the passenger output:
I know that the Go dependency you're using doesn't support current CPU percentage, and to be honest portability is a welcome feature, but current CPU usage is a mandatory one IMHO; a piece of software cannot be restricted only because of portability, especially for a tiny project like this one.
This is a first draft, comments and improvements are welcome.
Notes:
CPU percentage => use /proc/PID/stat
Uptime => use /proc/PID/stat too
Queued => corresponding to the backlog value per process in puma stats