soyuka / pidusage

Cross-platform process cpu % and memory usage of a PID
MIT License

Add a "useProcFiles" option and speed up performance by keeping fd open #65

Closed · soyuka closed this 6 years ago

soyuka commented 6 years ago

@simonepri I think I did the bench correctly; it looks like using procfiles is much faster:


  ✔ bench › should execute the benchmark (12.9s)
    ℹ 1 pid 100 times done in 1325.301 ms (75.455 op/s)
    ℹ (procfile) 1 pid 100 times done in 28.341 ms (3528.501 op/s)
    ℹ 2 pid 100 times done in 1299.832 ms (76.933 op/s)
    ℹ (procfile) 1 pid 100 times done in 52.198 ms (1915.772 op/s)
    ℹ 5 pid 100 times done in 1316.378 ms (75.966 op/s)
    ℹ (procfile) 5 pid 100 times done in 45.734 ms (2186.534 op/s)
    ℹ 10 pid 100 times done in 1354.110 ms (73.849 op/s)
    ℹ (procfile) 10 pid 100 times done in 50.611 ms (1975.843 op/s)
    ℹ 25 pid 100 times done in 1358.462 ms (73.613 op/s)
    ℹ (procfile) 25 pid 100 times done in 99.147 ms (1008.599 op/s)
    ℹ 50 pid 100 times done in 1500.427 ms (66.648 op/s)
    ℹ (procfile) 50 pid 100 times done in 144.820 ms (690.511 op/s)
    ℹ 100 pid 100 times done in 1513.935 ms (66.053 op/s)
    ℹ (procfile) 100 pid 100 times done in 245.986 ms (406.526 op/s)

Could you check this?
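
For reference, the bench loop is shaped roughly like this (a sketch, not the repo's actual bench file; it only assumes the pidusage(pid, callback) call form):

const pidusage = require('pidusage')

// Time `times` sequential readings of one pid and report op/s.
// (The real bench also varies the number of pids per run.)
function bench (pid, times, done) {
  const start = Date.now()
  let i = 0
  function next (err) {
    if (err) return done(err)
    if (i++ === times) {
      const ms = Date.now() - start
      console.log(`${times} readings in ${ms} ms (${(times / (ms / 1000)).toFixed(3)} op/s)`)
      return done()
    }
    pidusage(pid, next)
  }
  next()
}

bench(process.pid, 100, function (err) { if (err) throw err })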

@vmarchaud this may fix #64, let me know if you can try the branch.

simonepri commented 6 years ago

If we can remove the usage of ps and it's also faster, that's cool. But I think we need to take our time to do it correctly.

Actually, I was thinking about implementing the readings in C++ using procfiles and then just exposing them to Node. (A lot less overhead.) But let's first define a coherent way to obtain the readings on all the platforms before diving into the implementation.

vmarchaud commented 6 years ago

@simonepri We can't use a module with native compilation with pm2, since it's heavily used on a lot of platforms.

simonepri commented 6 years ago

@vmarchaud Yes, the first thing to do is to implement everything in JS.

Then, if the option is feasible, we could think of a way to make a native cross-compatible implementation and then ship pre-built binaries. (But it's just an idea; it would require a lot of work.) Anyway, it would just be a C file with some stdio operations (open/close file), nothing involving strange libraries or anything like that.

soyuka commented 6 years ago

Anyway, it would just be a C file with some stdio operations (open/close file), nothing involving strange libraries or anything like that.

:+1: I thought of this as well; it could be an optional dependency of this project so that it would not interfere.

soyuka commented 6 years ago

If we can remove the usage of ps and it's also faster, that's cool. But let's first define a coherent way to obtain the readings on all the platforms before diving into the implementation.

Just like it was before:

I'm :+1: to keep an option that gives the ability to switch to using ps (if procfiles is the default) or to use procfiles (if ps is the default).
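
Roughly what I have in mind for the option (a hypothetical sketch; "useProcFiles" comes from this PR's title, and the options-argument shape is an assumption, not the released API):

const pidusage = require('pidusage')

// Hypothetical: opt in to the /proc-based reader instead of ps.
pidusage(process.pid, { useProcFiles: true }, function (err, stats) {
  if (err) throw err
  console.log(stats.cpu, stats.memory) // same output shape as today
})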

Btw I tested both values on Debian:

ps: 7
procFiles: 7.407407407295623
ps: 4
procFiles: 3.80952380951325
ps: 2.6
procFiles: 2.893890675227075
ps: 2.2
procFiles: 2.4330900243219413
ps: 2
procFiles: 1.9531250000017764
ps: 1.6
procFiles: 1.631321370308712
ps: 3.2
procFiles: 3.1944444444392794
ps: 3
procFiles: 2.9161603888176724
ps: 2.7
procFiles: 2.7056277056304325

vmarchaud commented 6 years ago

I'm good to go forward with this option, so do we need to modify pm2 to use procfiles by default?

simonepri commented 6 years ago

I'm 👍 to keep an option that gives the ability to switch to using ps (if procfiles is the default) or to use procfiles (if ps is the default).

Mhh, IMO making it configurable just adds complexity for the final user. If we are facing difficulties trying to understand which solution is the "correct way" (although there isn't really a correct way), I cannot imagine an end user of this package figuring out which of the two is better. 🙈

So I believe that we should choose a way and stick with it.

The factors that we can take into account for the decision could be:

Speed (How many readings can we take in a second?)
Meaning (What is the meaning of the readings taken?)
Portability (Does this method work well on a lot of platforms?)
Coherence (Do we preserve the meaning of the readings across platforms and/or various solutions on the same platform?)
Ease of use (How much knowledge is required to use the API?)

soyuka commented 6 years ago

Speed (How many readings can we take in a second?)

Looks like procfiles are much, much faster, since I'm keeping fds open and not using readFile.
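
To illustrate what keeping fds open means here, a minimal sketch assuming Linux's /proc and Node's fs module (not the PR's actual code):

const fs = require('fs')

// Open /proc/<pid>/stat once and keep the file descriptor around.
function openStat (pid) {
  return fs.openSync('/proc/' + pid + '/stat', 'r')
}

// Each reading seeks back to position 0 on the same fd, avoiding the
// open/close cost that fs.readFile pays on every call.
function readStat (fd) {
  const buf = Buffer.alloc(1024)
  const bytes = fs.readSync(fd, buf, 0, buf.length, 0)
  return buf.toString('utf8', 0, bytes)
}

const fd = openStat(process.pid)
console.log(readStat(fd)) // first reading
console.log(readStat(fd)) // later readings reuse the same fd
fs.closeSync(fd)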

Meaning (What is the meaning of the readings taken?)

As shown, when it comes to monitoring, procfile values are much more promising if you want to know "how much CPU was used in the last X seconds". At least this is what @vmarchaud observed.

Portability (Does this method work well on a lot of platforms?)

Because OS X has no procfiles, we'll keep using ps until we can find a way to add some syscalls.

Coherence (Do we preserve the meaning of the readings across platforms and/or various solutions on the same platform?)

According to my recent tests, readings are inconsistent between ps and procfile if we use our own history. Indeed, values will be closer to what top does (cf. the Stack Overflow post on the subject). See what you said here: https://github.com/soyuka/pidusage/issues/64#issuecomment-396933160

Ease of use (How much knowledge is required to use the API?)

We need to document that ps's pcpu is the percent CPU usage since the process started. We can also document that procfile readings are "usage since last call".

simonepri commented 6 years ago

We need to document that ps's pcpu is the percent CPU usage since the process started. We can also document that procfile readings are "usage since last call".

We can obtain the same meaning using ps on macOS using etime and stime. We have to try to guarantee the same meaning across platforms if possible.
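
To make the two meanings concrete, here is a sketch of both computations (field names are illustrative, not the module's actual output):

// "Since process start": cumulative CPU time over elapsed time.
// All times in ms; the ratio is multiplied by 100 to get a percentage.
function pcpuSinceStart (utime, stime, etime) {
  return ((utime + stime) / etime) * 100
}

// "Since last call": delta of CPU time over delta of wall-clock time,
// which is what top-like monitors report.
function pcpuSinceLastCall (prev, curr) {
  const cpuDelta = (curr.utime + curr.stime) - (prev.utime + prev.stime)
  const wallDelta = curr.timestamp - prev.timestamp
  return (cpuDelta / wallDelta) * 100
}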

soyuka commented 6 years ago

:+1: just saw your comment on the issue!

vmarchaud commented 6 years ago

As shown, when it comes to monitoring, procfile values are much more promising if you want to know "how much CPU was used in the last X seconds". At least this is what @vmarchaud observed.

Actually that's what we want for pm2; it of course depends on the use case. But since we are aggregating values over time with Keymetrics, it's useless for us to get a "realtime" metric that is already aggregated.

Mhh, IMO making it configurable just adds complexity for the final user.

I don't believe having well-documented options and letting the user decide is complex. Every use case deserves its own options IMO.

simonepri commented 6 years ago

it's useless for us to get a "realtime" metric that is already aggregated.

But the CPU percentage is aggregated info by definition. If you want a real non-aggregated reading for Keymetrics, you should use the etime and stime values and then compute the percentage on your side.

codecov-io commented 6 years ago

Codecov Report

Merging #65 into master will decrease coverage by 1.86%. The diff coverage is 86.66%.


@@            Coverage Diff             @@
##           master      #65      +/-   ##
==========================================
- Coverage   92.57%   90.71%   -1.87%     
==========================================
  Files           9        9              
  Lines         283      334      +51     
==========================================
+ Hits          262      303      +41     
- Misses         21       31      +10
Impacted Files           Coverage Δ
lib/ps.js                95.91% <100%> (+0.56%) :arrow_up:
lib/helpers/parallel.js  100% <100%> (ø) :arrow_up:
lib/procfile.js          87.3% <82.6%> (-12.7%) :arrow_down:
lib/stats.js             90.9% <87.5%> (-1.95%) :arrow_down:
lib/history.js           94.87% <90%> (-2.01%) :arrow_down:

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Powered by Codecov. Last update f72ec95...d558d86.

vmarchaud commented 6 years ago

But the CPU percentage is an aggregated info by definition.

Right, the CPU percentage can't be sent without being aggregated. But there is a difference between an average over 5 seconds and over the whole lifetime of the process.

If you want a real non-aggregated reading for Keymetrics, you should use the etime and stime values and then compute the percentage on your side.

Actually pidusage mainly exists because we needed it for PM2. I don't want to force anyone to implement it a specific way, but if we need to implement it on our side, we will just fork pidusage to work the way we want.

simonepri commented 6 years ago

@soyuka what do you think about reporting both etime and stime inside the output object instead of an aggregated ctime value?

simonepri commented 6 years ago

Actually pidusage mainly exists because we needed it for PM2. I don't want to force anyone to implement it a specific way, but if we need to implement it on our side, we will just fork pidusage to work the way we want.

Yeah, but if possible it's better to keep a single, well-working module.

Right, the CPU percentage can't be sent without being aggregated. But there is a difference between an average over 5 seconds and over the whole lifetime of the process.

Yeah, actually the current behaviour is that it depends on how frequently you perform the reading. If you perform the reading once per second, then you'll obtain the usage over the last second. The maxage of a past reading is 60 seconds. So if you perform a measure once every 5 minutes, the reading is actually taken twice and you get the CPU usage over the last couple of ms. (The maxage is configurable.)
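
A rough sketch of that behaviour (the history shape and the sampleCpuTime helper are hypothetical; only the maxage logic follows what is described above):

const MAXAGE = 60 * 1000 // ms; configurable in the real module
const history = new Map() // pid -> { cpu, timestamp }

// Hypothetical sampler: this sketch reads the current process's own CPU
// time via process.cpuUsage() and ignores pid; the real module reads
// /proc or calls ps.
function sampleCpuTime (pid) {
  const u = process.cpuUsage() // microseconds since process start
  return { cpu: (u.user + u.system) / 1000, timestamp: Date.now() }
}

function cpuPercent (pid) {
  let prev = history.get(pid)
  if (!prev || Date.now() - prev.timestamp > MAXAGE) {
    // No usable baseline: take one right away, so the percentage ends up
    // covering only the few ms between the two reads.
    prev = sampleCpuTime(pid)
  }
  const curr = sampleCpuTime(pid)
  history.set(pid, curr)
  const dt = Math.max(1, curr.timestamp - prev.timestamp) // avoid /0 on back-to-back reads
  return ((curr.cpu - prev.cpu) / dt) * 100
}

console.log(cpuPercent(process.pid))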

soyuka commented 6 years ago

@soyuka what do you think about reporting both etime and stime inside the output object instead of an aggregated ctime value?

It's hard right now because we tried to make ps work the same on Linux and OS X (remember the different time formats and the absence of some options). If we say that ps now works only on OS X, we may improve the ps part a lot!

I'll think about that though; I need to read back all our discussions from the last big changes you did!

simonepri commented 6 years ago

That's how I imagine we could give the output in the future:

{
  proc: {
    ppid: 312,         // PPID
    pid: 727,          // PID
    uptime: 6650000,   // ms since the start of the process
    thcount: 2         // number of threads used by the process
  },
  cpu: {
    pcpu: 15.0,        // percentage since last call (from 0 to 100*vcore)
    utime: 867000,     // user time in ms
    stime: 867000,     // kernel/system time in ms
  },
  mem: {
    pmem: 34.3,        // percentage (from 0 to 100)
    rss: 357306368,    // resident set size in bytes
    vsz: 957306368,    // virtual memory size in bytes
  },
  timestamp: 864000000  // ms since epoch in which the reading has been taken
}

(Probably it's better to remove the nested objects.)

@soyuka @vmarchaud any suggestion about that?

soyuka commented 6 years ago

That's how I imagine we could give the output in the future

As long as it's backward compatible I see no harm; we're already computing these values, so why not!

Should we keep the pcpu since start time?

simonepri commented 6 years ago

Should we keep the pcpu since start time?

I think the user expects the other way around. If they need the percentage since the start, they can compute it by just doing:

(stime + utime) / etime
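
For instance, with the example values from the proposed output above (utime = 867000 ms, stime = 867000 ms, etime ≈ uptime = 6650000 ms): (867000 + 867000) / 6650000 ≈ 0.26, i.e. about 26% once multiplied by 100.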

soyuka commented 6 years ago

It feels wrong to add the "read again" behaviour to ps; it's already slow enough that I'm really not sure I want to slow it down more :smile:

simonepri commented 6 years ago

It's just the first time. The reading should be correct, no matter if it takes a bit longer.

soyuka commented 6 years ago

It's a bit more complicated than that, because we allow multiple pids, so we need to recompute all of them :|.