Closed simonepri closed 6 years ago
A PR would be great as long as it keeps a backward compatibility I'm all in! Lmk if you don't have the time to work on it ;)
I'll try to do it today. I'm working on a profiler tool and without it I can't reach high sampling rates.
@soyuka The API may look like this:
// Current Behaviour
pusage.stat(pid, console.log);
// => {cpu, memory, time, start};
// Behaviour if an array is provided
pusage.stat([pid1, pid2, ..., pidN), console.log);
// => {
// pid1: {cpu, memory, time, start},
// pid2: {cpu, memory, time, start},
// [...]
// pidN: {cpu, memory, time, start},
// }
What do you think?
ps
based reads:https://github.com/soyuka/pidusage/blob/master/lib/stats.js#L111
if (Array.isArray(pid)) {
pid = pid.map(p => parseInt(p, 10)).join(',');
} else {
pid = parseInt(pid, 10)
}
And then parse the stdout
proc
based reads:https://github.com/soyuka/pidusage/blob/master/lib/stats.js#L49
We need to read /proc/pid/stat
file for each pid.
Is there a better way to do this?
wmic
based reads:https://github.com/soyuka/pidusage/blob/master/lib/stats.js#L157
I never used wmic
but if that is correct, I think that we can do something like:
if (Array.isArray(pid)) {
pid = pid.map(p => `ProcessID=${parseInt(p, 10)}`).join(' or ');
} else {
pid = parseInt(pid, 10)
}
var arg = `PROCESS where "${pid}" get workingsetsize,usermodetime,kernelmodetime`
And then parse the stdout
your profiler looks sick!
API:
pusage.stat(integer | integer[], callback)
With one integer we keep the same response indeed. I'm not sure about the multiple pids answer, won't an array of statistics be better then an object for processing?
yup alright
We need to read /proc/pid/stat file for each pid. Is there a better way to do this?
Exactly my though when I looked into this. Reading proc files is one of the most precise thing we can do and I'd like to keep it. So nope, there's no better way of doing it. We need to read every proc file. We may need to introduce an asynchronous utility module to handle the calls?
Haha it should work, I've only made assumptions with it though can't find a proper documentation for this... I'll try the patch manually on a windows device but your change looks fine!
Some notes though:
Thank you for the notes.
Why if you need to support node 0.12
it's not present inside the travis.yml
file?
Also inside the appveyor.yml
tests aren't run for node > 5
I'll issue a PR for that too.
Anyway what about shipping a version 2.0.0 in the future that uses promises instead of callbacks?
I'm not sure about the multiple pids answer, won't an array of statistics be better then an object for processing?
I feel the object based version more user friendly. What are the advantages are you thinking about?
I meant node 4 my bad. Not sure arrow functions were supported but anyway we don't need them. Same for template strings for now.
I should mention this somewhere in this project's contributing file that I wish we stick with the rules of Tiny Guide To Non Fancy Node here (link is to callback/promise stuff). If you want promises I could release a pidusage-promisify
bridge.
Also inside the appveyor.yml tests aren't run for node > 5
They should indeed!
For the workflow we could add a dependency on https://github.com/feross/run-parallel or maybe run-auto to readFile proc files.
If you want promises I could release a pidusage-promisify bridge.
Actually I use pify
I don't think a new package to just promisify it is necessary.
pify(pidusage.stat)(pid)
I feel the object based version more user friendly. What are the advantages you are thinking about?
Usually when I get multiple results I want an array with a length etc. Obviously a Map would be even nicer. An object may be user friendly, especially that we know that the PID is unique but I feel that most of the use cases would have to transform it to an array (for example chart libraries always require an array and you would Object.keys(stats).map(e => stats[e])
).
pify(pidusage.stat)(pid)
Yes anyway having a module that does the new Promise()
would be faster at runtime because it'd skip the boilerplate of pify. It's micro optimization but in this library he plays a lot with arguments
and it's a performance bottleneck.
With the array we must keep the results in the same order of the input array. But if one of the provided pids wasn't been read we need to keep undefined values in the array.
My usage would be:
async function read() {
let procl = [{ppid: 'somepid', pid: 'somepid'}];
try {
procl = procl.concat(await descendant('somepid'));
} catch (err) {}
try {
const results = await pify(pidusage.stat)(procl.map(proc => proc.pid));
procl.map(proc => Object.assign(proc, results[proc.pid])); // If As Object
procl.map((proc, i) => Object.assign(proc, results[i])); // If As Array
// => [
// {"ppid": "XXXX", "pid": "YYYY", "cpu": Z, "memory": M},
// [...]
// {"ppid": "XXXX", "pid": "YYYY", "cpu": Z, "memory": M},
// ]
} catch(err) {}
}
mhh okay I get why an object would make more sense in your use case. An object it is then :).
Actually for my usage case is the same. With the array it would be even faster I think.
Anyway we must define what happen if one of the readings can't be taken. (For instance the pid provided is not in the system) I don't think that a fail-fast approach is suitable. I would rather prefer to have partial results. We need to add a flag to support both behaviours?
What's the current behaviour if the provided pid cannot be read?
Actually for my usage case is the same. With the array it would be even faster I think.
I'd prefer an array, obviously a pid
property would be added to the current statistics object.
What's the current behaviour if the provided pid cannot be read?
The first argument of the callback is an error. Statistics are null
.
I don't think that a fail-fast approach is suitable. I would rather prefer to have partial results.
Indeed, IMO we don't want the same behavior when issuing multiple requests and the code will need some refactoring to support that properly. I'm :+1: for partial results. I don't think we need to support both behaviors. For error management I'm open to suggestions, my first thought would be to do something like AggregateError.
So your suggestion is to pass both error
and statistics
to the callback?
Isn't that weird?
I'm really not sure how to handle this. It's weird but on the other hand if you give partial results, you have partial errors no?
Btw the callback is always like this:
stat(pid, function (err, stat) {
// err is null when no error occured
})
Although, the promise will reject
with what I had in mind so that's probably the wrong approach :|.
The right way would be to implement an EventEmitter and to send error events. Pidusage isn't built with an EE in mind but maybe we could. WDYT?
btw we can talk on irc or slack if it makes things easier.
An event emitter a seems a reasonable solution but is not backward compatible.
Creating an event emitter for every stat
call seems a bit too much anyway :|. I think we can make it backward compatible though but it'll be less performant.
A third solution I had in mind would be to have an error
property on the stat but that's kinda dirty.
Or a fourth solution would be to use an onerror
option (function signature) that's called on error. Also kinda dirty.
For now, let's do this when an array of pids is given:
{strict: true}
is given and an error occurs use a fail-first
approachI think that this is the easiest implementation.
Proposal
With the current implementation it's not possible to retrieve info for more than one pid at once. I've looked the code and it does not seems to complicate to support a list of pids. Would you consider a PR to add this?
Use Case
I need to monitor the whole tree and the readings would be a lot faster if I can retrieve info for a list of pids with a single
ps
command.Currently my solution is to do multiple calls to pidusage:
But with a lot of heavy loaded processes it can be really slow due to context switching.