con / duct

A helper to run a command, capture stdout/stderr and details about running
MIT License

Question regarding the intended behavior of `update_max_resources()` for "total" subdicts #98

Open jwodder opened 2 weeks ago

jwodder commented 2 weeks ago

How exactly is Report.update_max_resources() intended to behave regarding the "total" elements of its input dicts? Specifically, is the output "total" supposed to contain the maximum of the total.pcpu & total.pmem inputs, as currently happens, or is it supposed to contain the sum of the pcpu & pmem values of the "maxed" stats? I could see it going either way, and I can't tell whether the current behavior is in error.

For a concrete example, consider a maxes input with the following fields:

| PID | pcpu |
| --- | --- |
| 1 | 23% |
| 2 | 5% |
| total | 28% |

and a samples input with the following fields:

| PID | pcpu |
| --- | --- |
| 1 | 7% |
| 2 | 42% |
| total | 49% |

Currently, update_max_resources will produce the following output:

| PID | pcpu |
| --- | --- |
| 1 | 23% |
| 2 | 42% |
| total | 49% |

Is 49% the desired total value, or do we want it to be 65% instead?
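To make the two candidate behaviors concrete, here is a minimal sketch. It is not duct's actual implementation: the flat `{pid: pcpu}` dict shape and both function names are hypothetical, chosen only to reproduce the numbers in the example above.

```python
from typing import Dict

# Hypothetical shape: PID (as a string) or "total" mapped to a pcpu percentage.
Stats = Dict[str, float]


def merge_keep_total_spike(maxes: Stats, sample: Stats) -> Stats:
    """Per-key maximum, with "total" treated like any other key.

    The resulting total is the largest total seen in any single sample.
    """
    keys = maxes.keys() | sample.keys()
    return {k: max(maxes.get(k, 0.0), sample.get(k, 0.0)) for k in keys}


def merge_sum_of_maxes(maxes: Stats, sample: Stats) -> Stats:
    """Per-PID maximum, with "total" recomputed as the sum of those maxima.

    The resulting total may never have been observed at any one moment.
    """
    merged = merge_keep_total_spike(maxes, sample)
    merged["total"] = sum(v for k, v in merged.items() if k != "total")
    return merged


maxes = {"1": 23.0, "2": 5.0, "total": 28.0}
sample = {"1": 7.0, "2": 42.0, "total": 49.0}

print(merge_keep_total_spike(maxes, sample)["total"])  # 49.0
print(merge_sum_of_maxes(maxes, sample)["total"])      # 65.0
```

The first function yields a total of 49%, the second yields 65%; the per-PID rows are identical in both.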

CC @asmacdo

asmacdo commented 2 weeks ago

It's supposed to be the highest "spike" (which should be the current behavior). I figure it's more helpful to know the highest total spike than to add up the spikes of each PID, which were collected at different times.

I think this will make more sense when we are reporting absolute memory values rather than percentage. https://github.com/con/duct/issues/63

The use case here is that scientist 1 runs an execution with duct and collects the data; scientist 2, who wants to re-execute, then has useful info about what resources they will need. In this case, I suppose that 49% is more helpful than 65%, but I'm open to collecting whatever data will be most helpful.