nextflow-io / nf-co2footprint

[WIP] A Nextflow plugin to estimate the CO2 footprint of pipeline runs.
https://nextflow-io.github.io/nf-co2footprint/
Apache License 2.0
11 stars, 4 forks

Update Report - Add energy consumption plots and add values to the table #12

Closed mirpedrol closed 1 year ago

mirpedrol commented 1 year ago

Follow-up from #9: add energy consumption plot.

New plots (conditional tickformat): image

Example table with converted units: image

Example table with raw values: image

mirpedrol commented 1 year ago

@skrakau I have added a function to convert the values to more readable units (adding kilo, mega, milli, ...) for the trace files (see the example in https://github.com/qbic-pipelines/nf-co2footprint/pull/9). Do we want to do the same for the plots in the report? I think it could be confusing if we output a different unit every time and people want to compare different runs. On the other hand, if values are too close to 0, we won't be able to see the differences.
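For illustration, the kind of conversion described here could look like the following minimal Python sketch. It is not the plugin's actual implementation: the function name `to_readable_units`, the prefix table, the two-decimal rounding, and the example units are all assumptions.

```python
# Hypothetical helper, not taken from the plugin: scale a value to the
# largest metric prefix that keeps the number at or above 1.
PREFIXES = [(1e9, "G"), (1e6, "M"), (1e3, "k"), (1.0, ""), (1e-3, "m")]


def to_readable_units(value: float, unit: str) -> str:
    """Return the value as a string with a metric prefix, e.g. '1.50 kWh'."""
    if value == 0:
        return f"0 {unit}"
    for factor, prefix in PREFIXES:
        if abs(value) >= factor:
            return f"{value / factor:.2f} {prefix}{unit}"
    # Values below 1e-3 stay in milli (the smallest prefix assumed here).
    factor, prefix = PREFIXES[-1]
    return f"{value / factor:.2f} {prefix}{unit}"
```

Under these assumptions, `to_readable_units(1500, "Wh")` gives "1.50 kWh" and `to_readable_units(0.0032, "g")` gives "3.20 mg", which also shows the concern raised above: two runs can end up reported in different units.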

skrakau commented 1 year ago

Hi @mirpedrol, for the memory plots in the standard execution report, for example, the units are adjusted for each run as well, so this shouldn't be a problem as far as I can see.

mirpedrol commented 1 year ago

I am unsure about rounding the values. They are now rounded for the "Raw" plots. Should we also do it for "Readable Units" in the table?

skrakau commented 1 year ago

One plot with adjusted y axis labels should be sufficient.

skrakau commented 1 year ago

Some notes already regarding the tasks table:

skrakau commented 1 year ago

A few notes

would the following help?

mirpedrol commented 1 year ago

Next steps after in-person discussions:

  1. Convert all values to milli (raw) and round. For the table, convert to readable units.
  2. Display lower values with scientific notation on the y axis. (Check after using all values in milli.)
  3. Also modify the tickformat. (Maybe.)
  4. Check if we can add the unit to the plot ticks. What happens if we use milligram instead of gram?
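
Step 1 of the list above can be sketched in a few lines of Python. This is only an illustration: the function name `to_milli` and the choice of two decimal places are assumptions, not what the plugin does.

```python
# Hypothetical helper: express a value given in the base unit (e.g. g or Wh)
# in milli-units and round it for plotting.
def to_milli(value: float, digits: int = 2) -> float:
    """Convert from the base unit to milli-units, rounded to `digits` decimals."""
    return round(value * 1000, digits)


# Example: raw values in grams become rounded milligram values for the plot.
raw_values_g = [0.0123456, 1.5]
plot_values_mg = [to_milli(v) for v in raw_values_g]
```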

mirpedrol commented 1 year ago

If the maximum value is 4 or lower, we use a tickformat of .2f (displays the values as decimals: 0.5, etc.). If it is higher than 4 (or some other reasonable higher number), we use .2s (adds the unit prefix to the tick value: 1k, etc.). This way we can be sure that for lower values we will never have a tick labelled 500m, as the maximum number of displayed ticks is 5.
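
The rule described in this comment can be sketched as follows. The function names and the plain-dict layout are illustrative assumptions; only the threshold of 4 and the two format strings (.2f and .2s, d3-format style specifiers as accepted by Plotly's tickformat) come from the discussion.

```python
# Hypothetical sketch of the conditional tickformat rule from the comment above.
def choose_tickformat(max_value: float, threshold: float = 4.0) -> str:
    """Fixed-point ticks for small maxima, SI-prefixed ticks for larger ones."""
    return ".2f" if max_value <= threshold else ".2s"


def yaxis_layout(values: list) -> dict:
    """Build a Plotly-style layout fragment (a plain dict) for the y axis."""
    return {"yaxis": {"tickformat": choose_tickformat(max(values))}}
```

With this sketch, a run whose largest value is 3.2 gets `.2f`, while a run with values around 3 million gets `.2s`.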

skrakau commented 1 year ago

The threshold is 4 because the default nticks value is 5, which gives at most 5 ticks. Thus, with maximum values >= 4, the lowest tick above zero is at >= 1.0 (g).
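
The reasoning behind the threshold can be checked with a small sketch. It assumes a simplified tick model, not Plotly's actual algorithm: ticks start at 0 and use the smallest "nice" step (1, 2 or 5 times a power of ten) that needs at most nticks ticks to reach the maximum. The function name `nice_step` is hypothetical.

```python
import math


def nice_step(max_value: float, nticks: int = 5) -> float:
    """Smallest step in {1, 2, 5} x 10^k covering 0..max_value with <= nticks ticks.

    Simplified model for illustration; assumes max_value > 0 and nticks <= 100.
    """
    if max_value <= 0:
        raise ValueError("max_value must be positive")
    # Start two decades below the magnitude of max_value, so the first
    # candidate steps are always too fine and the search moves upwards.
    exponent = math.floor(math.log10(max_value)) - 2
    while True:
        for mantissa in (1, 2, 5):
            step = mantissa * 10.0 ** exponent
            # Ticks at 0, step, 2*step, ... up to max_value.
            if math.floor(max_value / step) + 1 <= nticks:
                return step
        exponent += 1
```

Under this model a maximum of 2 gives a step of 0.5 (fractional ticks, where .2s would print 500m), while any maximum of 4 or more gives a step of at least 1, which matches the argument above.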

skrakau commented 1 year ago

If you think it's better this way, we can also keep it like this.

mirpedrol commented 1 year ago

Using .1f sounds good, but I don't like .4s because I think it can be confused with thousands.

image

I can't find a pipeline to test with big numbers, but the example plots look like this (values around 3 million):

image

skrakau commented 1 year ago

I see your point. I guess the reason for this in the execution plots is that it allows more accurate values, in particular for the displayed boxplot values.