Update Report - Add energy consumption plots and add values to the table

mirpedrol commented 1 year ago

Follow-up from #9: add energy consumption plot.

[Screenshots: new plots (conditional tickformat); example table with converted units; example table with raw values]

mirpedrol commented 1 year ago

@skrakau I have added a function that converts the values in the trace files to more readable units by adding prefixes such as kilo, mega, milli... (see the example in https://github.com/qbic-pipelines/nf-co2footprint/pull/9). Do we want to do the same for the plots in the report? I think it could be confusing to output a different unit for every run if people want to compare runs, but on the other hand, if the values are too close to 0 we won't be able to see the differences.
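
To make the idea concrete, here is a minimal JavaScript sketch of such a conversion. It is not the plugin's actual function: the name `toReadableUnits`, the prefix table, and the two-decimal default are assumptions made for illustration only.

```javascript
// Hypothetical helper (not the plugin's code): pick the largest SI prefix
// whose factor does not exceed the absolute value, then scale the value.
const PREFIXES = [
  { factor: 1e9, symbol: 'G' },
  { factor: 1e6, symbol: 'M' },
  { factor: 1e3, symbol: 'k' },
  { factor: 1, symbol: '' },
  { factor: 1e-3, symbol: 'm' },
  { factor: 1e-6, symbol: 'µ' },
];

function toReadableUnits(value, unit, digits = 2) {
  if (value === 0) return `0 ${unit}`;
  const abs = Math.abs(value);
  // Values below the smallest prefix fall back to that smallest prefix.
  const prefix =
    PREFIXES.find((p) => abs >= p.factor) || PREFIXES[PREFIXES.length - 1];
  const scaled = value / prefix.factor;
  // A space separates the number from the prefixed unit.
  return `${scaled.toFixed(digits)} ${prefix.symbol}${unit}`;
}

console.log(toReadableUnits(1500, 'g'));    // 1.50 kg
console.log(toReadableUnits(0.0042, 'Wh')); // 4.20 mWh
```

Because the prefix is chosen per value, two runs can end up with different units, which is exactly the trade-off raised above.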

skrakau commented 1 year ago

Hi @mirpedrol, the memory plots in the standard execution report, for example, also adjust their units for each run, so this shouldn't be a problem as far as I can see.

mirpedrol commented 1 year ago

I am unsure about rounding the values. They are now rounded for the "Raw" plots. Should we also round them for "Readable Units" in the table?

skrakau commented 1 year ago

One plot with adjusted y axis labels should be sufficient.

skrakau commented 1 year ago

Some notes already regarding the tasks table:

there should be a white space in front of the unit (for Human readable)
it might be helpful to display the unit for the raw values, e.g. (gram), as in https://www.nextflow.io/docs/latest/tracing.html#tasks

skrakau commented 1 year ago

A few notes

regarding the currently generated CO2 reports: the plots themselves seem fine, but the y axis tick labels (and the displayed boxplot labels/quantiles) do not seem OK. Or am I missing something here?
why are the y axis labels like they are, i.e. [0.0, 0.0, ..., 0.0]?

would the following help?

the rounding of the quantile values in 'CO2F*ReportSummary.groovy' mainly seems to cause a problem in combination with discarding zeros, and because the stored unit might not be optimal (g, whereas the amounts for such super fast tasks are more in the range of milligrams or milliwatt-hours). If we could internally store the values as milligrams and milliwatt-hours, just as Nextflow uses milliseconds for time, then the rounding wouldn't cause a problem anymore (if it would still make sense to round; would tasks with < 0.005 mg CO2 emission occur?)
a rounding in CO2F*ReportTemplate.js would not be needed, right?
(if one used milliwatt-hours, maybe it would make sense to write the full name of the units in the task table header for the raw values, to avoid confusion with MWh?)
the issue with the y axis labels for small values remains: https://plotly.com/javascript/tick-formatting/#using-exponentformat maybe this could help?
consider: tickformat: '.1f' -> tickformat: '.4s'
a remaining question is whether one would want to adjust the y axis tick labels as is done for the Nextflow Memory plot (no unit given in the y axis label/title, but the tick label says 12T ...), or do it as for the Execution time (minutes), which is always given in minutes, if I see it correctly.
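
The rounding concern can be illustrated with a small sketch. The emission values below are made up, and `round2` is a hypothetical stand-in for whatever two-decimal rounding the report code applies; the point is only that the same rounding that collapses gram values to zero keeps them distinguishable in milligrams.

```javascript
// Hypothetical per-task CO2 emissions in grams for very short tasks.
const emissionsGram = [0.0004, 0.0012, 0.0031, 0.0047];

// Assumed stand-in for the report's rounding: two decimal places.
const round2 = (x) => Math.round(x * 100) / 100;

// Rounding while the values are stored in grams: everything becomes 0.
const roundedGram = emissionsGram.map(round2);

// Converting to milligrams first keeps the differences visible.
const roundedMilligram = emissionsGram.map((g) => round2(g * 1000));

console.log(roundedGram);      // [ 0, 0, 0, 0 ]
console.log(roundedMilligram); // [ 0.4, 1.2, 3.1, 4.7 ]
```

With two-decimal rounding in milligrams, only values below 0.005 mg would still round to zero.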

mirpedrol commented 1 year ago

Steps after in-person discussions:

Convert all values to milli (raw) and round. For the table, convert to readable units.
Display lower values with scientific notation on the y axis. (check after using all values in milli)
Also modify the tickformat. (maybe)
Check if we can add the unit to the plot ticks. What happens if we put milligram instead of gram?

mirpedrol commented 1 year ago

If the maximum value is 4 or lower, we use a tickformat of .2f (displays the values as decimals: 0.5, etc.). If it's higher than 4 (or some other reasonable higher number), we use .2s (adds the SI prefix to the tick value: 1k, etc.). This way we can be sure that for lower values we will never have a tick labelled 500m, as the maximum number of displayed ticks is 5.

skrakau commented 1 year ago

4 because the default nticks value is 5, leading to at most 5 ticks. Thus, with values >= 4, the minimum tick is at >= 1.0 (g).
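
The rule discussed here could be sketched as follows. The function name `chooseTickformat` and the sample values are hypothetical; only the threshold of 4 and the `.2f` / `.2s` format strings come from the discussion above, and `yaxis.tickformat` is the Plotly layout attribute that would receive the result.

```javascript
// Hypothetical helper: fixed decimals for small ranges, SI-prefixed ticks
// for larger ones. The threshold of 4 follows the reasoning above.
function chooseTickformat(values, threshold = 4) {
  const max = Math.max(...values);
  return max <= threshold ? '.2f' : '.2s';
}

// Made-up sample data for two runs with very different magnitudes.
const smallRun = [0.4, 1.2, 3.1];
const largeRun = [1200, 45000, 3000000];

// Layout objects as they might be handed to Plotly for each run.
const smallLayout = { yaxis: { tickformat: chooseTickformat(smallRun) } };
const largeLayout = { yaxis: { tickformat: chooseTickformat(largeRun) } };

console.log(smallLayout.yaxis.tickformat); // .2f
console.log(largeLayout.yaxis.tickformat); // .2s
```

Keeping the decision in one small function makes it easy to change the threshold later if 4 turns out to be too low.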

skrakau commented 1 year ago

If you think it's better this way, we can also keep it like this

mirpedrol commented 1 year ago

Using .1f sounds good, but I don't like .4s because I think it's confusing with thousands. I can't find a pipeline to test with big numbers, but the example plots look like this (values around 3 million)

skrakau commented 1 year ago

I see your point. I guess the reason for the execution plots is that it allows more accurate values, in particular for the displayed boxplot values.

nextflow-io / nf-co2footprint

Update Report - Add energy consumption plots and add values to the table #12