Open apeltzer opened 3 months ago
Ask @fmalmeida what he had to do to make this work :)
I thought there was a "don't cache" setting somewhere, and it was intended, but there's not. It happens on every nf-core pipeline...
@ewels Any thoughts on where this is coming from?
Might be better to move this to tools.
Thought the same initially, but it's not been set here. Not a major problem here anyway (and negligible runtime too, considering how much $$$ goes into demuxing an entire flowcell ;-)).
I wouldn't say the runtime is negligible... on a recent large flow cell, multiqc ran for ~1h (not sure how much time was wasted on staging-in files though).
I also never got why one would intentionally not resume multiqc...
Hey hey hey,
What keeps the MultiQC module from caching is partly the cache = false that is sometimes added, as @edmundmiller mentioned, but mainly the fact that a lot of run-specific metadata is added to the MultiQC summary map. This makes the input map of metadata different for every run, so the task is never cached; see here:
https://github.com/nf-core/demultiplex/blob/master/lib/NfcoreTemplate.groovy#L72-L95
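To illustrate the point above, here is a minimal Python sketch, not Nextflow's actual hashing code. It assumes a task's cache key is simply a hash over its inputs; the field names (`runName`, `start`), the example values, and the helpers `task_key` and `stable_view` are all hypothetical and only stand in for the run-specific entries of the summary map.

```python
import hashlib
import json

# Hypothetical names for run-specific fields; the real summary map differs.
VOLATILE_KEYS = {"runName", "start"}


def task_key(inputs: dict) -> str:
    """Hash a task's inputs into a cache key (a stand-in for a real task hash)."""
    payload = json.dumps(inputs, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def stable_view(summary: dict) -> dict:
    """Drop the fields that change on every execution."""
    return {k: v for k, v in summary.items() if k not in VOLATILE_KEYS}


# Made-up example values for two executions of the same pipeline.
first_run = {
    "pipeline": "nf-core/demultiplex",
    "runName": "angry_curie",
    "start": "2024-01-01T10:00:00",
}
resumed_run = {
    "pipeline": "nf-core/demultiplex",
    "runName": "happy_turing",
    "start": "2024-01-01T12:30:00",
}

# The full maps hash differently, so a cached result would never be reused.
keys_differ = task_key(first_run) != task_key(resumed_run)

# Without the run-specific fields the keys match and a resume could hit the cache.
stable_keys_match = task_key(stable_view(first_run)) == task_key(stable_view(resumed_run))

print(keys_differ, stable_keys_match)
```

Under these assumptions, any input that embeds a timestamp or run name changes the key on every execution, which is why the task is re-run even when nothing else has changed.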
This means it's not so easy to adapt this without changing what is ingested into workflow_summary_mqc.yaml and methods_description_mqc.yaml, as some of the variables contain timestamps and are therefore updated on every resume. To be more explicit, let's close this ticket, set cache = false in conf/modules.config for multiqc (so that users get what they think they will get) and leave it as is. If we at some point decide to take this on, I would suggest we can still do this in a next / patch release. Thanks for your points @fmalmeida :)
I assessed this in the current dev branch (commit id: 892b9d8cc5beade252777428bd6df440dd874468). The main conflicting channel is ch_multiqc_files, which contains two files that are different with each execution: workflow_summary_mqc.yaml and methods_description_mqc.yaml.
These files are modified with each execution because they contain data such as the execution timestamp and runName, among others. In order to have multiqc resume we would need to:
Thanks for the analysis... If this is to be changed, then it should happen at the pipeline template level in nf-core/tools.
I will file an issue there and x-ref this ticket here, so we can take it up once the wider community has agreed on it... :) See this one: https://github.com/nf-core/tools/issues/3110
unclear if this is intended or not --> verify