replikation / poreCov

SARS-CoV-2 workflow for nanopore sequence data
https://case-group.github.io/
GNU General Public License v3.0

Only calculate NanoPlot after read filtering step #236

Closed: hoelzer closed this issue 2 years ago

hoelzer commented 2 years ago

Right now, NanoPlot and read filtering run in parallel. This means that if someone provides a fastq_pass folder with all 96 barcodes (many of them empty), the pipeline will still start 96 NanoPlot processes. On a laptop with limited resources this takes extra time.

Suggestion: filter the reads first to get rid of samples/barcodes that are empty anyway, and then run NanoPlot.
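
A minimal sketch of that pre-filter idea, assuming each barcode is a subfolder of fastq_pass that holds .fastq or .fastq.gz files. The helper names `has_reads` and `non_empty_barcodes` are hypothetical and not part of poreCov; the pipeline itself is a Nextflow workflow, so this Python only illustrates the logic of skipping empty barcodes before any QC step is started.

```python
import gzip
from pathlib import Path
from typing import List


def has_reads(fastq: Path) -> bool:
    """Return True if the FASTQ file (plain or gzipped) holds at least one record."""
    opener = gzip.open if fastq.suffix == ".gz" else open
    with opener(fastq, "rt") as handle:
        # A FASTQ record starts with an '@' header line; an empty file yields ''.
        return handle.readline().startswith("@")


def non_empty_barcodes(fastq_pass: Path) -> List[Path]:
    """List barcode subfolders (hypothetical layout) that contain at least one read."""
    keep = []
    for folder in sorted(p for p in fastq_pass.iterdir() if p.is_dir()):
        files = [f for f in folder.iterdir()
                 if f.name.endswith((".fastq", ".fastq.gz"))]
        if any(has_reads(f) for f in files):
            keep.append(folder)
    return keep
```

Only the folders returned by `non_empty_barcodes` would then be handed to NanoPlot, so a run that used 24 of 96 barcodes would start 24 QC processes instead of 96.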

replikation commented 2 years ago

The counter-argument would be that one would like to use NanoPlot to check what is going on in the sequencing run, so naturally you don't want to look at filtered results (normally). Otherwise, there is no real utility in NanoPlot (?)

hoelzer commented 2 years ago

Yeah, also true :) But running NanoPlot on empty files might not make much sense? Or do you think it's better to also run the empty files and just let NanoPlot fail (I think there will be no output in that case anyway)?

It might also just be an issue for my runs right now, where we used 24 of 96 barcodes and compute on laptops with limited resources. It simply takes quite some additional time to let NanoPlot fail on the 72 barcodes that are empty ;)

But I get your point, maybe it's better to leave it as is so users get the full qc output for all input barcode folders.

replikation commented 2 years ago

hoelzer commented 2 years ago

Then let's stick to the current implementation. Maybe it was just annoying for me at the moment because I'm running multiple batches on small laptop machines ;)

replikation commented 2 years ago

Maybe a "light mode" would be good: just the bare minimum, or something similar.

Martin Hölzer wrote on Thu, 4 Aug 2022, 16:05:

Then let's stick to the current implementation. Maybe it was just annoying for me at the moment because I'm running multiple batches on small laptop machines ;)
