niemasd / ViReflow

An elastically-scaling automated AWS pipeline for viral consensus sequence generation
GNU General Public License v3.0

Add "stop mapping reads at X successfully-mapped reads" as optional feature #11

Closed niemasd closed 3 years ago

niemasd commented 3 years ago

If a FASTQ file has far more sequencing depth than necessary, it would be nice to have an optional feature that stops mapping reads after X successfully-mapped reads (where X is user-defined) to speed up downstream analyses. For example, if the sequencing depth is 1000X but the user is confident that 100X coverage is sufficient for high-confidence results, they can truncate at the number of successfully-mapped reads that equates to 100X coverage.

https://github.com/ucsd-ccbb/covid_sequencing_analysis_pipeline/blob/main/pipeline/sarscov2_consensus_pipeline.sh#L78
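
To make the coverage arithmetic in the request concrete, here is a rough Python sketch (not ViReflow's actual code). It estimates X from a target coverage, then passes SAM records through until X primary mapped reads have been seen. The function names, the 150 bp read length, and the use of the 29,903 bp SARS-CoV-2 reference length are assumptions for illustration only. The flag bits (0x4 unmapped, 0x100 secondary, 0x800 supplementary) come from the SAM format.

```python
import math

def reads_for_coverage(genome_length, read_length, target_coverage):
    """Approximate number of mapped reads needed to reach target_coverage.

    Assumes every mapped read covers read_length bases of the reference.
    """
    return math.ceil(target_coverage * genome_length / read_length)

def truncate_mapped(sam_lines, max_mapped):
    """Yield SAM lines until max_mapped primary mapped reads have been emitted.

    Header lines (starting with '@') are always passed through. Unmapped,
    secondary, and supplementary records are passed through but not counted.
    """
    mapped = 0
    for line in sam_lines:
        if line.startswith("@"):
            yield line
            continue
        if mapped >= max_mapped:
            return  # stop consuming the mapper's output
        yield line
        flag = int(line.split("\t")[1])
        if flag & (0x4 | 0x100 | 0x800) == 0:
            mapped += 1

# Hypothetical inputs: 29,903 bp genome, 150 bp reads, 100X target coverage.
x = reads_for_coverage(genome_length=29903, read_length=150, target_coverage=100)

# Tiny made-up SAM stream (fields after the reference name are omitted).
example = ["@HD\tVN:1.6", "r1\t0\tref", "r2\t4\t*", "r3\t16\tref", "r4\t0\tref"]
kept = list(truncate_mapped(example, max_mapped=2))
```

With these assumed inputs, `x` is the read count at which mapping could stop, and `kept` contains the header plus records `r1` through `r3`. Real reads vary in length and may be soft-clipped, so a value computed this way is only an estimate of the resulting coverage.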

niemasd commented 3 years ago

Initial attempt added in this commit: https://github.com/niemasd/ViReflow/commit/efdceecd757a84a155769584c7b7f5d95a489023

Need to test once the updated ViReflow Docker image is ready.

niemasd commented 3 years ago

Fixed in this commit (read mapper has non-zero exit status when it gets truncated): https://github.com/niemasd/ViReflow/commit/6e03e3d5e84e7ecccb843cb4fbafbb9f8f3b7041
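
The non-zero exit status mentioned above is a general property of truncated pipelines, and a small Python sketch can reproduce it. This is not the code from the commit. `PRODUCER` is a hypothetical stand-in for a read mapper, and `take_first_lines` is an illustrative helper name. When the consumer stops reading and closes the pipe, the producer's next write fails and it exits with a non-zero status, so a wrapper has to treat that status as expected whenever it truncated the stream itself.

```python
import itertools
import subprocess
import sys

# Hypothetical stand-in for a read mapper: prints one line per "read" forever.
PRODUCER = "import itertools\nfor i in itertools.count():\n    print('read', i)\n"

def take_first_lines(cmd, limit):
    """Run cmd, keep only its first `limit` stdout lines, and return
    (lines, returncode)."""
    proc = subprocess.Popen(
        cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.DEVNULL,  # hide the producer's broken-pipe message
        text=True,
    )
    lines = [line.rstrip("\n") for line in itertools.islice(proc.stdout, limit)]
    proc.stdout.close()  # closing the pipe makes the producer's next write fail
    returncode = proc.wait()
    return lines, returncode

limit = 5
lines, returncode = take_first_lines([sys.executable, "-c", PRODUCER], limit)

# A non-zero status is only an error if we did not cut the stream ourselves.
truncated = len(lines) == limit
succeeded = returncode == 0 or truncated
```

Here `returncode` is non-zero even though nothing went wrong, which is why `succeeded` also accepts the truncated case. In a shell pipeline the same situation typically appears as the upstream command being terminated by SIGPIPE once a downstream command such as `head` exits.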