Filtering low read count samples during QC

Description of feature

Currently (2.5.0), empty input files can be ignored with --ignore_empty_input_files, and samples that are empty after trimming can be ignored with --ignore_failed_trimming. Whether a file counts as "empty" is decided from the compressed FASTQ file size (< 1.KB) using file.size(); see subworkflows/local/parse_input.nf and subworkflows/local/cutadapt_workflow.nf. A better solution might be file.countFastq(), as in this example:

channel
    .fromPath( 'data/yeast/reads/*.fq.gz' )
    .map ({ file -> [file, file.countFastq()] })
    .filter({ file, numreads -> numreads > 25000})
    .view ({ file, numreads -> "file $file contains $numreads reads" })

This way an exact read count threshold could be defined, and even adjusted by the user if desired. The disadvantage might be the computational overhead: unlike file.size(), counting reads means decompressing and reading every input file in full.
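
To make the trade-off concrete outside of Nextflow, here is a minimal Python sketch of the two kinds of check. It is only an illustration, not how countFastq() or the pipeline is implemented. The function names and the default thresholds are hypothetical, and it assumes gzip-compressed FASTQ files with exactly four lines per record.

```python
import gzip
import os


def is_below_size(path, min_bytes=1024):
    """Cheap check: compare the compressed file size to a byte threshold.

    Only file metadata is read, so the cost does not depend on file content.
    """
    return os.path.getsize(path) < min_bytes


def count_fastq_reads(path):
    """Exact check: decompress the file and count 4-line FASTQ records.

    Assumption: every record spans exactly four lines (no wrapped sequences).
    The whole file has to be read, so the cost grows with file size.
    """
    with gzip.open(path, "rt") as handle:
        n_lines = sum(1 for _ in handle)
    return n_lines // 4


def passes_read_threshold(path, min_reads=25000):
    """Keep a sample only if it has more than min_reads reads."""
    return count_fastq_reads(path) > min_reads
```

The size check stays cheap regardless of sequencing depth, whereas the read count scales with the data, which is the overhead mentioned above.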

nf-core / ampliseq

Filtering low read count samples during QC #556
