FelixKrueger / TrimGalore

A wrapper around Cutadapt and FastQC to consistently apply adapter and quality trimming to FastQ files, with extra functionality for RRBS data
GNU General Public License v3.0

`--max_n` value as a fraction #137

Closed bounlu closed 2 years ago

bounlu commented 2 years ago

The `--max-n` parameter in cutadapt accepts either an absolute number of Ns or a fraction of the read length, whereas trim_galore's `--max_n` seems to accept only an integer count. Could trim_galore be made compatible with the fractional form? And what happens if I pass a fractional value to trim_galore as it is now?

trim_galore:

--max_n COUNT           The total number of Ns (as integer) a read may contain before it will be removed altogether.
                        In a paired-end setting, either read exceeding this limit will result in the entire
                        pair being removed from the trimmed output files.

cutadapt:

--max-n COUNT           Discard reads with more than COUNT 'N' bases. If COUNT is a number between 0 and 1,
                        it is interpreted as a fraction of the read length.

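To make the difference between the two help texts concrete, here is a minimal Python sketch of the rule described in the cutadapt help above: values between 0 and 1 are read as a fraction of the read length, anything else as an absolute count. The function name `exceeds_max_n`, the handling of empty reads, and treating the boundaries 0 and 1 as counts are assumptions made for illustration; this is not the actual code of cutadapt or trim_galore.

```python
def exceeds_max_n(sequence: str, max_n: float) -> bool:
    """Return True if a read has more Ns than the threshold allows.

    Hypothetical helper: values strictly between 0 and 1 are treated
    as a fraction of the read length, all other values as a count.
    """
    n_count = sequence.upper().count("N")
    if 0 < max_n < 1:
        # Fraction mode: compare the proportion of Ns to the threshold.
        if not sequence:
            # Assumption: an empty read is never discarded by this rule.
            return False
        return n_count / len(sequence) > max_n
    # Count mode: compare the absolute number of Ns to the threshold.
    return n_count > max_n


read = "ACGTNNNNNN"  # 10 bases, 6 of them N
print(exceeds_max_n(read, 0.5))  # fraction mode: 0.6 > 0.5
print(exceeds_max_n(read, 6))    # count mode: 6 is not more than 6
```

In a paired-end setting, the trim_galore help text above says the whole pair is dropped if either read exceeds the limit, which under this sketch would amount to `exceeds_max_n(r1, max_n) or exceeds_max_n(r2, max_n)`.
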
FelixKrueger commented 2 years ago

I suppose this shouldn't be too hard to implement. Is this something you would require?

FelixKrueger commented 2 years ago

I have now had a go at implementing this (e6ed90a63d3e7e34b93626e4f9aa0e3d69d52981). Could you clone the dev branch and see if it works for you?

bounlu commented 2 years ago

I really appreciate the quick effort, but I have since switched to cutadapt for its more flexible options overall.

FelixKrueger commented 2 years ago

That's fine, maybe someone else will want it at some point.