Open lbeltrame opened 7 years ago
Actually, it can already be done, so closing. Sorry for the noise.
Spoke too soon. cutadapt has a -l option (similar to CROP) which is missing from atropos:
To shorten each read down to a certain length, use the --length option or the short version -l:
(https://cutadapt.readthedocs.io/en/stable/guide.html#shortening-reads-to-a-fixed-length)
I like this idea. Rather than copy over the -l option from cutadapt, I think I can repurpose -u/-U in such a way as to support the current usage while making it more flexible. Something like:
atropos -u 5,15 -U 10
would mean trim bases 5-15 in read 1 and trim off the first 10 bases of read 2.
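To make the proposed semantics concrete, here is a minimal Python sketch of how such a spec could be interpreted. It is not atropos code. The function name `apply_cut` is hypothetical, and it assumes that "bases 5-15" is 1-based and inclusive, and that a single number keeps the existing -u behaviour (positive removes from the start, negative removes from the end).

```python
def apply_cut(seq, spec):
    """Apply a hypothetical cut spec such as "10", "-3" or "5,15" to a string.

    Assumptions (not confirmed by atropos): ranges are 1-based and inclusive;
    a single positive number removes that many leading bases and a single
    negative number removes that many trailing bases.
    """
    parts = [int(p) for p in spec.split(",")]
    if len(parts) == 1:
        n = parts[0]
        # Single value: cut from the start (n >= 0) or from the end (n < 0).
        return seq[n:] if n >= 0 else seq[:n]
    if len(parts) == 2:
        start, end = parts
        if not 1 <= start <= end:
            raise ValueError(f"invalid range: {spec}")
        # Range: drop bases start..end and join what is left.
        return seq[: start - 1] + seq[end:]
    raise ValueError(f"invalid cut spec: {spec}")


read1 = "ACGTACGTACGTACGTACGT"
read2 = "ACGTACGTACGTACGTACGT"
print(apply_cut(read1, "5,15"))  # like -u 5,15
print(apply_cut(read2, "10"))    # like -U 10
```

The same function would have to be applied to the quality string so that sequence and qualities stay the same length.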
In the longer term (i.e. v2.0+), I'd like to provide the ability to define adapter layouts (probably via a config file, as this will be too complex for command line parameters). For example, with molecular barcodes, it would be nice to say "cut out bases 5-15 and make the sequence available as a variable that I can inject into the fastq header." (I think there is a trimmer out there that implements something like this, but I can't remember the name.)
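As a rough illustration of the molecular barcode case, the sketch below cuts a range out of a read and appends the removed sequence to the read ID. Everything here is an assumption for illustration: the function name `extract_umi`, the 1-based inclusive coordinates, and the `_UMI` suffix on the read ID are not part of atropos.

```python
def extract_umi(name, seq, qual, start, end):
    """Cut bases start..end (1-based, inclusive) out of a read and
    append them to the read ID in the FASTQ header.

    Hypothetical helper; the header format is an assumed convention.
    """
    umi = seq[start - 1 : end]
    new_seq = seq[: start - 1] + seq[end:]
    # Qualities must be cut the same way to stay aligned with the bases.
    new_qual = qual[: start - 1] + qual[end:]
    # Keep any header comment (text after the first space) unchanged.
    read_id, sep, comment = name.partition(" ")
    new_name = f"{read_id}_{umi}{sep}{comment}"
    return new_name, new_seq, new_qual


name, seq, qual = extract_umi(
    "read1 1:N:0", "AAAACCCCCCCCCCCGGGG", "I" * 19, 5, 15
)
print(f"@{name}\n{seq}\n+\n{qual}")
```

A config-driven layout would presumably generalise this to several named segments per read, but that design is left open above.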
That would be great indeed. I'm trying to move away from trimmomatic where possible because atropos is much faster.
Sometimes I just want to trim "from xx bases onward" or "take just the first xx bases". In trimmomatic (considerably slower than atropos) this is done with the HEADCROP and CROP commands, respectively.
The rationale for this is stripping unique molecular identifiers (UMIs) from the actual reads from some targeted sequencing panels.
HEADCROP can probably be replaced with --cut, since according to the documentation it just cuts a fixed number of bases.
In sequencing chemistries that produce fixed-length reads, like Illumina, a temporary CROP-like solution could be done with --cut -(read length - adapter), but that sounds a bit hacky.
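For reference, the two operations and the workaround can be written as plain string slices. This is only a sketch: the function names are made up, and `keep` is my reading of the "adapter" term above, i.e. the number of leading bases to retain.

```python
def headcrop(seq, n):
    """HEADCROP-like: remove the first n bases."""
    return seq[n:]


def crop(seq, length):
    """CROP-like: keep at most the first `length` bases."""
    return seq[:length]


def crop_via_negative_cut(seq, read_length, keep):
    """Workaround: remove (read_length - keep) bases from the 3' end.

    Matches crop() only when every read really is read_length long.
    """
    cut = -(read_length - keep)
    return seq[:cut] if cut < 0 else seq


full = "ACGTACGTAC"   # 10 bases, the nominal read length
short = "ACGTACGT"    # 8 bases, e.g. a read shortened by earlier trimming

print(headcrop(full, 3), crop(full, 6), crop_via_negative_cut(full, 10, 6))
print(crop(short, 6), crop_via_negative_cut(short, 10, 6))
```

On the full-length read the workaround and the real crop agree. On the shorter read the negative cut removes too much, which is why a proper length option would be cleaner than the workaround.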