jdidion / atropos

An NGS read trimming tool that is specific, sensitive, and speedy. (production)

RFE: Similar options to trimmomatic's CROP #50

Open lbeltrame opened 6 years ago

lbeltrame commented 6 years ago

Sometimes I just want to trim "from xx bases onward" or "take just the first xx bases". In trimmomatic (considerably slower than atropos) this is done with the HEADCROP and CROP commands, respectively.

The rationale for this is stripping unique molecular identifiers (UMIs) from the actual reads from some targeted sequencing panels.

HEADCROP can probably be replaced with --cut, which, according to the documentation, just cuts a fixed number of bases.

In sequencing chemistries that produce fixed-length reads, like Illumina, a temporary CROP-like workaround would be --cut -(read length - adapter), but that sounds a bit hacky.
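
To make the three operations concrete, here is a minimal Python sketch of their semantics on a single read. The function names `headcrop`, `crop`, and `cut` are illustrative only and are not part of trimmomatic, cutadapt, or atropos. It assumes `cut` follows the cutadapt convention, where a positive value removes bases from the start of the read and a negative value removes them from the end.

```python
from typing import Tuple


def headcrop(seq: str, qual: str, n: int) -> Tuple[str, str]:
    """HEADCROP-like: drop the first n bases (and their qualities)."""
    return seq[n:], qual[n:]


def crop(seq: str, qual: str, length: int) -> Tuple[str, str]:
    """CROP-like: keep only the first `length` bases."""
    return seq[:length], qual[:length]


def cut(seq: str, qual: str, n: int) -> Tuple[str, str]:
    """--cut-like (assumed convention): n > 0 trims the start, n < 0 trims the end."""
    if n >= 0:
        return seq[n:], qual[n:]
    return seq[:n], qual[:n]


if __name__ == "__main__":
    seq, qual = "ACGTACGTAC", "IIIIIIIIII"
    keep = 4
    # For a fixed-length read, CROP to `keep` bases matches a negative cut
    # of (read length - keep) bases from the end.
    print(crop(seq, qual, keep))
    print(cut(seq, qual, -(len(seq) - keep)))
```

The equivalence only holds when every read has the same length, which is why the negative `--cut` workaround breaks down for reads that have already been trimmed to variable lengths.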

lbeltrame commented 6 years ago

Actually, it can be already done, so closing. Sorry for the noise.

lbeltrame commented 6 years ago

Spoke too soon. cutadapt has a -l option (similar to CROP) which is missing from atropos:

To shorten each read down to a certain length, use the --length option or the short version -l:

(https://cutadapt.readthedocs.io/en/stable/guide.html#shortening-reads-to-a-fixed-length)

jdidion commented 6 years ago

I like this idea. Rather than copy over the -l option from cutadapt, I think I can repurpose -u/-U in such a way as to support the current usage while making it more flexible. Something like:

atropos -u 5,15 -U 10

would mean trim bases 5-15 in read 1 and trim off the first 10 bases of read 2.

In the longer term (i.e. v2.0+), I'd like to provide the ability to define adapter layouts (probably via a config file, as this will be too complex for command line parameters). For example, with molecular barcodes, it would be nice to say "cut out bases 5-15 and make the sequence available as a variable that I can inject into the fastq header." (I think there is a trimmer out there that implements something like this, but I can't remember the name.)
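
As a rough illustration of the "cut out bases 5-15 and inject into the header" idea, the sketch below operates on a single FASTQ record in plain Python. Nothing here is an atropos API: `extract_umi` is a hypothetical helper, the range is assumed to be 1-based and inclusive, and appending the UMI to the read ID with a colon is just one possible header convention.

```python
from typing import Tuple


def extract_umi(name: str, seq: str, qual: str,
                start: int, end: int) -> Tuple[str, str, str]:
    """Remove bases start..end (assumed 1-based, inclusive) and tag the header.

    Returns (new_name, new_seq, new_qual). The UMI is appended to the read ID,
    i.e. the part of the header before the first space.
    """
    i, j = start - 1, end              # convert to a Python half-open slice
    umi = seq[i:j]
    read_id, sep, comment = name.partition(" ")
    new_name = f"{read_id}:{umi}{sep}{comment}"
    return new_name, seq[:i] + seq[j:], qual[:i] + qual[j:]


if __name__ == "__main__":
    # Toy read: 4 bases, then an 11-base barcode at positions 5-15, then 4 bases.
    seq = "AAAA" + "C" * 11 + "GGGG"
    qual = "I" * len(seq)
    print(extract_umi("read1 1:N:0", seq, qual, 5, 15))
```

Keeping the UMI in the read ID rather than the comment field is a design choice made here because some aligners drop the comment; a real implementation would likely make that configurable.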

lbeltrame commented 6 years ago

That would be great indeed. I'm trying to move away from trimmomatic where possible because atropos is much faster.