alexstaj / cutadapt

Automatically exported from code.google.com/p/cutadapt

optional trimming #54

Closed GoogleCodeExporter closed 9 years ago

GoogleCodeExporter commented 9 years ago
Hi, firstly thanks for cutadapt, it is the best trimmer I have found.

I was going to send you a GitHub pull request, but I am unsure whether to 
change a command-line argument. Here's the background:

In one project I had a mix of different adapters and I was using cutadapt's 
ability to redirect trimmed/untrimmed output into different files to separate 
them by the adapter.

These adapters were actually matches to the genomic sequence, and I found that 
if I left them in, I was able to save a lot more reads (due to having longer 
reads to map).

As cutadapt has a very good matching algorithm which works with partially 
overlapping sequences, I wanted to re-use it, so I added a "trim" parameter 
with a default of True, which preserves the current behavior.

Setting --trim=False allows you to match partially overlapping adapters and 
redirect the matching reads to a different output file without actually 
removing the adapter - in effect turning cutadapt into matchadapt as well.
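
To make the match-without-trimming idea concrete, here is a minimal sketch. It is not cutadapt's code: the names `find_3prime_adapter` and `process_read` are hypothetical, and the matching is exact only, whereas cutadapt's real algorithm also tolerates errors. It only illustrates how a `trim` flag separates "did the adapter match?" from "was the read shortened?".

```python
def find_3prime_adapter(read, adapter, min_overlap=3):
    """Return the start index of an exact adapter match in `read`, or None.

    A match is either the full adapter anywhere in the read, or a prefix of
    the adapter (at least `min_overlap` bases) at the very end of the read.
    Hypothetical helper; exact matching only.
    """
    pos = read.find(adapter)
    if pos != -1:
        return pos
    # Try progressively shorter adapter prefixes against the read's 3' end.
    for length in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:length]):
            return len(read) - length
    return None


def process_read(read, adapter, trim=True, min_overlap=3):
    """Return (sequence, matched). With trim=False the read is left intact."""
    pos = find_3prime_adapter(read, adapter, min_overlap)
    if pos is None:
        return read, False
    return (read[:pos] if trim else read), True


# Route reads into "matched" and "unmatched" bins without trimming them.
adapter = "AGATCGGAAG"  # made-up adapter sequence for illustration
reads = ["ACGTACGTAGATC", "ACGTACGTTTTT", "ACGTAGATCGGAAGTT"]
matched, unmatched = [], []
for read in reads:
    sequence, hit = process_read(read, adapter, trim=False)
    (matched if hit else unmatched).append(sequence)
```

With `trim=False`, the matched bin holds the full-length reads, which is the behavior described above; with `trim=True`, the same reads would be cut at the match position.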

I think this is a useful feature and a pretty trivial change, except that a 
match and a trim no longer always happen together. That means maybe we should 
rename the parameter "untrimmed_output" to "unmatched_output", which would 
break backwards compatibility.

(You could also add another variable "matched" to stats).

Thanks!

Original issue reported on code.google.com by davm...@gmail.com on 30 Oct 2012 at 1:31

GoogleCodeExporter commented 9 years ago
Hello and sorry I've not replied until now. I think this would be a good 
feature to have. Would you be willing to re-base your patch onto the current 
master of cutadapt and submit a pull request? I also suggest you call the 
parameter "--no-trim". 

Although --unmatched-output could simply be added as an alias for 
--untrimmed-output, keeping the old option and so not breaking backwards 
compatibility, I think it's not necessary to add it at all. The meaning of the 
options needs to be explained anyway, and if the short name doesn't fully 
represent what the option does, then that's ok.
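
For reference, the alias approach mentioned above could look roughly like the following sketch. It uses Python's standard `argparse` module and a hypothetical `build_parser` function; it is not cutadapt's actual option parser, and the help texts are made up for illustration.

```python
import argparse


def build_parser():
    """Build a hypothetical parser showing --no-trim and an option alias."""
    parser = argparse.ArgumentParser(prog="trimmer-sketch")
    # --no-trim switches trimming off; trimming stays on by default.
    parser.add_argument(
        "--no-trim", dest="trim", action="store_false", default=True,
        help="match adapters and route reads, but do not remove adapters")
    # Both spellings write to the same destination, so the old name keeps working.
    parser.add_argument(
        "--untrimmed-output", "--unmatched-output", dest="untrimmed_output",
        metavar="FILE", default=None,
        help="write reads without an adapter match to FILE")
    return parser


args = build_parser().parse_args(["--no-trim", "--unmatched-output", "rest.fastq"])
```

Because both option strings share one `dest`, code that reads `args.untrimmed_output` does not need to know which spelling the user typed.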

Regarding the 'matched' variable in stats: I think that's now already 
implemented although I've called it 'reads_changed'. The output could simply be 
changed to not say 'Trimmed reads' but 'Matched reads' instead.

Original comment by marcel.m...@tu-dortmund.de on 4 Dec 2012 at 9:54

GoogleCodeExporter commented 9 years ago
Hi Marcel. I've sent you a pull request via GitHub.

Original comment by davm...@gmail.com on 15 Dec 2012 at 2:21

GoogleCodeExporter commented 9 years ago
Thanks! I have merged the pull request and the option will be available in 
cutadapt 1.3.

Original comment by marcel.m...@tu-dortmund.de on 17 Dec 2012 at 8:53