That’s interesting – when I changed that, I thought that would be the
correct behavior. In your case, of course, the problem is that the sequence is
extremely short and therefore is found due to chance quite often.
I won’t have time to look into this within the next few weeks, but perhaps the
following works in your case: You can try to append a character to your adapter
sequence that does not occur in your reads, such as an “X”. That is, simply
use the adapter sequence “TGACX”. Cutadapt won’t find the sequence
anywhere within the read, only at the end, where the X is just ignored. I hope
that gives you the same results as before.
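
A minimal sketch of the idea behind this workaround, in plain Python. It is not cutadapt's alignment algorithm: it uses exact string matching only and ignores error tolerance and partial adapter matches. The function names and the example read are made up for illustration. It contrasts a short adapter being found by chance inside the read with trimming that only happens at the end of the read.

```python
def trim_anywhere(read: str, adapter: str) -> str:
    """Remove the first exact occurrence of the adapter and everything after it."""
    pos = read.find(adapter)
    return read if pos == -1 else read[:pos]


def trim_at_end_only(read: str, adapter: str) -> str:
    """Remove the adapter only when the read ends with it."""
    if adapter and read.endswith(adapter):
        return read[: -len(adapter)]
    return read


# Hypothetical read: "TGAC" occurs by chance at index 2 and again at the very end.
read = "AATGACGGCCTTTGAC"
print(trim_anywhere(read, "TGAC"))     # AA
print(trim_at_end_only(read, "TGAC"))  # AATGACGGCCTT
```

With a four-base adapter, the chance hit near the start of the read removes almost everything, while the end-only variant removes just the trailing adapter.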
Original comment by marcel.m...@tu-dortmund.de
on 3 May 2013 at 4:00
I agree that in most cases, the new behaviour will be the correct one, so no
fault on your part there. Truth be told, my use-case is so simple that sed
might just do the trick.
I was mostly reporting the issue because it changes existing behaviour in a way
that I hadn't seen documented in the change-log. I'm currently rerunning an
analysis pipeline I built a few months back, and I updated my versions of
cutadapt/bowtie/tophat/etc before doing so. However the results were off, and
it took me a bit of time to figure out it was because of this change in
cutadapt. No big loss on my part (I'll just run it again), but it would have
been nice if the new behaviour had been implemented using a new switch rather
than -a, or if it had been noted in the change-log.
Changing the switch now would probably be a bad idea (there's certainly a
script out there that relies on the new behaviour, so it's a catch-22), but a
note in the change-log would be nice for future users!
Anyhow, this is a very minor gripe in an otherwise great piece of software.
Thank you very much for your efforts in writing it and making it available!
Best regards,
-Eric Fournier
Original comment by ericfour...@gmail.com
on 3 May 2013 at 4:25
Ok, so this was more of a “please document your changes better” request than
an “I cannot use cutadapt anymore because you broke it” request.
Sorry about the time it took you to find the problem; I know how annoying it is
when tools do that.
In my defense, I did document the change:
“Improved the alignment algorithm for better poly-A trimming when there are
sequencing errors. Previously, not the longest possible poly-A tail would be
trimmed.”
Admittedly, this does not include the information that trimming behavior in
general has changed. I simply didn’t foresee that this would be relevant for
non-poly-A trimming.
I try to avoid making backwards-incompatible changes and I’ll refrain even
more from doing so in the future. However, I considered the old behavior to be
a bug and I want the default to be as useful as possible, which is why I
changed the behavior of the -a parameter.
I’ll leave the report open for a while until I have some time to solve this
in a more general way, perhaps by adding a few more automatic tests.
Original comment by marcel.m...@tu-dortmund.de
on 3 May 2013 at 6:04
I'd like to second the utility of a lazy/ungreedy option. This would be
useful, for example, in iteratively trimming short repeats of variable number
and length.
Original comment by david.ko...@gmail.com
on 7 Jul 2013 at 5:41
I’m just going through old, unresolved issues. David, Eric, in case you’re
still interested in this, I wonder whether the workaround I mentioned above is
sufficient. The idea was to add some 'X' characters to the end of the adapter
sequence. The effect is that the adapter is only found if it’s at the end of
the read, but not within the read (where the 'X' characters cause mismatches).
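
The mismatch argument can be illustrated with a small Python sketch. This is a simplified model, not cutadapt's implementation: the helper `mismatches_at` and the example read are hypothetical, and it assumes that adapter positions hanging over the end of the read are simply not compared.

```python
def mismatches_at(read: str, adapter: str, start: int) -> int:
    """Count mismatches with the adapter placed at `start`.

    Adapter positions that extend past the end of the read are ignored,
    because zip() stops at the shorter of the two strings.
    """
    overlap = read[start:start + len(adapter)]
    return sum(r != a for r, a in zip(overlap, adapter))


read = "AATGACGGCCTTTGAC"   # hypothetical read containing "TGAC" twice
adapter = "TGACX"           # adapter with the extra 'X' appended

print(mismatches_at(read, adapter, 2))   # 1: the 'X' is compared against 'G'
print(mismatches_at(read, adapter, 12))  # 0: the 'X' hangs over the read end
```

Inside the read, the 'X' always lines up with a real base and costs one mismatch. Assuming no mismatches are tolerated for such a short adapter, that is enough to reject the internal hit, while the hit at the end of the read still matches.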
Original comment by marcel.m...@tu-dortmund.de
on 5 Nov 2014 at 1:39
Hi! Sorry for the delay in posting a reply, as I don't check gmail very often
and that's where my google code notifications get sent. The proposed workaround
seems perfectly fine with me. Thanks for following up!
Original comment by ericfour...@gmail.com
on 15 Dec 2014 at 6:25
Thanks for getting back! I’ll close this issue then. I’ll put the advice
about using the ‘X’ as part of the adapter sequence into the documentation.
Original comment by marcel.m...@tu-dortmund.de
on 16 Dec 2014 at 1:25
Original issue reported on code.google.com by
ericfour...@gmail.com
on 3 May 2013 at 3:07