Closed ian-small closed 1 year ago
Thanks for the bug report, @ian-small. This is fixed in #261. A new patch version with the fix will be released in ~15 minutes. It may take an hour or so before the update reaches the package servers, but then you should be able to update to 3.1.2 and get the fix.
If you want it to work in the meantime, you can also dev BioSequences and port the 1-line fix from #261 into your local copy.
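For reference, the two routes described above might look roughly like this from the Julia package manager. This is only a sketch: it assumes the package is installed under its registered name, BioSequences, in the active environment, and the contents of the fix in #261 are not reproduced here.

```julia
using Pkg

# Route 1: once 3.1.2 has reached the package servers, update and confirm the version.
Pkg.update("BioSequences")
Pkg.status("BioSequences")   # should list v3.1.2 or later once the release is available

# Route 2 (interim): check out an editable copy of the package, then apply the
# one-line change from #261 by hand in that checkout.
# Pkg.develop("BioSequences")
```

After switching to the released version, Pkg.free("BioSequences") returns a dev'ed package to the registry-tracked release.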
Expected Behavior
LongSequence -> LongSubSeq -> LongSequence should generate a copy of the original sequence
Current Behavior
Under some circumstances, converting a LongSubSeq to a LongSequence gives incorrect results: specifically, some bases at the start of the sequence are replaced by ambiguity codes
Steps to Reproduce (for bugs)
julia> test = dna"CATTTTTTTTTTTTTTT"
17nt DNA Sequence:
CATTTTTTTTTTTTTTT

julia> testview = LongSubSeq(test, 1:17)
17nt DNA Sequence:
CATTTTTTTTTTTTTTT

julia> LongSequence(testview)
17nt DNA Sequence:
YATTTTTTTTTTTTTTT
This is a simple example; if test is a longer sequence, more ambiguous bases appear in the final sequence. As far as I have seen so far, the bug(?) only shows up when the view starts at the first position of the sequence and is more than 16 bases long.
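A small script along these lines could help narrow down which view lengths are affected. It is only a sketch: the helper name roundtrips and the 40-base test sequence are made up for illustration, and the comparison goes through plain strings so that it does not rely on any other BioSequences copy path.

```julia
using BioSequences

# Hypothetical helper: true if LongSequence -> LongSubSeq -> LongSequence
# reproduces the viewed region of `seq` exactly.
function roundtrips(seq::LongSequence, range::UnitRange{Int})
    v = LongSubSeq(seq, range)        # view into the original sequence
    copied = LongSequence(v)          # the conversion reported as faulty
    # Compare via strings so the check is independent of sequence indexing.
    return string(copied) == string(seq)[range]
end

seq = dna"CATTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTTT"   # arbitrary 40-base example

# Try every view that starts at position 1, then every view that starts at position 2.
for start in 1:2, stop in start:length(seq)
    roundtrips(seq, start:stop) || println("mismatch for view ", start, ":", stop)
end
```

On a version that includes the fix, the loop should print nothing; on an affected version, it should list the view ranges where the copy differs from the original.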
Context
I was writing a simple short-read assembler that builds a de Bruijn graph from views into the reads and then generates contigs by extending from the first view. Done like this, every contig it creates starts with a slew of ambiguity codes.
Your Environment