YeoLab / gscripts

General Use Scripts and Helper functions
MIT License

Demultiplexing Issue - X1A / X1B Adapters #93

Open BuddahKat opened 5 years ago


Hi,

I've noticed an issue when trying to demultiplex reads containing the X1A / X1B RNA adapters. For some reason the script puts all of the reads in one file (so they are not demultiplexed) and also trims the reads incorrectly (an extra 2 nt after the adapter is removed from each read). Maybe it has something to do with the adapters beginning with Ns? I can't seem to figure it out, so any help would be appreciated. I've included some test files below.

X1B (forward)
@A00405:18:H3HJ3DSXX:4:1101:9372:1000 1:N:0:ATTACTCG+GGCTCTGA
AACTATGCTATTTCAGGGGAGCCATCCGAGGATTGCGGGAGAAGGCATGGGGCAGGAGCAACCTGTTAGTGGATGGAGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGATCTCGTATGCCGTCTTCTGCTTTAAAAAAT

X1A (forward)
@A00405:18:H3HJ3DSXX:4:1101:12753:1000 1:N:0:ATTACTCG+GGCTCTGT
TATCCCCTATATCAGAGGCTGGACATCAATGGCAGATGATGCCAAAGTCATAGGGTTTTGCCTTTGTGTACCATGCATAGGCTCCAAAGCATGACCTAGGTATGGATAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGA

(The first 14 nt of each read are what was trimmed: the 12-nt adapter plus two extra nt beyond it.)

After running 'demux_paired_end.py' with the following barcode file:

NNNNNCCTATAT  X1A
NNNNNTGCTATT  X1B

these are the resulting reads:

@CTCCATCCAC:A00405:18:H3HJ3DSXX:4:1101:9372:1000 1:N:0:ATTACTCG+GGCTCTGA
AGGGGAGCCATCCGAGGATTGCGGGAGAAGGCATGGGGCAGGAGCAACCTGTTAGTGGATGGAGAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGATCTCGTATGCCGTCTTCTGCTTTAAAAAAT

@TATCCATACC:A00405:18:H3HJ3DSXX:4:1101:12753:1000 1:N:0:ATTACTCG+GGCTCTGT
GAGGCTGGACATCAATGGCAGATGATGCCAAAGTCATAGGGTTTTGCCTTTGTGTACCATGCATAGGCTCCAAAGCATGACCTAGGTATGGATAAGATCGGAAGAGCACACGTCTGAACTCCAGTCACATTACTCGA
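The amount trimmed can be checked by comparing each input read with its output. Below is a minimal sketch that uses only the first 27 nt of each input read and the first 13 nt of each output read, which is enough to show the removed prefix. The helper `trimmed_prefix` is hypothetical and is not part of gscripts.

```python
ADAPTER_LEN = 12  # length of NNNNNCCTATAT and NNNNNTGCTATT


def trimmed_prefix(original, demuxed):
    """Return the 5' bases that were removed, assuming `demuxed` is a suffix of `original`."""
    if not original.endswith(demuxed):
        raise ValueError("demuxed read is not a suffix of the original read")
    return original[: len(original) - len(demuxed)]


# (label, start of input read, start of output read), truncated at the same position
pairs = [
    ("X1B", "AACTATGCTATTTCAGGGGAGCCATCC", "AGGGGAGCCATCC"),
    ("X1A", "TATCCCCTATATCAGAGGCTGGACATC", "GAGGCTGGACATC"),
]

for label, original, demuxed in pairs:
    prefix = trimmed_prefix(original, demuxed)
    extra = prefix[ADAPTER_LEN:]  # bases removed beyond the 12-nt adapter
    print(label, prefix, len(prefix), extra)
```

For both reads the removed prefix is 14 nt long, two more than the 12-nt adapter.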

Both of these reads end up in the 'X1B' output file even though the second one is 'X1A'. The 'X1A' output file is completely empty. Again, 2 nt that aren't part of the adapter are also trimmed.
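For comparison, here is a minimal sketch of the expected assignment. It assumes that each N in a barcode matches any base and that the barcode is anchored at the 5' end of the read. Those are assumptions about the intended behavior, not a description of how 'demux_paired_end.py' works internally. The functions `matches` and `assign` are hypothetical.

```python
# Barcode file contents from above: barcode sequence -> output name
barcodes = {"NNNNNCCTATAT": "X1A", "NNNNNTGCTATT": "X1B"}


def matches(barcode, read):
    """True if the read starts with the barcode, treating N as a wildcard (assumed semantics)."""
    if len(read) < len(barcode):
        return False
    return all(b == "N" or b == r for b, r in zip(barcode, read))


def assign(read):
    """Return the names of every barcode that matches the start of the read."""
    return [name for bc, name in barcodes.items() if matches(bc, read)]


# First bases of the two input reads shown above, keyed by the read's x coordinate
reads = {
    "9372": "AACTATGCTATTTCAGGGGAGCC",
    "12753": "TATCCCCTATATCAGAGGCTGG",
}

for read_id, seq in reads.items():
    print(read_id, assign(seq))
```

Under these assumptions, read 9372 matches only X1B and read 12753 matches only X1A, so the two reads would be expected to land in different output files.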

Any ideas what's going on?