clemente-lab / mmeds-meta

A database for storing and analyzing omics data
https://mmeds.org

mismatch read in strip_error_barcodes #465

Open adamcantor22 opened 7 months ago

adamcantor22 commented 7 months ago

Describe the bug

When running demultiplexing of MTC 16S Run23, one sample, "EAR4", ended up with one more read in the reverse file than in the forward file after the strip_error_barcodes step. I used diff, zgrep, and awk to identify the extraneous read and removed it. This was that read:

@M02617:814:GW231120000:1:1114:21145:14315 2:N:0:CGTAGCGA-CTACTATA
GCCAGTTTGGGTCTTGGCTATTGTGAGATCAGATATGTTAAAGCCACTTTCGTAGTCTATTTTGTGTCAACTGGAGTTTTTTACAACTCAGGTGAGTTTTAGCTTTATTGGGGAGGGTGTGATCTAAAACACTCTTTACGCCGGCTTCTA
+
111>A113@111A1B1B1BDBAG333D3311FF3GF2D2AF2FBECGB11F?GEFFEF2DDDHHFGFHDGHF00F01A12//BFHHFGHHFEFGHAGHHGFGHHHHGFHGFH?FGGG/?E/GFGBGDF1BGGGFHHHHEFC?BC/BCCGF

It is completely unclear to me why this happened; it has never happened before. I will have to take a closer look at this read and its associated forward read. I suspect it has something to do with an unusual special character.
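For anyone who hits the same mismatch, the manual diff/zgrep/awk check could probably be scripted along these lines. This is only a rough sketch, not code from mmeds-meta: the function names `read_ids`, `unpaired_ids`, and `unpaired_ids_from_files` are hypothetical, and it assumes plain four-line FASTQ records whose mates share the header text before the first space. It counts lines rather than searching for `@`, since a quality line can legitimately start with `@`.

```python
import gzip
from itertools import islice


def read_ids(lines):
    """Yield the read ID of each 4-line FASTQ record in an iterable of lines."""
    it = iter(lines)
    while True:
        record = list(islice(it, 4))
        if not record:
            return
        # Tolerate a trailing chunk that contains only blank lines.
        if not any(line.strip() for line in record):
            continue
        header = record[0].strip()
        if not header.startswith("@"):
            raise ValueError(f"unexpected FASTQ header: {header!r}")
        # Assumption: both mates share the text before the first space.
        yield header[1:].partition(" ")[0]


def unpaired_ids(fwd_lines, rev_lines):
    """Return (IDs only in forward, IDs only in reverse)."""
    fwd = set(read_ids(fwd_lines))
    rev = set(read_ids(rev_lines))
    return fwd - rev, rev - fwd


def unpaired_ids_from_files(fwd_path, rev_path):
    """Same comparison for two gzip-compressed FASTQ files."""
    with gzip.open(fwd_path, "rt") as fwd, gzip.open(rev_path, "rt") as rev:
        return unpaired_ids(fwd, rev)
```

Calling `unpaired_ids_from_files` on the forward and reverse files for EAR4 (whatever their actual paths are) should return the ID of the extra reverse read in the second set, if the assumptions above hold for this run.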