vpc-ccg / scTagger

10x single cell short- and long-read RNA sequencing
MIT License

Extracted Barcode in lr_bc_matches does not match queried Barcodes #4

Open jkbenotmane opened 2 years ago

jkbenotmane commented 2 years ago

Hello, I was testing scTagger with 10x Visium data and was a little confused by the output of "match_lr_bc-trie.py". What are the column names for the output? I assumed they are: ReadID, Edit Distance, Strand, Adapter, Barcode. Is this correct?

Why does it report barcodes that are not in the short reads? Are those the error-riddled barcodes from the read itself, or how should I understand this? Is there any order to the matched barcodes separated by ',', or are they all equally good matches?

ghazaleb75 commented 2 years ago

Hi, the output of "match_lr_bc-trie.py" is:

jkbenotmane commented 2 years ago

Thanks for clarifying!

I also wondered why I see barcode matches in the final output file that are not in the queried sr-bc file. Where do they come from?

baraaorabi commented 2 years ago

Can you check if the extra barcodes are just the reverse complement of existing barcodes? It's probably that.

jkbenotmane commented 2 years ago

Hi, I checked for forward, reverse, complement, and reverse complement, and indeed all barcodes are now found.
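For anyone wanting to repeat this check, here is a minimal sketch of the orientation test described above. It is not part of scTagger: the function names `orientations` and `classify_barcodes` are hypothetical, and it assumes the reported and queried barcodes have already been loaded into plain Python lists of strings (how to parse them out of the output files is left out on purpose).

```python
# Hypothetical helpers for illustration only; not scTagger code.
# Translation table for complementing DNA bases (N stays N).
COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def orientations(barcode):
    """Return the four orientations of a barcode, keyed by name."""
    comp = barcode.translate(COMPLEMENT)
    return {
        "forward": barcode,
        "reverse": barcode[::-1],
        "complement": comp,
        "reverse_complement": comp[::-1],
    }


def classify_barcodes(reported, queried):
    """Map each reported barcode to the first orientation under which
    it appears in the queried set, or None if no orientation matches."""
    queried = set(queried)
    result = {}
    for bc in reported:
        result[bc] = next(
            (name for name, seq in orientations(bc).items() if seq in queried),
            None,
        )
    return result


if __name__ == "__main__":
    # Toy data, made up for the example.
    queried = ["AACGT"]
    reported = ["AACGT", "ACGTT", "GGGGG"]
    for bc, hit in classify_barcodes(reported, queried).items():
        print(bc, hit)
```

Barcodes that map to `None` are the ones worth a closer look, since they are not explained by any simple re-orientation of a queried barcode.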

Is there any reason why scTagger does not report only the matching input barcodes? I think that would make later analysis easier and more straightforward.