YosefLab / Cassiopeia

A Package for Cas9-Enabled Single Cell Lineage Tracing Tree Reconstruction
https://cassiopeia-lineage.readthedocs.io/en/latest/
MIT License

Potential Bug in intBC extract #237

Open jefferyUstc opened 6 months ago

jefferyUstc commented 6 months ago

Hi Cassiopeia Team,

I carefully checked the output from Cassiopeia and found what looks like a bug affecting some reads in SRR11357694. For example:

AATCCAGCTAGCTGTGCAGCTCTCCGTTAGACATTTCAACTGCAGTAATGCTACCTCGTACTCACGCTTTCCAAGTGCTTGGCGTCGCATCTCGGTCCTTTGTACGCCGAAAAATGGCCTGACAACTAAGCTACGGCACGCTGCCATGTTGGGTCATAACGTGGTTCATCCGTGACCGAACATGTCATGGAGTAGCAGGAGCTATTAATTCGCGGAGGACAATGCGGTTCGTAGTCACTGTCTTCCGCAATCGTCCATCGCTCCTGCAGGTGGCCTAGAGGGCCC

Given the CIGAR string 34M1D127M7D124M, we can map this read to the reference manually. The alignment should be:

[image: manual alignment of the read against the reference]

The intBC reported by Cassiopeia is TCTCCGTTAGACATT. Based on the alignment above, it seems that the intBC should end with AT rather than ATT in this context.
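The manual mapping can be sketched in a few lines of Python by walking the CIGAR string and tracking read and reference coordinates side by side. This is only an illustration, not Cassiopeia's implementation: the helper names are hypothetical, and the reference interval 20 to 34 (0-based, end-exclusive) used for the intBC is an assumption chosen to match where the reported intBC starts in the read, since the configured barcode interval is not stated here.

```python
import re

QUERY_OPS = set("MIS=X")  # CIGAR operations that consume read bases
REF_OPS = set("MDN=X")    # CIGAR operations that consume reference bases


def parse_cigar(cigar):
    """Split a CIGAR string into (length, operation) pairs."""
    return [(int(n), op) for n, op in re.findall(r"(\d+)([MIDNSHP=X])", cigar)]


def query_length(cigar):
    """Number of read bases the CIGAR accounts for."""
    return sum(n for n, op in parse_cigar(cigar) if op in QUERY_OPS)


def reference_span(cigar):
    """Number of reference bases the alignment covers."""
    return sum(n for n, op in parse_cigar(cigar) if op in REF_OPS)


def deletions(cigar, ref_begin=0):
    """Return (0-based reference start, length) for every deletion."""
    found, r = [], ref_begin
    for n, op in parse_cigar(cigar):
        if op == "D":
            found.append((r, n))
        if op in REF_OPS:
            r += n
    return found


def bases_in_ref_interval(seq, cigar, start, end, query_begin=0, ref_begin=0):
    """Read bases aligned to reference positions [start, end), 0-based."""
    q, r, pieces = query_begin, ref_begin, []
    for n, op in parse_cigar(cigar):
        if op in "M=X":
            lo, hi = max(start, r), min(end, r + n)
            if lo < hi:
                pieces.append(seq[q + (lo - r): q + (hi - r)])
        if op in QUERY_OPS:
            q += n
        if op in REF_OPS:
            r += n
    return "".join(pieces)


cigar = "34M1D127M7D124M"
# Only the start of the read is needed to inspect the intBC region.
read_prefix = "AATCCAGCTAGCTGTGCAGCTCTCCGTTAGACATTTCAACTGCAGT"

print(query_length(cigar))     # 285
print(reference_span(cigar))   # 293
print(deletions(cigar))        # [(34, 1), (162, 7)]
# Assumed intBC interval: reference positions 20..33 (hypothetical).
print(bases_in_ref_interval(read_prefix, cigar, 20, 34))  # TCTCCGTTAGACAT
```

Under that assumed interval, the read contributes 14 bases ending in AT, because the 1D operation removes reference position 34 and the next read base (the second T) aligns to reference position 35. The second deletion starts at 0-based reference position 162, which is 163 in 1-based coordinates.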

More information for you:

cellbc_umi (index): GGGAATGAGGGATCTG-TGCTACCGTA
readName: GGGAATGAGGGATCTG_TGCTACCGTA_000117_0+
cellBC: GGGAATGAGGGATCTG
UMI: TGCTACCGTA
readCount: 117
Seq: AATCCAGCTAGCTGTGCAGCTCTCCGTTAGACATTTCAACTGCAGT...
CIGAR: 34M1D127M7D124M
QueryBegin: 0
ReferenceBegin: 0
AlignmentScore: 1309
r1: CCGAA[None]AAATG
r2: TAACG[163:7D]TGGTT
r3: ATTCG[None]CGGAG
allele: CCGAA[None]AAATGTAACG[163:7D]TGGTTATTCG[None]C...
intBC: TCTCCGTTAGACATT

Could you help figure out why this happens? Thanks.

mattjones315 commented 5 months ago

Hi @jefferyUstc,

Could you send us the command you're using to perform alignment and indel extraction, along with the entry from the molecule table that is giving you trouble?

Thanks, Matt