This is not surprising. The two versions use slightly different default parameters for filtering reads. I believe that what the pileup version does is more appropriate.
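If it helps to see exactly which barcodes flip between the two runs, here is a minimal sketch that compares two demuxlet result tables. It assumes tab-separated `.best`-style output with columns named `BARCODE`, `DROPLET.TYPE` and `DIFF.LLK.SNG.DBL`; those names are an assumption and may differ between demuxlet versions, so adjust them to your headers. The two small tables are made-up toy data, not real results.

```python
import csv
import io


def load_calls(tsv_text, type_col="DROPLET.TYPE", llk_col="DIFF.LLK.SNG.DBL"):
    """Map barcode -> (droplet type, SNG-vs-DBL LLK difference).

    Column names are assumptions; change them to match your output header.
    """
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return {row["BARCODE"]: (row[type_col], float(row[llk_col])) for row in reader}


def compare_calls(direct, piled):
    """Return agreement fraction, flip counts, and discordant barcodes."""
    shared = sorted(set(direct) & set(piled))
    flips = {}
    discordant = []
    for barcode in shared:
        (type_direct, llk_direct) = direct[barcode]
        (type_piled, llk_piled) = piled[barcode]
        if type_direct != type_piled:
            key = (type_direct, type_piled)
            flips[key] = flips.get(key, 0) + 1
            discordant.append((barcode, type_direct, type_piled, llk_direct, llk_piled))
    agreement = 1 - len(discordant) / len(shared) if shared else float("nan")
    return agreement, flips, discordant


# Toy data for illustration only (hypothetical barcodes and values).
DIRECT_TSV = (
    "BARCODE\tDROPLET.TYPE\tDIFF.LLK.SNG.DBL\n"
    "AAA\tSNG\t25.0\n"
    "CCC\tDBL\t-1.2\n"
    "GGG\tSNG\t0.8\n"
    "TTT\tSNG\t40.0\n"
)
PILEUP_TSV = (
    "BARCODE\tDROPLET.TYPE\tDIFF.LLK.SNG.DBL\n"
    "AAA\tSNG\t24.1\n"
    "CCC\tSNG\t0.9\n"
    "GGG\tDBL\t-0.5\n"
    "TTT\tSNG\t38.7\n"
)

agreement, flips, discordant = compare_calls(
    load_calls(DIRECT_TSV), load_calls(PILEUP_TSV)
)
print(f"agreement on shared barcodes: {agreement:.0%}")
for (from_type, to_type), count in sorted(flips.items()):
    print(f"{from_type} (direct) -> {to_type} (pileup): {count}")
for barcode, _, _, llk_direct, llk_piled in discordant:
    print(barcode, llk_direct, llk_piled)
```

For real runs you would pass the contents of each output file instead of the toy strings, for example `load_calls(open(path).read())`. Listing the LLK difference from both runs next to each discordant barcode makes it easy to check whether the flipped cells really are the borderline ones.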
Hi,
I am seeing differences in demuxlet output from the exact same input (same BAM file, same VCF file), depending on whether I run the pileup step first or run demuxlet directly on the BAM file.
The two runs agree on about 90% of the calls, although the LLK values differ for every cell, even those with identical calls. I am trying to understand the 10% of cells whose calls differ: which result is more trustworthy? Any ideas?
Here you can see the cells that differ: with the pileup step they are all called SNG, while running demuxlet directly calls them all DBL.
The same thing happens in the other direction: demuxlet without pileup calls these cells SNG, while with pileup they are called DBL.
Compared with the overall distribution of LLK differences, these are borderline cells with low values of the SNG-DBL LLK difference, but they are still 10% of the cells, so the choice has a huge impact.
Best, Marwan