ymcki closed this issue 11 months ago
It's a double-edged sword. During manual inspection, some FPs can be easily caught and corrected (FP->TN), but implementing the same rule as a filter that applies to all variants could also turn a considerable number of TPs into FNs.
I think FPs caused by homopolymers are likely to be distributed evenly across the two haplotypes, whereas true mutations should be concentrated in only one of the two haplotypes.
Therefore, I think a simple statistical test should be able to distinguish the two when coverage is high enough.
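To make the idea concrete, here is a minimal sketch of one such test: a two-sided Fisher's exact test on the 2x2 table of haplotype versus allele support at a candidate site. This is only an illustration, not something Clair3 does; the function names and the `alpha` threshold are hypothetical, and it assumes the reads have already been assigned to haplotype 1 or 2 (for example from whatshap phasing), with unphased reads left out.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are haplotypes, columns are (alt-supporting, ref-supporting) reads.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of seeing x alt reads on haplotype 1
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum over all tables that are at most as likely as the observed one
    # (small relative tolerance guards against floating-point ties)
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-9))
    return min(1.0, p)


def looks_haplotype_specific(alt_h1, ref_h1, alt_h2, ref_h2, alpha=0.01):
    """True if alt support is significantly skewed toward one haplotype.

    alpha is an arbitrary illustrative threshold, not a tuned value.
    """
    return fisher_exact_two_sided(alt_h1, ref_h1, alt_h2, ref_h2) < alpha


# Alt reads almost entirely on haplotype 1: consistent with a real het variant
skewed = looks_haplotype_specific(15, 1, 0, 16)
# Alt reads spread over both haplotypes: consistent with homopolymer noise
spread = looks_haplotype_specific(5, 11, 6, 10)
```

Note that a true homozygous variant would also show alt reads on both haplotypes, so a test like this can only help with heterozygous candidates, and at low coverage it will rarely reach significance.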
Of course, if you want to go the deep learning route, you could also use manual inspection to generate a truth set to train on.
Dear Clair3 team,
I noticed that there are quite a few false positive variant calls (mostly single-base indels) near the ends of homopolymers, which are quite distracting when looking for disease-causing variants. I understand that this is mostly due to the nature of nanopore technology, but it would be great if Clair3 could also do something about it.
call the phase of an aligned read by looking at heterozygous variant calls), I am able to visualize what's going on for variant calls near the end of a homopolymer. With the help of phasing info from whatshap, I think I can pinpoint most of the false positive calls intuitively and significantly reduce the false positive rate.
wrong calls near the ends of homopolymers. It would also be great if a fixed BAM file were output for visualization in IGV.
go?
Thank you very much for your time.