EichlerLab / pav

Phased assembly variant caller

distinction between missing .|1 and het 0|1 deletion #30

Closed bioinfogit closed 1 year ago

bioinfogit commented 1 year ago

What is the difference between a deletion with a missing haplotype (.|1) and a heterozygous deletion (0|1)? In cases where one haplotype (h1 vs. h2) has no mapped contigs, does PAV call it as missing? Should we filter these calls?

paudano commented 1 year ago

In the genotype, h1 is first and h2 is second. If the genotype is a "." for one of the haplotypes, then there were no alignments in that haplotype to support a variant call. It's a sign that the locus might be unreliable, and you'll see more missing haplotypes in regions that are difficult to assemble and properly align, but it doesn't mean that the callable haplotype is always erroneous. I try to get orthogonal support from multiple callers including other assembly-based tools, like SVIM-asm, and read-based tools, like PBSV and Sniffles2, then use that to guide filtering decisions.
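
As a rough illustration of the distinction above (this is not part of PAV), here is a minimal Python sketch that reads the GT field from a VCF data line and separates calls with a missing haplotype from true heterozygous calls. The function names `gt_from_vcf_line` and `classify_gt` and the returned labels are hypothetical, and the sketch assumes a diploid genotype with h1 first and h2 second, as described above.

```python
def gt_from_vcf_line(line: str, sample_index: int = 0) -> str:
    """Extract the GT string for one sample from a tab-delimited VCF data line."""
    fields = line.rstrip("\n").split("\t")
    format_keys = fields[8].split(":")          # FORMAT column
    sample_values = fields[9 + sample_index].split(":")
    return sample_values[format_keys.index("GT")]


def classify_gt(gt: str) -> str:
    """Classify a two-haplotype genotype (h1 first, h2 second).

    Labels are hypothetical and chosen for this sketch only.
    """
    alleles = gt.replace("/", "|").split("|")
    if len(alleles) != 2:
        raise ValueError(f"expected a diploid genotype, got {gt!r}")
    h1, h2 = alleles
    if h1 == "." and h2 == ".":
        return "no_call"
    if h1 == ".":
        return "h1_missing"   # no alignments in h1 to support a call
    if h2 == ".":
        return "h2_missing"   # no alignments in h2 to support a call
    if h1 == h2:
        return "hom_ref" if h1 == "0" else "hom_alt"
    return "het"              # both haplotypes aligned, alleles differ


if __name__ == "__main__":
    # Made-up record for demonstration; not real PAV output.
    record = "chr1\t1000\tsv1\tACGT\tA\t.\tPASS\t.\tGT\t.|1"
    print(classify_gt(gt_from_vcf_line(record)))
    print(classify_gt("0|1"))
```

Calls labeled `h1_missing` or `h2_missing` could then be flagged for a closer look against other callers rather than dropped outright.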

bioinfogit commented 1 year ago

Thanks for the explanation