timothymillar closed this issue 2 years ago
Related to #145
The clearest way to signpost a failure to assemble any haplotypes at a locus may be to use the filter column and skip filtered loci in downstream analysis by default.
Done in #147
In `mchap assemble`, if the reference allele is not assembled for any sample, it is still reported in the output because the VCF format requires it. This can have unexpected downstream side effects. For example, if those samples are then re-called using `mchap call`, the reference allele will be used as a valid haplotype. This can even lead to situations where the reference allele is the only input allele, producing genotypes that are homozygous for that allele even though there is no evidence of it being present in any sample.

A solution to this problem would be for `mchap assemble` to report, in an INFO field/flag, when the reference allele is only reported as a requirement of the VCF format rather than actually being observed. It could then be excluded in downstream analysis. This may leave no input haplotypes for the downstream analysis, in which case all haplotypes would be reported as unknown.
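
As a rough sketch of how a downstream step might honour such a flag, the snippet below picks the input haplotypes from a single VCF record. The flag name `REFMASKED` and the helpers `parse_info` and `input_haplotypes` are hypothetical names chosen for illustration; they are not taken from the MCHap code base.

```python
def parse_info(info):
    """Parse a VCF INFO column into a dict; valueless flags map to True."""
    fields = {}
    if info in ("", "."):
        return fields
    for item in info.split(";"):
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def input_haplotypes(vcf_line, flag="REFMASKED"):
    """Return the alleles of one VCF record to use as input haplotypes.

    The reference allele is dropped when the (hypothetical) flag marks it
    as present only because the VCF format requires a REF column.
    """
    columns = vcf_line.rstrip("\n").split("\t")
    ref, alt, info = columns[3], columns[4], columns[7]
    alts = [] if alt == "." else alt.split(",")
    if flag in parse_info(info):
        return alts  # may be empty: no known haplotypes at this locus
    return [ref] + alts


# Made-up example records (tab-separated, first 8 VCF columns)
observed = "chr1\t100\t.\tACGT\tACTT\t.\tPASS\tAN=4"
masked = "chr1\t200\t.\tACGT\tACTT,AGGT\t.\tPASS\tREFMASKED;AN=4"
ref_only = "chr1\t300\t.\tACGT\t.\t.\tPASS\tREFMASKED"

print(input_haplotypes(observed))
print(input_haplotypes(masked))
print(input_haplotypes(ref_only))
```

An empty list for the last record corresponds to the case described above, where no input haplotypes remain and all haplotypes would be reported as unknown.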