Open timothymillar opened 3 months ago
We could also use a sparse equivalent of GP if we specify a minimum posterior probability to report. E.g., a threshold of >= 0.01 would work well with MCMC approximations. Alternatively, we could report a phred score of 0 for genotypes with non-zero probability, but this is confusing.
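A minimal sketch of the thresholded GP variant, assuming the posteriors are available as a plain sequence indexed by genotype. The function name `sparse_gp`, the `index=value` layout and the two-decimal precision are illustrative choices, not an existing field definition or API.

```python
def sparse_gp(probs, min_prob=0.01, precision=2):
    """Format genotype posteriors as 'index=prob' pairs.

    Genotypes with a posterior below `min_prob` are omitted.
    All names and defaults here are hypothetical.
    """
    pairs = [
        f"{i}={p:.{precision}f}"
        for i, p in enumerate(probs)
        if p >= min_prob
    ]
    # "." is the VCF missing-value marker, used if nothing passes the threshold
    return ",".join(pairs) if pairs else "."


posteriors = [0.9, 0.0, 0.09, 0.005, 0.005]
print(sparse_gp(posteriors))  # genotypes 1, 3 and 4 fall below 0.01 and are dropped
```

With a threshold like this, the number of reported entries is bounded by 1 / min_prob, regardless of how many genotypes are possible.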
Investigate a sparse encoding of genotype posteriors, e.g. an equivalent to the PP field (phred-scaled probabilities) in which zero values are omitted. This can be represented as a map of genotype index to non-zero phred-scaled probabilities. This effectively removes genotypes with probabilities <= 0.1. An example may look like "0=10,2=3,7=1" and have the String type in VCF.
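A rough sketch of how such a field could be written and read back. It assumes the phred score of a posterior p is round(-10 * log10(1 - p)), which is the reading under which scores of zero correspond to probabilities of roughly 0.1 or less. The function names and the score cap of 60 are hypothetical, not part of any existing API.

```python
import math


def phred(prob, max_score=60):
    """Integer phred score of a posterior, assumed to be round(-10*log10(1 - prob))."""
    if prob >= 1.0:
        return max_score  # a probability of 1 would otherwise give an infinite score
    return min(max_score, round(-10 * math.log10(1.0 - prob)))


def encode_sparse_pp(probs):
    """Map genotype index to phred score, omitting scores that round to zero."""
    scores = ((i, phred(p)) for i, p in enumerate(probs))
    kept = [f"{i}={q}" for i, q in scores if q > 0]
    return ",".join(kept) if kept else "."  # "." is the VCF missing value


def decode_sparse_pp(field):
    """Parse 'index=score' pairs back into a dict; omitted genotypes are absent."""
    if field == ".":
        return {}
    return {int(k): int(v) for k, v in (item.split("=") for item in field.split(","))}


print(encode_sparse_pp([0.6, 0.0, 0.3, 0.05, 0.05]))
print(decode_sparse_pp("0=10,2=3,7=1"))
```

Because the number of entries varies per sample, a String type (or an unbounded Number) is the simplest way to carry this in a FORMAT field, at the cost of tools not being able to interpret the values without custom parsing.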