Right now the default imputation scheme converts missing data into ancestral alleles, which creates a strong affinity among samples with lots of missingness. What if instead of this, missing data is imputed randomly as ancestral or derived. The idea is that the random values at the missing sites won't by chance create spurious affinities, and the signal in the true shared variation should be recovered. Just a thought....
Right now the default imputation scheme converts missing data into ancestral alleles, which creates a strong affinity among samples with lots of missingness. What if instead of this, missing data is imputed randomly as ancestral or derived. The idea is that the random values at the missing sites won't by chance create spurious affinities, and the signal in the true shared variation should be recovered. Just a thought....