I always keep the unphased sites. I can then decide, when running MSMC, whether to remove them using the --skipAmbiguous flag.
Best wishes, Stephan
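
For anyone who wants to see how many sites are affected before deciding, here is a minimal Python sketch that counts phased versus unphased heterozygous genotypes in a single-sample VCF. It relies only on the VCF convention that `|` separates phased alleles and `/` separates unphased ones. The function name `count_phasing` and the example records are made up for illustration and are not part of MSMC or msmc-tools.

```python
def count_phasing(vcf_lines):
    """Count phased and unphased heterozygous genotypes.

    Assumes single-sample VCF records: the sample column is field 10
    and the FORMAT column (field 9) contains a GT key.
    """
    counts = {"phased_het": 0, "unphased_het": 0}
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 10:
            continue
        format_keys = fields[8].split(":")
        if "GT" not in format_keys:
            continue
        genotype = fields[9].split(":")[format_keys.index("GT")]
        if "|" in genotype:
            alleles, key = genotype.split("|"), "phased_het"
        elif "/" in genotype:
            alleles, key = genotype.split("/"), "unphased_het"
        else:
            continue  # haploid or malformed genotype
        # Phase only matters for heterozygous calls with no missing allele.
        if "." not in alleles and len(set(alleles)) > 1:
            counts[key] += 1
    return counts


# Hypothetical records, for illustration only.
example = [
    "##fileformat=VCFv4.1",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1",
    "1\t100\t.\tA\tG\t.\tPASS\t.\tGT\t0|1",
    "1\t200\t.\tC\tT\t.\tPASS\t.\tGT\t0/1",
    "1\t300\t.\tG\tA\t.\tPASS\t.\tGT\t1|1",
    "1\t400\t.\tT\tC\t.\tPASS\t.\tGT:DP\t1/0:12",
]
print(count_phasing(example))
```

With a real file you could pass an open file handle instead of the list, since the function only iterates over lines.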
On 26 Feb 2016, at 21:25, hhu1 notifications@github.com wrote:
I am using MSMC on genomes sequenced by Complete Genomics. According to the Schiffels & Durbin 2014 paper, unphased sites would introduce bias into the population split analysis. However, when I looked into the run_shapeit.sh tool, it seems that phasing is performed only on SNVs present in the shapeit2 reference panel. Afterwards, both phased and unphased sites are merged into the same VCF file.
My question is: should I keep the unphased sites (those not present in the reference phasing panel) in my VCF file? If not, should I somehow fix the mask file to reflect the fact that only sites present in the reference panel are callable?
Thanks very much for your suggestions,
Hao Hu
— Reply to this email directly or view it on GitHub https://github.com/stschiff/msmc/issues/18.