biona001 closed this issue 4 years ago
Missing data in the array genotypes are usually imputed during phasing.
Minimac4 is suitable only when large reference panels are used. It uses some approximations that might not give good accuracy with smaller panels. For small panels (smaller than HRC), please use Minimac3 if possible.
Regards, Sayantan Das,
23andMe
Thank you both. According to @jonathonl, this must mean Minimac4 (and Minimac3 also?) will change the phased entries? So observed entries in my original (unphased) data may not be preserved?
Yes, correct.
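For anyone who wants to measure how often observed entries actually change, here is a minimal sketch that compares genotypes before and after phasing while ignoring allele order. The function names and the dictionary layout (keys of `(variant_id, sample_id)`) are made up for illustration and are not part of Minimac or any phasing tool; reading the VCF into such dictionaries is left out.

```python
def unphased(gt):
    """Return the alleles of a GT string as a sorted tuple, ignoring phase."""
    return tuple(sorted(gt.replace("|", "/").split("/")))


def count_changed(before, after):
    """Count entries observed in `before` whose alleles differ in `after`.

    Both arguments map (variant_id, sample_id) -> GT string (hypothetical layout).
    """
    changed = 0
    for key, gt in before.items():
        if "." in unphased(gt):
            continue  # missing in the original, so there is nothing to preserve
        if key in after and unphased(after[key]) != unphased(gt):
            changed += 1
    return changed


# Toy data: one het call that only gets phased, one missing call, one altered call.
before = {("rs1", "s1"): "0/1", ("rs1", "s2"): "./.", ("rs2", "s1"): "1/1"}
after = {("rs1", "s1"): "1|0", ("rs1", "s2"): "0|0", ("rs2", "s1"): "0|1"}
print(count_changed(before, after))
```

In the toy data only the `rs2`/`s1` entry counts as changed: `0/1` becoming `1|0` is just phasing, and the originally missing entry is skipped.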
I know that Minimac4's target genotype files should be phased (so using `|` as the allele separator for the GT field in the VCF file). However, should the missing data be `.|.` or `./.` or something else?
For instance, if I simulate 100 samples' complete genotypes (phase known) and mask 1% of the entries (represented by `./.`), would this file be "acceptable" for Minimac4?
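The masking step described above could be sketched roughly as follows. This is only an illustration under some assumptions: `mask_genotypes` is a made-up helper, the input is plain-text VCF lines with tab-separated columns, and GT is the first FORMAT subfield. The `missing` parameter is left configurable because which missing code Minimac4 accepts is exactly the open question here.

```python
import random


def mask_genotypes(vcf_lines, fraction=0.01, missing=".|.", seed=0):
    """Return a copy of the VCF lines with roughly `fraction` of GT entries masked."""
    rng = random.Random(seed)  # fixed seed so the same entries are masked each run
    masked = []
    for line in vcf_lines:
        if line.startswith("#"):
            masked.append(line)  # header lines pass through unchanged
            continue
        fields = line.rstrip("\n").split("\t")
        # Columns 0-8 are CHROM..FORMAT; sample columns start at index 9.
        for i in range(9, len(fields)):
            if rng.random() < fraction:
                # Assumes GT is the first subfield; other subfields are kept as-is.
                parts = fields[i].split(":")
                parts[0] = missing
                fields[i] = ":".join(parts)
        masked.append("\t".join(fields) + "\n")
    return masked


# Toy example: fraction=1.0 masks every entry, which makes the effect easy to see.
header = "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\ts1\ts2\n"
record = "1\t100\trs1\tA\tG\t.\tPASS\t.\tGT\t0|1\t1|1\n"
out = mask_genotypes([header, record], fraction=1.0)
print(out[1], end="")
```

Each entry is masked independently with probability `fraction`, so the masked share is about 1% rather than exactly 1%.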