statgen / Minimac4

The number of imputed genotypes is increased compared to the original VCF file #59

Open liubh9 opened 1 year ago

liubh9 commented 1 year ago

Target VCF: [image] Imputed VCF: [image]

When I got the imputed VCF, I compared its stats file with that of the target VCF. I found that the imputed VCF has nearly three times as many SNPs as the target VCF, even after I used --all-typed-sites. I also found a minor bug: --output-format doesn't work, so the output file is always in BCF format. The Minimac4 version I used is v4.1.2, and the command is shown below.

[image]

jonathonl commented 1 year ago

There are supposed to be more variant records in the imputed VCF than in the original. These new records are imputed from the reference panel. If you only want to impute missing genotypes of variants that already exist in your original VCF, then I would recommend using phasing software like Eagle (https://github.com/poruloh/Eagle). You should be using phasing software prior to running Minimac4 anyway. Minimac4 assumes the target VCF is phased.
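To see how many of the extra records come from the reference panel rather than from the target file, one option is to compare the two site lists directly. The following is a minimal sketch, not part of Minimac4: the helper names `read_sites` and `compare_sites` are hypothetical, the toy VCF text is made up for illustration, and sites are matched on (CHROM, POS, REF, ALT) only.

```python
import io


def read_sites(handle):
    """Collect (CHROM, POS, REF, ALT) keys from VCF text lines, skipping headers."""
    sites = set()
    for line in handle:
        if line.startswith("#") or not line.strip():
            continue
        # Only the first five tab-separated columns are needed for the site key.
        chrom, pos, _id, ref, alt = line.split("\t", 5)[:5]
        sites.add((chrom, int(pos), ref, alt))
    return sites


def compare_sites(target_sites, imputed_sites):
    """Count sites shared by both files and sites unique to each file."""
    return {
        "in_both": len(target_sites & imputed_sites),
        "imputed_only": len(imputed_sites - target_sites),
        "target_only": len(target_sites - imputed_sites),
    }


# Toy data (hypothetical): the target has 2 sites, the imputed output has 4.
HEADER = "##fileformat=VCFv4.2\n#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\n"
target_vcf = HEADER + "20\t100\t.\tA\tG\t.\t.\t.\n20\t300\t.\tC\tT\t.\t.\t.\n"
imputed_vcf = HEADER + (
    "20\t100\t.\tA\tG\t.\t.\t.\n"
    "20\t200\t.\tG\tA\t.\t.\t.\n"
    "20\t300\t.\tC\tT\t.\t.\t.\n"
    "20\t400\t.\tT\tC\t.\t.\t.\n"
)

summary = compare_sites(
    read_sites(io.StringIO(target_vcf)),
    read_sites(io.StringIO(imputed_vcf)),
)
print(summary)
```

For real files, the same functions can be fed a handle from `gzip.open(path, "rt")` for a .vcf.gz file. A large `imputed_only` count is the expected result when the reference panel contains more variants than the target.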

What makes you believe that your output is BCF? The CI tests use vcf.gz output (https://github.com/statgen/Minimac4/blob/master/test/simple-test.sh#L21) without issue.
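One way to settle whether an output file is really BCF or compressed VCF is to look at its first bytes rather than its extension. The sketch below is an illustration under stated assumptions, not something taken from Minimac4: the function name `sniff_variant_format` is hypothetical, and it relies only on the gzip magic bytes (`1f 8b`), the `BCF` magic string at the start of a decompressed BCF stream, and the `##fileformat` line that opens a VCF header.

```python
import gzip
import os
import tempfile


def sniff_variant_format(path):
    """Guess 'bcf', 'vcf.gz', 'vcf', or 'unknown' from the file's leading bytes."""
    with open(path, "rb") as fh:
        is_gzip = fh.read(2) == b"\x1f\x8b"  # gzip/BGZF magic number
    opener = gzip.open if is_gzip else open
    with opener(path, "rb") as fh:
        magic = fh.read(5)
    if magic.startswith(b"BCF"):
        return "bcf"
    if magic.startswith(b"##fil"):
        return "vcf.gz" if is_gzip else "vcf"
    return "unknown"


# Demo with tiny synthetic files; the names are deliberately misleading
# to show that the extension is not what gets checked.
detected = {}
with tempfile.TemporaryDirectory() as tmp:
    payloads = {
        "looks_like_bcf.bcf": (True, b"##fileformat=VCFv4.2\n"),
        "looks_like_vcf.vcf.gz": (True, b"BCF\x02\x02" + b"\x00" * 8),
        "plain.txt": (False, b"##fileformat=VCFv4.2\n"),
    }
    for name, (compress, data) in payloads.items():
        path = os.path.join(tmp, name)
        writer = gzip.open if compress else open
        with writer(path, "wb") as out:
            out.write(data)
        detected[name] = sniff_variant_format(path)

print(detected)
```

The check reads only a few bytes, so it stays cheap even on large imputation outputs.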

abcdef-l commented 1 month ago

Hello, I would like to know how you eventually resolved this issue.