Open edg1983 opened 3 months ago
Can you check to see if you have enough disk space in /tmp to store the chunked results? I think we would have seen an error message earlier in the logs if an error occurred writing the temp files, but that's the only good explanation I have for why this would happen.
Otherwise, is there anything special about the variant immediately after the last one written to the output file? What operating system are you running this on and how did you install Minimac4?
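For the disk-space check, a minimal Python sketch is below. It only reports free space on the filesystem holding a given path; the /tmp default follows the question above, and the helper name `free_gib` is made up for illustration and is not part of Minimac4.

```python
import shutil


def free_gib(path="/tmp"):
    """Return the free space, in GiB, on the filesystem that holds `path`."""
    usage = shutil.disk_usage(path)  # named tuple: total, used, free (bytes)
    return usage.free / 2**30


if __name__ == "__main__":
    # /tmp is an assumption here; point this at wherever temp files are written.
    print(f"/tmp free: {free_gib('/tmp'):.1f} GiB")
```

Running it before and during an imputation job would show whether free space drops close to zero while the temp files are being written.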
Hi,
I don't think the issue is related to storage space. I see in the log files a message like "Writing temp files took 344 seconds"; hence, I assume that all temp files were written correctly.
I currently use Minimac4 on our HPC cluster, which runs on CentOS 8. We grabbed the pre-compiled executable provided with the release on GitHub. It has worked fine so far in all other tests; it is just the -a option that creates issues, apparently.
I'll check if I see anything strange in the last variant written to the file and the next one in the imputation ref panel.
Ok, if there is something strange, I'm guessing it will be in the next variant in your target VCF (as opposed to the reference VCF).
I'm guessing that this is happening because there is a target-only variant that has all of the genotypes missing for a batch of samples. This is a bug that I'll need to fix, though phasing software should impute such genotypes. Are you phasing your target VCF before imputing?
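To look for such variants in the target VCF, something like the following Python sketch might help. It is not Minimac4 code: the function names are hypothetical, the batch size of 200 samples is an assumption chosen for illustration (use whatever batch size your run actually uses), and it scans every target variant rather than only the target-only ones, so it reports a superset of the suspects. It also assumes GT is the first FORMAT field, as the VCF specification requires when GT is present.

```python
import gzip


def open_vcf(path):
    """Open a plain or gzip/bgzip-compressed VCF as text."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path)


def is_missing(sample_field):
    """True if every allele in the GT subfield is '.' (e.g. './.', '.|.', '.')."""
    gt = sample_field.split(":", 1)[0]
    return all(allele == "." for allele in gt.replace("|", "/").split("/"))


def all_missing_batches(lines, batch_size=200):
    """Yield (chrom, pos, batch_index) for each sample batch with no called genotype.

    batch_size=200 is an assumed value, not one confirmed in this thread.
    """
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta and header lines
        fields = line.rstrip("\n").split("\t")
        samples = fields[9:]
        for start in range(0, len(samples), batch_size):
            batch = samples[start:start + batch_size]
            if all(is_missing(s) for s in batch):
                yield fields[0], int(fields[1]), start // batch_size
```

Usage would be along the lines of `with open_vcf("target.vcf.gz") as fh: hits = list(all_missing_batches(fh))`, where `target.vcf.gz` is a placeholder file name. The positions reported can then be compared against the last variant written to the truncated output.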
Hi, I'm imputing VCF files from genotyping directly after QC without phasing them.
I'm now re-running the test with the -a option to check on the last written variant and the next one in the input VCF. I'll update you here as soon as this is done.
You will get very poor imputation results if you impute unphased genotypes (or if you impute with an unphased reference panel). Both input files should be phased.
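A rough way to tell whether a VCF is phased is to look at the genotype separators: in the VCF format, `|` marks a phased call and `/` an unphased one. The Python sketch below counts both; the function name is hypothetical, and haploid calls without a separator are counted as neither.

```python
def count_phasing(lines):
    """Return (phased, unphased) genotype counts from an iterable of VCF lines."""
    phased = unphased = 0
    for line in lines:
        if line.startswith("#"):
            continue  # skip meta and header lines
        for sample in line.rstrip("\n").split("\t")[9:]:
            gt = sample.split(":", 1)[0]  # GT is the first FORMAT subfield
            if "/" in gt:
                unphased += 1
            elif "|" in gt:
                phased += 1
    return phased, unphased
```

If the unphased count is large for either the target or the reference file, the file most likely has not been through a phasing step.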
I've tried with imputed genotypes, and I confirm this works fine with the -a option.
Hi,
I'm using minimac4 v4.1.3 to impute genotypes on a cohort of about 24k individuals.
Usually, I run one imputation job per chromosome. When I run the command with mostly default settings, it works fine and generates an output vcf.gz file containing the same number of variants as the reference panel file (as expected). See this example:
However, if I add the -a option, I get an error merging the temp files at the end of the imputation process. The resulting vcf.gz file is truncated and contains fewer variants than the reference panel.
This is the command
Here is the error from the log
Am I doing something wrong here? Thanks!
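For reference, the variant counts mentioned above can be obtained with a small Python sketch like the one below. It assumes the output is a gzip- or bgzip-compressed VCF readable with the standard `gzip` module; the function name is hypothetical. It returns the number of records, the position of the last complete record, and a flag that is set when the compressed stream or the final line ends prematurely.

```python
import gzip


def count_records(path):
    """Return (n_records, (chrom, pos) of last complete record, truncated flag)."""
    n = 0
    last = None
    truncated = False
    try:
        with gzip.open(path, "rt") as handle:
            for line in handle:
                if line.startswith("#"):
                    continue  # skip meta and header lines
                fields = line.split("\t", 2)
                if not line.endswith("\n") or len(fields) < 3:
                    truncated = True  # final line was cut off
                    break
                n += 1
                last = (fields[0], int(fields[1]))
    except EOFError:
        truncated = True  # compressed stream ended unexpectedly
    return n, last, truncated
```

The last position it reports is the variant after which the merged output stops, which is the place to start looking in the target VCF.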