Open lingjoyo opened 7 months ago
Which version are you running (minimac4 --version)?
Is it possible that you are running out of disk space to store the output files?
Hi Jonathonl, thanks for your reply.
It's minimac v4.1.6. The computing resources are:
storage: 16T free space, CPUs: 32, memory: 370G
Disk space shouldn't be the problem. So far, it has run fine on chr1 and chr2.
Can you provide the full log output?
Here is the log :
minimac v4.1.6
Imputing 11:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 1 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 2 seconds
Typed sites to imputed sites ratio: 0.00066783 (246/368357)
4426 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Completed 200 of 1401 samples
Completed 400 of 1401 samples
Completed 600 of 1401 samples
Completed 800 of 1401 samples
Completed 1000 of 1401 samples
Completed 1200 of 1401 samples
Completed 1400 of 1401 samples
Completed 1401 of 1401 samples
Running HMM took 392 seconds
Writing temp files took 49 seconds
Merging temp files ...
Error: I/O failed while merging
Error: failed merging temp files
I would try running with --temp-prefix c${chr}.tmp_ so that the temp files are written to the same directory as your output file.
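A minimal sketch of that suggestion, assuming a bash job script: derive the temp prefix from the output path so that both end up in the same directory. The helper name temp_prefix_for and the example path are hypothetical; only the --temp-prefix option itself comes from this thread.

```shell
#!/usr/bin/env bash
# Hypothetical helper: build a temp-file prefix that sits next to the output file.
temp_prefix_for() {
  local out_vcf=$1 chr=$2
  local out_dir
  out_dir=$(dirname -- "$out_vcf")   # directory part of the output path
  printf '%s/c%s.tmp_\n' "$out_dir" "$chr"
}

# Example with a made-up path; this only prints the prefix and does not run minimac4.
temp_prefix_for "results/c22.imputed.vcf.gz" 22
```

The result could then be passed along as --temp-prefix "$(temp_prefix_for "$out_vcf" "$chr")".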
It works well if I put everything into one folder:
./minimac4 1000g_phase3_v5.chr22.with_parameter_estimates.msav \
qc_3rd-updated-chr22.vcf.gz \
-o c22.imputed.vcf.gz \
--min-r2 0.3 --min-ratio 1e-6 \
--temp-prefix c22.tmp_
But it reports the merging error if I give absolute paths for all inputs and outputs:
${minimac4} \
${g1k_p3}1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav \
${wkdir}/1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz \
-o ${wkdir}1.3_imputaion_minimac4_g1kp3/c${chr}.imputed.vcf.gz \
--min-r2 0.3 --min-ratio 1e-6 \
--temp-prefix c${chr}.tmp_
Here is the log:
Imputing 22:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 1 seconds
Typed sites to imputed sites ratio: 1.53001e-05 (1/65359)
691 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Completed 200 of 1401 samples
Completed 400 of 1401 samples
Completed 600 of 1401 samples
Completed 800 of 1401 samples
Completed 1000 of 1401 samples
Completed 1200 of 1401 samples
Completed 1400 of 1401 samples
Completed 1401 of 1401 samples
Running HMM took 32 seconds
Writing temp files took 3 seconds
Merging temp files ...
Error: I/O failed while merging
Error: failed merging temp files
So the problem seems to be that the code couldn't find the temp files. When I set --temp-prefix ${wkdir}/1.2_preinputation_check/c${chr}.tmp_, it reported:
minimac v4.1.6
Imputing 22:1-20000000 ...
Loading target haplotypes ...
Loading target haplotypes took 0 seconds
Loading reference haplotypes ...
Loading reference haplotypes took 1 seconds
Typed sites to imputed sites ratio: 1.53001e-05 (1/65359)
691 variants are exclusive to target file and will be excluded from output
Running HMM with 1 threads ...
Error: could not open temp file (/full-path-to/1.3_imputaion_minimac4_g1kp3/c22.tmp_0_XXXXXX)
I guess the problem is in how the temp files are set up. What's the right way to set --temp-prefix if I want to submit the job using sbatch?
Relative vs absolute paths shouldn't matter. I'm guessing that the output paths are invalid or unreachable from the compute node. Are you creating the full directory paths before running minimac4 (i.e., does the /full-path-to/1.3_imputaion_minimac4_g1kp3/ directory already exist)? I would add a check to your batch script, before the minimac4 command, to confirm that you can create new files in the directory where you are writing output files. This would look something like:
set -e  # stop the script at the first failing command
out_vcf="${wkdir}/1.3_imputaion_minimac4_g1kp3/c${chr}.imputed.vcf.gz"
touch "$out_vcf"  # fails here if the output directory is missing or not writable
minimac4 -o "$out_vcf" \
"${g1k_p3}1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav" \
"${wkdir}/1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz" \
--min-r2 0.3 --min-ratio 1e-6
Note: you don't need to use absolute paths in Slurm as long as the directory you call sbatch from is accessible from the compute node.
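A slightly more explicit version of that pre-flight test could be written as below. This is only a sketch, assuming bash and a standard mktemp on the compute node; the function name check_writable_dir is hypothetical and not part of minimac4 or Slurm.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight check: succeed only if a new file can be created in a directory.
check_writable_dir() {
  local dir=$1
  local probe
  if [ ! -d "$dir" ]; then
    echo "missing directory: $dir" >&2
    return 1
  fi
  # mktemp creates a uniquely named file; failure means the directory is not writable.
  probe=$(mktemp "$dir/.write_test_XXXXXX") || {
    echo "cannot create files in: $dir" >&2
    return 1
  }
  rm -f -- "$probe"   # clean up the probe file
}

# Example: check the current directory before starting a long job.
if check_writable_dir .; then
  echo "ok: directory is writable"
fi
```

In a batch script this could be called on both the output directory and the directory part of the temp prefix, so the job fails immediately rather than after the HMM step has finished.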
Hi everyone
minimac4 runs well for some chromosomes, such as chr1 to chr10, but reports an error at the merging step starting from chr11:
Here is my code:
chr=11
minimac4 \
1000g_phase3_v5.chr${chr}.with_parameter_estimates.msav \
1.2_preinputation_check/qc_3rd-updated-chr${chr}.vcf.gz \
--min-ratio 1e-6 \
--threads 10 \
-o c${chr}.imputed.vcf.gz
Has anyone run into the same problem?