Closed honda-s691470 closed 2 weeks ago
I was able to successfully run the imputation job on the Michigan Imputation Server after removing the line:
plink --bfile ${OUTPUT_DIR}/qc_filtered_data-updated-chr${chr} --real-ref-alleles --recode vcf --out ${OUTPUT_DIR2}/qc_filtered_data-updated-chr${chr}
It appears that the duplicate use of --real-ref-alleles caused an issue in the VCF preparation process, possibly impacting the integrity of the output files. After removing this line, the server processed the job without any issues, and the imputation pipeline proceeded smoothly.
Thank you for the assistance. I am closing this issue as it is now resolved.
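The fix described above can be sketched as a loop that only sorts, compresses, and indexes VCFs that already exist, with the plink line removed. This is a minimal sketch, not the exact script used. It assumes that Run-plink.sh has already written per-chromosome files named qc_filtered_data-updated-chr${chr}.vcf into the input directory; the function name compress_and_index is hypothetical.

```shell
#!/usr/bin/env bash
# Sketch: compress and index per-chromosome VCFs without re-running plink.
# Assumption: Run-plink.sh has already written
#   <in_dir>/qc_filtered_data-updated-chr<N>.vcf
# (file naming taken from the commands in this issue).
compress_and_index() {
  in_dir=$1
  out_dir=$2
  mkdir -p "$out_dir" || return 1
  for chr in $(seq 1 22); do
    vcf="${in_dir}/qc_filtered_data-updated-chr${chr}.vcf"
    gz="${out_dir}/qc_filtered_data-updated-chr${chr}.vcf.gz"
    if [ ! -f "$vcf" ]; then
      # Report missing chromosomes instead of failing the whole loop.
      echo "skipping chr${chr}: ${vcf} not found" >&2
      continue
    fi
    # Sort by position, write bgzip-compressed output, then index it.
    bcftools sort -Oz -o "$gz" "$vcf" || return 1
    bcftools index "$gz" || return 1
  done
}
```

With the directories from the original script, it would be called as compress_and_index "$OUTPUT_DIR" "$OUTPUT_DIR2".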
I'm encountering a problem with the Michigan Imputation Server (v2.0.6). Although my VCF files pass all checkVCF.py checks without error, the imputation pipeline fails about a minute and a half into the run with the message: "Pipeline execution failed." Is there any indication from this setup that could explain the imputation failure, or would additional steps (such as VCF format adjustments) be recommended?
OUTPUT_DIR="/XXX"
mkdir -p $OUTPUT_DIR
plink --tfile $PLINK_INPUT \
  --geno 0.02 --mind 0.02 --maf 0.01 --hwe 0.0001 \
  --make-bed --out ${OUTPUT_DIR}/qc_filtered_data
plink --bfile ${OUTPUT_DIR}/qc_filtered_data --freq --out ${OUTPUT_DIR}/qc_filtered_data
wget https://www.chg.ox.ac.uk/~wrayner/tools/HRC-1000G-check-bim-v4.3.0.zip -P $OUTPUT_DIR
unzip -o $OUTPUT_DIR/HRC-1000G-check-bim-v4.3.0.zip -d $OUTPUT_DIR
perl ${OUTPUT_DIR}/HRC-1000G-check-bim.pl -b ${OUTPUT_DIR}/qc_filtered_data.bim \
  -f ${OUTPUT_DIR}/qc_filtered_data.frq \
  -r ${OUTPUT_DIR}/HRC.r1-1.GRCh37.wgs.mac5.sites.tab -h
cd $OUTPUT_DIR
bash Run-plink.sh
OUTPUT_DIR2="/YYY"
for chr in {1..22}; do
  plink --bfile ${OUTPUT_DIR}/qc_filtered_data-updated-chr${chr} --real-ref-alleles --recode vcf --out ${OUTPUT_DIR2}/qc_filtered_data-updated-chr${chr}
  bcftools sort ${OUTPUT_DIR2}/qc_filtered_data-updated-chr${chr}.vcf -Oz -o ${OUTPUT_DIR2}/qc_filtered_data-updated-chr${chr}.vcf.gz
  bcftools index ${OUTPUT_DIR2}/qc_filtered_data-updated-chr${chr}.vcf.gz
done
CheckVCF Results
The checkVCF.py output for each chromosome shows no critical errors:
$ ./checkVCF.sh
checkVCF.py -- check validity of VCF file for meta-analysis
version 1.4 (20140115)
contact zhanxw@umich.edu or dajiang@umich.edu for problems.
Python version is [ 2.7.17.final.0 ]
Begin checking vcfFile [ ./qc_filtered_data-updated-chr1.vcf.gz ]
[ 10000 ] lines processed
[ 20000 ] lines processed
[ 30000 ] lines processed
[ 40000 ] lines processed
--------------- REPORT ---------------
Total [ 49065 ] lines processed
Examine [ 7 ] VCF header lines, [ 49058 ] variant sites, [ 563 ] samples
[ 0 ] duplicated sites
[ 0 ] NonSNP site are outputted to [ ./check_chr1.log.check.nonSnp ]
[ 0 ] Inconsistent reference sites are outputted to [ ./check_chr1.log.check.ref ]
[ 0 ] Variant sites with invalid genotypes are outputted to [ ./check_chr1.log.check.geno ]
[ 7405 ] Alternative allele frequency > 0.5 sites are outputted to [ ./check_chr1.log.check.af ]
[ 0 ] Monomorphic sites are outputted to [ ./check_chr1.log.check.mono ]
--------------- ACTION ITEM ---------------
No error found by checkVCF.py, thank you for cleanning VCF file.
Upload these files to the ftp server (so we can double check):
./check_chr1.log.check.log ./check_chr1.log.check.dup ./check_chr1.log.check.noSnp ./check_chr1.log.check.ref ./check_chr1.log.check.geno ./check_chr1.log.check.af ./check_chr1.log.check.mono
Results of monitor
$ ./monitor_impute.sh
{"application":"Genotype Imputation 2.0.6","applicationId":"imputationserver2","deletedOn":-1,"endTime":1730886855536,"id":"job-20241106-045256-806","name":"job-20241106-045256-806","steps":[{"id":45834047,"name":"Input Validation","empty":false,"logMessages":[]},{"id":45834048,"name":"Quality Control","empty":true,"logMessages":[]},{"id":45834049,"name":"Phasing and Imputation","empty":true,"logMessages":[]},{"id":45834050,"name":"Summary","empty":false,"logMessages":[{"message":"Pipeline execution failed.","success":true,"time":1730886855562,"type":1},{"message":"Imputation job failed.","success":true,"time":1730886855564,"type":1}]}],"state":5,"positionInQueue":-1,"userAgent":"curl/7.61.1","startTime":1730886778171,"submittedOn":1730886778124,"outputParams":[{"id":3703838,"name":"output","files":[],"description":"Downloads","value":"","type":null,"download":true,"tree":[],"job":null,"jobId":"job-20241106-045256-806","autoExport":false,"hash":"FkZr16cS3eIhcA87H57762vfVl7CaoAnqWlEuXgi"}],"username":"shonda","logs":"","currentTime":1730888613370,"workspaceSize":null}
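The single-line JSON above is hard to read by eye, so a small helper can pull out the log messages and the run time. This is a rough sketch that uses only grep, sed, and cut; the function name summarize_job is hypothetical and not part of the Imputation Server tooling. It relies on plain text matching rather than a real JSON parser, and it assumes startTime and endTime are epoch timestamps in milliseconds, as they appear to be here.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of the Imputation Server tooling):
# read a job-status JSON document on stdin and print every log message
# plus the elapsed run time. Plain grep/sed text matching, not a JSON parser.
summarize_job() (
  json=$(cat)
  # Print each "message":"..." value on its own line.
  printf '%s\n' "$json" | grep -o '"message":"[^"]*"' \
    | sed -e 's/^"message":"//' -e 's/"$//'
  # startTime and endTime are assumed to be epoch timestamps in milliseconds.
  start=$(printf '%s\n' "$json" | grep -o '"startTime":[0-9]*' | head -n 1 | cut -d: -f2)
  end=$(printf '%s\n' "$json" | grep -o '"endTime":[0-9]*' | head -n 1 | cut -d: -f2)
  if [ -n "$start" ] && [ -n "$end" ]; then
    echo "elapsed_seconds=$(( (end - start) / 1000 ))"
  fi
)
```

Assuming monitor_impute.sh prints the JSON shown above, running ./monitor_impute.sh | summarize_job would print "Pipeline execution failed.", "Imputation job failed.", and elapsed_seconds=77. Where jq is installed, it would be a more robust choice than text matching.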