transbioZI / Gimpute

An efficient genetic data imputation pipeline

Xpar option missing for IMPUTE4 #3

Open eyherabh opened 4 years ago

eyherabh commented 4 years ago

The file R/phasingImpute4.R has the comment

## impute for chrX PAR >> with an additional flag: --Xpar.

However, the lines that follow it, which run IMPUTE4, do not set that flag:

    ## impute for chrX PAR >> with an additional flag: --Xpar.
    system(paste0(impute4,
    " -no_maf_align \ ",
    " -m ", GENMAP_FILE, " \ ",
    " -h ", HAPS.chrXPAR1, " \ ",
    " -l ", LEGEND.chrXPAR1, " \ ",
    " -g ", GWAS_HAPS_FILE, " \ ",
    " -Ne ", effectiveSize, " \ ",
    " -int ", chunkSTART, " ", chunkEND, " \ ",
    " -buffer 1000  \ ",
    " -o ", OUTPUT_FILE, " \ " ))
178-            } else if (i == "X_PAR2") {  

Could you please clarify whether the effectiveSize is being reduced elsewhere in order to mimic the effect of the missing flag as per IMPUTE2? Thanks

Junfang commented 4 years ago

Hi Eyherabh,

Sorry, that line was commented out. IMPUTE4 is not intended for imputing chromosome X, so we do not pass the '--Xpar' flag when running it. See also the IMPUTE4 usage document: https://www.dropbox.com/sh/k6b34fzw9w4s8bg/AAA65aF5l2oj_AT9iDLgKCv9a?dl=0&preview=impute4.1.2_usage.docx

Regarding the effective size, please check the description in the IMPUTE2 documentation (https://mathgen.stats.ox.ac.uk/impute/impute_v2.html), which states: "'Effective size' of the population (commonly denoted as Ne in the population genetics literature) from which your dataset was sampled. This parameter scales the recombination rates that IMPUTE2 uses to guide its model of linkage disequilibrium patterns. When most imputation runs were conducted with reference panels from HapMap Phase 2, we suggested values of 11418 for imputation from HapMap CEU, 17469 for YRI, and 14269 for CHB+JPT."

Best, Junfang

eyherabh commented 4 years ago

Thanks. I know that the impute4 manual does not list the -chrX flag. However, the manual does not inspire much confidence: it has typos that make its example non-working, and the -chrX flag is acknowledged when passed to impute4. The publication that introduced impute4 (https://www.nature.com/articles/s41586-018-0579-z) does not really clarify matters either. Its methods section states that PAR and non-PAR regions were, after some filtering, processed analogously to the autosomal regions, which were imputed with impute4. The manual for impute2 does not clarify whether -Xpar should be used in conjunction with -chrX or not. I thought it was the former, but I have actually found no difference between using -Xpar with or without -chrX and not using it at all, except for a note warning about known haplotypes not being changed.
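
A comparison like the one described above could be set up roughly as follows. This is only a sketch: buildImpute4Args is a hypothetical helper that is not part of Gimpute, the file names and the Ne value are placeholders, and whether IMPUTE4 actually honours -chrX or -Xpar is exactly the open question here. It assembles the same arguments as the call in R/phasingImpute4.R, adds optional extra flags, and prints the command lines rather than running them.

```r
## Hypothetical helper (not part of Gimpute): assemble the IMPUTE4 arguments
## used in R/phasingImpute4.R, plus optional extra flags for comparison runs.
buildImpute4Args <- function(genmap, haps, legend, gwasHaps, effectiveSize,
                             chunkStart, chunkEnd, output,
                             extraFlags = character(0)) {
    c("-no_maf_align",
      "-m", genmap,
      "-h", haps,
      "-l", legend,
      "-g", gwasHaps,
      "-Ne", format(effectiveSize, scientific = FALSE),
      "-int", format(chunkStart, scientific = FALSE),
              format(chunkEnd, scientific = FALSE),
      "-buffer", "1000",
      "-o", output,
      extraFlags)
}

## The three variants to compare: no extra flag, -chrX only, -chrX with -Xpar.
variants <- list(none = character(0),
                 chrX = "-chrX",
                 chrX_Xpar = c("-chrX", "-Xpar"))

## Placeholder inputs; replace with real files and a real Ne before running.
cmds <- lapply(names(variants), function(v) {
    buildImpute4Args("genmap.txt", "ref.hap", "ref.legend", "gwas.haps",
                     20000, 1, 5000000, paste0("chunk_", v, ".out"),
                     extraFlags = variants[[v]])
})
names(cmds) <- names(variants)

## Print the command lines instead of executing them.
for (v in names(cmds)) cat("impute4", cmds[[v]], "\n")

## To execute one variant (impute4 = path to the binary):
## system2(impute4, args = cmds[["chrX_Xpar"]])
```

Passing the arguments as a character vector to system2 avoids the manual spacing and backslashes of the paste0 call, and writing each variant to its own output file makes it straightforward to diff the results afterwards.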