rgcgithub / regenie

regenie is a C++ program for whole genome regression modelling of large genome-wide association studies.
https://rgcgithub.github.io/regenie

Conditional analysis in step1 #440

Closed IoannaTach closed 1 year ago

IoannaTach commented 1 year ago

I would like to run a burden test while conditioning on nearby variants. My understanding is that I need to repeat step 1 with the variants to condition on added as covariates. Because these variants are not in the PLINK file I use for the kinship matrix in step 1, I need to provide another PLINK file containing them via the "--condition-file" option. The command I used is below, but it fails with the error "ERROR: invalid option input for --condition-file".

regenie \
  --step 1 \
  --bed ${myinputpath}/ForKinship${cohort}${ethnicity}_forIBD \
  --covarFile /projects/cgr/workspaces/Continuoustraits/Covariates${cohort}_${ethnicity}.txt \
  --phenoFile /projects/cgr/workspaces/Continuoustraits/Phenotypes${cohort}_${ethnicity}.txt \
  --phenoColList ${biomarker} \
  --covarColList Sex,Age,Batch,PC_Unrelated_1,PC_Unrelated_2,PC_Unrelated_3,PC_Unrelated_4 \
  --catCovarList Sex \
  --bsize 1000 \
  --loocv \
  --qt \
  --lowmem \
  --condition-list /projects/cgr/workspaces/Continuous_traits/burdenconditionaanalyses${condition}.txt \
  --condition-file ${myoutputpath}/${condition} \
  --lowmem-prefix ${myoutputpath}/tmp_rg \
  --out ${myoutputpath}/fit_bin_out_portfolio

However, the PLINK files I passed to "--condition-file" do exist:

ls -alh ${myoutputpath}/${condition}.*
-rw-rw---- 1 kvrf354 xem-scp-cgr-workspace-ackdportrenbiomark2023 1.1M Aug 21 15:54 /projects/cgr/workspaces/burden_ANXA9_ConditionedOn_RORC/ANXA9_ConditionedOn_RORC.bed
-rw-rw---- 1 kvrf354 xem-scp-cgr-workspace-ackdportrenbiomark2023  378 Aug 21 15:54 /projects/cgr/workspaces/burden_ANXA9_ConditionedOn_RORC/ANXA9_ConditionedOn_RORC.bim
-rw-rw---- 1 kvrf354 xem-scp-cgr-workspace-ackdportrenbiomark2023 9.5M Aug 21 15:54 /projects/cgr/workspaces/burden_ANXA9_ConditionedOn_RORC/ANXA9_ConditionedOn_RORC.fam

Any ideas please?

joellembatchou commented 1 year ago

Hi,

Please check the documentation:

[image: screenshot of the documentation]

So your call should have e.g. --condition-file bed,${myoutputpath}/${condition}.

Cheers, Joelle
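As a minimal sketch of the fix described above, the snippet below builds the FORMAT,FILE argument for --condition-file from a PLINK prefix and warns if any file of the fileset is missing. The default values for myoutputpath and condition are hypothetical placeholders, and the snippet only prints the option rather than running regenie.

```shell
# Hypothetical defaults; replace with the real output path and fileset name.
cond_prefix="${myoutputpath:-.}/${condition:-ANXA9_ConditionedOn_RORC}"

# --condition-file expects FORMAT,FILE, so the PLINK prefix needs a leading "bed,".
cond_arg="bed,${cond_prefix}"

# Warn about any missing member of the PLINK fileset before launching regenie.
for ext in bed bim fam; do
  [ -f "${cond_prefix}.${ext}" ] || echo "missing: ${cond_prefix}.${ext}" >&2
done

# Print the option as it would appear in the regenie call.
echo "--condition-file ${cond_arg}"
```

The prefix is given without the .bed/.bim/.fam extension, as in the original command.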

alyssacl commented 1 year ago

Hi, I am also attempting a conditional analysis, and I'm wondering what the format of the condition file should be. I currently just list the variants I want to condition on, as CHR:SNP:A1:A2, in a .txt file, and the job fails. I tried adding "bed," before the file name (below) and got the following error.

Options in effect:
  --step 2 \
  --out snp485991.conditioned484453.dose.assoc.c6 \
  --bgen ukb21007_c6_b0_v1.bgen \
  --ref-first \
  --sample ukb21007_c6_b0_v1.sample \
  --chr 6 \
  --range 6:475991-495991 \
  --condition-list bed,condition.list.WM.txt \
  --phenoFile WM_LPL_phenotype_AUG2023.txt \
  --covarFile WM_LPL_covariate_AUG2023.txt \
  --remove exclude_participants.txt \
  --bt \
  --approx \
  --firth-se \
  --firth \
  --pred WM38_results_pred.list \
  --bsize 1000 \
  --pThresh 0.01 \
  --minMAC 1 \
  --threads 128 \
  --gz

ERROR: bed,condition.list.WM.txt doesn't exist for option --condition-list

Thanks for your help. Alyssa

joellembatchou commented 1 year ago

Hi Alyssa,

The option should be --condition-list condition.list.WM.txt, without the "bed," prefix. If these variants are not in "ukb21007_c6_b0_v1.bgen", then you can use --condition-file to specify the genotype file containing these variants.

Cheers, Joelle
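The distinction between the two options can be sketched as follows: --condition-list always takes a plain file path, while --condition-file is added only when the conditioning variants live in a separate genotype file. The variable cond_geno is a hypothetical name used for illustration, and the snippet prints the options instead of invoking regenie.

```shell
# Variant list: a plain path, with no format prefix.
cond_list="condition.list.WM.txt"

# Hypothetical: leave empty if the variants are already in the main genotype file,
# otherwise set to FORMAT,FILE (for example "bed,/path/to/prefix").
cond_geno=""

# Collect the conditional-analysis options as positional parameters.
set -- --condition-list "$cond_list"
if [ -n "$cond_geno" ]; then
  set -- "$@" --condition-file "$cond_geno"
fi

# Print the options to append to the regenie call.
echo "$@"
```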