error: Error in getopt(spec = spec, opt = args) : long flag "bedFile" is invalid

lucavd commented 1 year ago

Hi, I've seen this issue posted in the former repo with no reply. Here's my code and error. Any idea how to solve it? The same path works fine for step 1. Thank you.

(base) luca@UP-031:~/GWAS$ docker run -v $(pwd):$(pwd) -w $(pwd) -i -t wzhou88/saige.survival:0.45 step2_SPAtests.R \
        --bedFile=/GWAS/UBEP_GWAS/data_raw/omni_mind_geno_clean3_maf_0_05.bed \
        --bimFile=/GWAS/UBEP_GWAS/data_raw/omni_mind_geno_clean3_maf_0_05.bim \
        --famFile=/GWAS/UBEP_GWAS/data_raw/omni_mind_geno_clean3_maf_0_05.fam \
        --SAIGEOutputFile=./UBEP_GWAS/data_raw/omni_mind_geno_clean3_maf_0_05_RES.txt \
        --GMMATmodelFile=./UBEP_GWAS/data_raw/GWAS_2023_GC_PC1_12_Frame_0_05.rda \
        --varianceRatioFile=./UBEP_GWAS/data_raw/GWAS_2023_GC_PC1_12_Frame_0_05.varianceRatio.txt
R version 3.6.3 (2020-02-29)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS

Matrix products: default
BLAS:   /usr/lib/x86_64-linux-gnu/openblas-pthread/libblas.so.3
LAPACK: /usr/lib/x86_64-linux-gnu/openblas-pthread/liblapack.so.3

locale:
 [1] LC_CTYPE=C.UTF-8       LC_NUMERIC=C           LC_TIME=C.UTF-8
 [4] LC_COLLATE=C.UTF-8     LC_MONETARY=C.UTF-8    LC_MESSAGES=C.UTF-8
 [7] LC_PAPER=C.UTF-8       LC_NAME=C              LC_ADDRESS=C
[10] LC_TELEPHONE=C         LC_MEASUREMENT=C.UTF-8 LC_IDENTIFICATION=C

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base

other attached packages:
[1] GATE_0.45

loaded via a namespace (and not attached):
[1] compiler_3.6.3     Matrix_1.5-1       Rcpp_1.0.7         grid_3.6.3
[5] RcppParallel_5.1.5 lattice_0.20-40
Usage: /usr/local/bin/step2_SPAtests.R [options]

**/usr/local/bin/step2_SPAtests.R: error: Error in getopt(spec = spec, opt = args) : long flag "bedFile" is invalid**

kscott-1 commented 1 year ago

Hi @lucavd, it seems you are using a docker container from the old unstable repo SAIGE instead of this repo, which has the stable release chain.

Previously, as per https://github.com/weizhouUMICH/SAIGE/blob/a41727267cb5f843a4446e4d4809cafc72687a5d/extdata/step2_SPAtests.R#L19 there was no support in step 2 for plink files as input files. That support now exists in the current stable release.

Until you update your container to the stable version, you will continue to get that issue.
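One quick way to confirm which inputs a given image's step 2 accepts is to search its help text for the flag. The sketch below assumes the script prints its options when called with `--help` (the usage line in the error suggests an optparse-style parser). `has_flag` is a hypothetical helper name, not part of SAIGE, and the sample help text is made up for the demo.

```shell
#!/bin/sh
# has_flag NAME: hypothetical helper (not part of SAIGE).
# Reads help text on stdin; exits 0 if the long option --NAME is listed.
has_flag() {
    grep -Eq -e "--$1([^[:alnum:]_-]|\$)"
}

# Self-contained demo on a made-up help excerpt (not real SAIGE output).
sample_help='  --vcfFile=VCFFILE
  --vcfFileIndex=VCFFILEINDEX'

if printf '%s\n' "$sample_help" | has_flag bedFile; then
    echo "bedFile: supported"
else
    echo "bedFile: not supported"
fi

# Against a real image (assumes --help works in that image):
#   docker run --rm wzhou88/saige.survival:0.45 step2_SPAtests.R --help 2>&1 | has_flag bedFile
```

If the check fails for `bedFile`, the image predates plink support in step 2, which matches the getopt error above.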

I suggest you check out #100 which is my PR with docker updates, or you can create a new container from the repo as is.

Good luck!

lucavd commented 1 year ago

Thank you @kscott-1

I was partly aware of that: this is the Docker image for SAIGE-GATE (the latest image version is 0.45), and I had noticed that it ships an old version of step 2.

Do you recommend converting the files to another format before step 2?

L

kscott-1 commented 1 year ago

If you are set on using your current docker set up, I would convert your input files to vcfs. I wouldn't recommend that though, since it appears lots of updates have happened to SAIGE since 0.45. I can't speak too much on that since I'm not the developer or the maintainer (@weizhouUMICH), but the best choice (I assume) would be updating your container to 1.x and using your input files as is.
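For the VCF route, a minimal sketch of the conversion could look like the following. It assumes PLINK 1.9 (`plink`) and `tabix` are installed, and that the within-family IDs in the `.fam` file are the sample IDs the step 1 model was fitted with (hence `vcf-iid`). The `run` and `convert_to_vcf` helpers are hypothetical names; by default the script only prints the commands.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (needs plink 1.9 and tabix on PATH).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# convert_to_vcf PREFIX: PREFIX.bed/.bim/.fam -> PREFIX.vcf.gz + PREFIX.vcf.gz.tbi
convert_to_vcf() {
    prefix=$1
    # vcf-iid: use within-family IDs as VCF sample IDs (assumption, see above).
    # bgz: block-gzip the output so tabix can index it.
    run plink --bfile "$prefix" --recode vcf-iid bgz --out "$prefix"
    # Build the tabix index next to the VCF.
    run tabix -p vcf "$prefix.vcf.gz"
}

convert_to_vcf "${1:-omni_mind_geno_clean3_maf_0_05}"
```

The older step 2 script should then take these files through its VCF options (`--vcfFile`, `--vcfFileIndex`); check the image's `--help` output for the exact set it expects.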

lucavd commented 1 year ago

@kscott-1 @weizhouUMICH @weizhou0

I opened a cross issue in GATE. https://github.com/weizhou0/GATE/issues/22

I tried to use step_1 in GATE:0.45 (works) and step_2 in SAIGE:1.2.0 since it should be the same as in GATE but with no luck.

I don't understand (sorry, my bad) the rationale for requiring plink files in step 1 but a different format in step 2, which depends on step 1 (even in the former SAIGE version).

kscott-1 commented 1 year ago

It's unlikely you'll be using the same input files in step 1 and step 2. Step 1 is fitting a Null GLMM and requires a low number of independent (pruned) variants. Step 2 is for the actual association analysis, which is likely to be a much larger file.
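As a rough illustration of preparing a pruned set for step 1, the sketch below uses PLINK 1.9 LD pruning. The window size, step, and r² threshold (50, 5, 0.2) are illustrative values, not a SAIGE recommendation, and `run` and `prune_for_step1` are hypothetical helper names. By default the commands are only printed.

```shell
#!/bin/sh
# Dry-run by default: commands are printed, not executed.
# Set DRY_RUN=0 to actually run them (needs plink 1.9 on PATH).
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}

# prune_for_step1 SRC_PREFIX DST_PREFIX
prune_for_step1() {
    src=$1
    dst=$2
    # 1) List approximately independent variants:
    #    50-variant window, shifted 5 variants at a time, pairwise r^2 below 0.2.
    run plink --bfile "$src" --indep-pairwise 50 5 0.2 --out "$dst"
    # 2) Write a new bed/bim/fam set containing only the retained variants.
    run plink --bfile "$src" --extract "$dst.prune.in" --make-bed --out "$dst"
}

prune_for_step1 "${1:-full_dataset}" "${2:-step1_pruned}"
```

The resulting `step1_pruned` fileset would be the step 1 input, while the full dataset stays as the step 2 input.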

lucavd commented 1 year ago

Thank you! I got it working by converting the plink files to VCF plus its index file, but your reply made me doubt how I ran step 1. This is probably a naive question, but for the null model, should I randomly select patients/chromosomes/variants, as indicated as optional here, also for binary traits?

I.e., I don't have to use the plink files with the full dataset in step 1, only a randomly selected subset, correct? If so, how do you suggest doing it?

PS: I promise that when all works I'll make a foolproof wiki and give it to you...

kscott-1 commented 1 year ago

I'd suggest taking a look at this issue https://github.com/weizhouUMICH/SAIGE/issues/92 from a while back. I hope that gives you the answer you're looking for.

alldayscientess commented 1 year ago

Hello @kscott-1 , I'm trying to use plink files for STEP2 and saw your note about this being possible in the stable version of SAIGE.

"Hi @lucavd, it seems you are using a docker container from the old unstable repo SAIGE instead of this repo, which has the stable release chain."

I opened the step2_SPAtests.R file and there is no reference to plink files. Please correct me if I completely missed something. I hope that's true.

kscott-1 commented 1 year ago

> Hello @kscott-1 , I'm trying to use plink files for STEP2 and saw your note about this being possible in the stable version of SAIGE.
>
> "Hi @lucavd, it seems you are using a docker container from the old unstable repo SAIGE instead of this repo, which has the stable release chain."
>
> I opened the step2_SPAtests.R file and there is no reference to plink files. Please correct me if I completely missed something. I hope that's true.

The files in this repo (saigegit) are updated and stable. You have linked the files from the old unstable repo as well.

You are looking for this file: step2_SPAtests.R

alldayscientess commented 1 year ago

Thank you @kscott-1, I see it now! (I also opened a new issue on this; I will close it with what you said.)


error: Error in getopt(spec = spec, opt = args) : long flag "bedFile" is invalid #111