omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

polypred.py: error: unrecognized arguments #109

Closed mkoromina closed 2 years ago

mkoromina commented 2 years ago

Hi @omerwe,

This is a follow-up to my comment in issue #108, but I decided to post it separately to keep things neat. I am executing step 3 of the wiki:

python /path/to/polypred.py \
    --combine-betas \
    --betas /path/to/other/method/effect_sizes/stats.gz, /path/to/polyfun.agg.txt.gz \
    --pheno /path/to/pheno.txt  \
    --output-prefix polypred_combined_effects \
    --plink-exe ~/plink/plink \
    /path/to/cohort.bed

However, this crashes with the following error: polypred.py unrecognized argument: /path/to/cohort.bed. It seems to me that this is something very simple that could easily be fixed. Could you kindly let me know what could be wrong here?

With my best wishes, Maria

omerwe commented 2 years ago

Hi @mkoromina,

I think this could be because you have a space after the comma in the argument of --betas. Can you please retry without the space?
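To illustrate why the space matters: the shell splits unquoted text on whitespace, so a space after the comma turns the single --betas value into two separate arguments. The sketch below only counts arguments; `count_args` is a hypothetical helper, not part of PolyFun, and the paths are placeholders.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# With a space after the comma, the second path becomes its own argument.
with_space=$(count_args --betas /path/to/stats.gz, /path/to/polyfun.agg.txt.gz)

# Without the space, both paths stay in a single comma-separated argument.
without_space=$(count_args --betas /path/to/stats.gz,/path/to/polyfun.agg.txt.gz)

echo "with space:    $with_space arguments"
echo "without space: $without_space arguments"
```

In the first case the second path is no longer part of the --betas value, which would be consistent with the parser later rejecting the .bed file as an extra argument.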

mkoromina commented 2 years ago

Hi @omerwe ,

Many thanks for your quick reply. I tried this, but now I get a different error message: polypred.py: error: argument --plink-exe: expected one argument. Any ideas on what could be wrong here? I think it is not happy with having two arguments straight after the --plink-exe flag. So, is the .bed file being mistaken for an argument to --plink-exe rather than a separate argument?

Thanks a lot in advance, Maria

mkoromina commented 2 years ago

Brief update: the issue seems to have come from my attempt to write a loop iterating over several cohorts, each with its own stats (in the --betas argument), .fam file and .bed file. As soon as I restricted the run to a single cohort, that issue was gone. However, there is a problem with my --pheno flag, as I am passing a .fam file, which has no header. Is it possible to amend the script to accept .fam files, or should we edit them to add headers?

Many thanks Omer!

omerwe commented 2 years ago

@mkoromina, can you please post side-by-side the exact command that does work for you (as you copy-paste it from the wiki) and the one that doesn't?

mkoromina commented 2 years ago

Sure @omerwe !

So, the one that did not work for me:

for i in cohort1 cohort2 cohort3 cohort4  #name of cohorts for which I wish to run polypred on
do
python /path/to/polypred.py \
    --combine-betas \
    --betas /path/to/LOO/no"$i".gz,/path/to/polyfun.agg.txt.gz \     #using LOO stats for each cohort I wish to iterate the command on and polyfun ones
    --pheno /path/to/"$i".fam \
    --output-prefix /path/to/output/polypred \
    --plink-exe ~/plink \
        /path/to/"$i".bed
done

And the one that did work but gave an error for the .fam file in the --pheno flag (it has no header so it cannot locate the FID field):

python /path/to/polypred.py \
    --combine-betas \
    --betas /path/to/LOO/no_cohort1.gz,/path/to/polyfun.agg.txt.gz \
    --pheno /path/to/cohort1.fam \
    --output-prefix /path/to/output/polypred \
    --plink-exe ~/plink \
        /path/to/cohort1.bed

If there is any other information that I need to provide, please do let me know. Many thanks, Maria

omerwe commented 2 years ago

@mkoromina, I'd rather require headers in the pheno files to avoid ambiguities in the column definition. You can easily add your own headers.
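As a rough sketch of adding headers, a PLINK .fam file has six whitespace-separated columns (family ID, individual ID, father, mother, sex, phenotype), so a headered phenotype file can be derived from it with awk. Only the FID field is mentioned in this thread; the IID and PHENO column names, the three-column layout, the file names, and the toy data below are all assumptions, so please check the wiki for the exact format polypred.py expects.

```shell
#!/usr/bin/env bash
workdir=$(mktemp -d)
fam="$workdir/cohort1.fam"          # hypothetical input file
pheno="$workdir/cohort1.pheno.txt"  # hypothetical output file

# Toy .fam content, made up purely for illustration.
printf 'fam1 ind1 0 0 1 1.5\nfam2 ind2 0 0 2 -0.3\n' > "$fam"

# Write a header line, then columns 1 (FID), 2 (IID) and 6 (phenotype).
# The IID and PHENO header names are assumptions.
{
    printf 'FID\tIID\tPHENO\n'
    awk 'BEGIN { OFS = "\t" } { print $1, $2, $6 }' "$fam"
} > "$pheno"

cat "$pheno"
```

Phenotype values are copied through unchanged, including any missing-value codes present in the .fam file.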

I'm pretty baffled by why the loop fails for you while the individual command doesn't. I suspect it's somehow related to how bash processes for loops, rather than to PolyFun itself. If you only have four cohorts, I guess you could just copy-paste the same command 4 times and change some fields each time to point to the correct cohort.
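One possible cause, assuming the inline comment on the --betas line was part of the script that was actually run (rather than added only for this post): a backslash continues a line only when it is the very last character. A backslash followed by spaces and a comment instead escapes one space and ends the command there, so the remaining lines are no longer part of it. The sketch below shows the effect using a hypothetical `count_args` helper and placeholder file names.

```shell
#!/usr/bin/env bash
# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

# Backslash is the last character on each line: all five words are passed.
continued=$(count_args --combine-betas \
    --betas a.gz,b.gz \
    --pheno c.fam)

# Backslash followed by spaces and a comment: the backslash escapes a single
# space (which becomes an extra argument) and the command ends on this line.
broken=$(count_args --combine-betas \
    --betas a.gz,b.gz \     # inline comment after the backslash
)

echo "continued: $continued arguments"
echo "broken:    $broken arguments"
```

In a real script, the lines after the broken continuation would then be run as separate commands, so moving comments onto their own lines above the command avoids the problem.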

If you want a more systematic approach, you could try adding echo before the command python (inside the loop) to see the command that bash is trying to run in each loop iteration (i.e. echo python instead of python). Then you could copy-paste the command that bash is trying to run and try to run it manually. This will hopefully help you figure out what's going on in the loop. I hope it's clear, please let me know if not!
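The echo suggestion can be sketched as a dry run like the one below. It only prints the commands, so it runs without PolyFun installed; the cohort names and paths are the placeholders used earlier in this thread, and wrapping the loop in a `dry_run` function is just a convenience for this example.

```shell
#!/usr/bin/env bash
# Print each polypred.py command instead of executing it.
dry_run() {
    for i in cohort1 cohort2 cohort3 cohort4; do
        # Comments are kept on their own lines so that every backslash
        # remains the last character on its line.
        echo python /path/to/polypred.py \
            --combine-betas \
            --betas /path/to/LOO/no"$i".gz,/path/to/polyfun.agg.txt.gz \
            --pheno /path/to/"$i".fam \
            --output-prefix /path/to/output/polypred \
            --plink-exe ~/plink \
            /path/to/"$i".bed
    done
}

dry_run
```

Each iteration should print one complete command on a single line; a command that appears cut short points to a broken line continuation. Running the script with `bash -x` gives a similar trace while still executing the commands.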

mkoromina commented 2 years ago

Hi @omerwe, many thanks for this! I will try your recommendation and get back to you. Would it be okay to email you a few questions about the theoretical background behind PolyPred? If not, no worries: I am quite happy to post them in this thread!

Thanks once again, Maria

omerwe commented 2 years ago

@mkoromina, sure, you can send an email to oweissbrod@hsph.harvard.edu.