omerwe / polyfun

PolyFun (POLYgenic FUNctionally-informed fine-mapping)
MIT License

AttributeError: 'DataFrame' object has no attribute 'ukb' #53

Closed: jdblischak closed this issue 3 years ago

jdblischak commented 3 years ago

I am attempting to run the step "3. Compute LD-scores for each SNP bin" of PolyFun approach 3, "Computing prior causal probabilities non-parametrically". My code is essentially the same as the example. The only difference is that I have added the flag --allow-missing:

mkdir -p LD_cache
python polyfun.py \
    --compute-ldscores \
    --output-prefix output/testrun \
    --ld-ukb \
    --ld-dir LD_cache \
    --chr 1 \
    --allow-missing

When I run this code with the current HEAD commit df5c0723cb4efb892cb3d0d4df72efa113fbc5a6, I get the following error message:

Traceback (most recent call last):
  File "bin/polyfun/polyfun.py", line 848, in <module>
    polyfun_obj.polyfun_main(args)
  File "bin/polyfun/polyfun.py", line 776, in polyfun_main
    self.compute_ld_scores(args)
  File "bin/polyfun/polyfun.py", line 637, in compute_ld_scores
    df_ldscores_chr = compute_ldscores_chr(df_bins_chr, ld_dir)
  File "/redacted/bin/polyfun/compute_ldscores_from_ld.py", line 193, in compute_ldscores_chr
    if args.ukb:
  File "/redacted/mambaforge/envs/polyfun/lib/python3.7/site-packages/pandas/core/generic.py", line 5179, in __getattr__
    return object.__getattribute__(self, name)
AttributeError: 'DataFrame' object has no attribute 'ukb'

The issue appears to be that polyfun.py passes a pandas DataFrame:

https://github.com/omerwe/polyfun/blob/df5c0723cb4efb892cb3d0d4df72efa113fbc5a6/polyfun.py#L637

when compute_ldscores_chr() expects args to be the first argument:

https://github.com/omerwe/polyfun/blob/df5c0723cb4efb892cb3d0d4df72efa113fbc5a6/compute_ldscores_from_ld.py#L190
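The mismatch can be reproduced in isolation. The following is a minimal sketch, not the PolyFun code itself: the function body is a stand-in that keeps only the post-refactor argument order and the args.ukb check, and the DataFrame columns are made up for illustration.

```python
import argparse

import pandas as pd


def compute_ldscores_chr(args, df_annot_chr):
    # Stand-in with the post-refactor signature: args comes first.
    # The real function does far more; only the attribute check is kept.
    if args.ukb:
        return "ukb"
    return "other"


# Hypothetical bin annotations; the column names are illustrative only.
df_bins_chr = pd.DataFrame({"SNP": ["rs1", "rs2"], "bin_0": [1, 0]})
ld_dir = "LD_cache"

# Old-style call, as in polyfun.py: the DataFrame lands in the args slot,
# so the lookup of .ukb is performed on the DataFrame and fails.
try:
    compute_ldscores_chr(df_bins_chr, ld_dir)
except AttributeError as err:
    error_message = str(err)
    print(error_message)

# New-style call: an argparse namespace first, then the DataFrame.
args = argparse.Namespace(ukb=True)
result = compute_ldscores_chr(args, df_bins_chr)
```

The first call fails because pandas falls back to a normal attribute lookup when the name is not a column, which is the last frame in the traceback above.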

I was going to attempt to fix this by passing args to compute_ldscores_chr() and replacing args.ukb with args.ld_ukb. However, I noticed that args.ukb is valid when invoking compute_ldscores_from_ld.py directly, as documented in "Computing LD-scores with pre-computed UK Biobank LD matrices":

https://github.com/omerwe/polyfun/blob/df5c0723cb4efb892cb3d0d4df72efa113fbc5a6/compute_ldscores_from_ld.py#L302

Thus it's unclear to me how best to update compute_ldscores_chr() to support both of these use cases. In case it could be helpful, I traced the function signature change from compute_ldscores_chr(df_annot_chr, ld_dir, no_cache=False) to compute_ldscores_chr(args, df_annot_chr) in commit b6674ce4e800fd9188d9dfe3ce9727a7926547c4, which is the same commit where compute_ldscores_ukb.py was renamed to compute_ldscores_from_ld.py.
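One possible way to support both entry points is to look up either attribute name with a default. This is only a sketch under assumptions: the helper name uses_ukb_ld is hypothetical, the namespaces below are hand-built rather than produced by the real parsers, and this is not necessarily how the bug was eventually fixed.

```python
import argparse


def uses_ukb_ld(args):
    """Return True if either spelling of the UK Biobank flag is set.

    Hypothetical helper: getattr with a default avoids an AttributeError
    when the namespace defines only one of the two attributes.
    """
    return bool(getattr(args, "ukb", False) or getattr(args, "ld_ukb", False))


# Namespace shaped like the one from compute_ldscores_from_ld.py (args.ukb).
standalone_args = argparse.Namespace(ukb=True)

# Namespace shaped like the one from polyfun.py (--ld-ukb becomes args.ld_ukb,
# since argparse replaces hyphens with underscores in attribute names).
polyfun_args = argparse.Namespace(ld_ukb=True, ld_dir="LD_cache")

# Namespace with neither flag set.
neither_args = argparse.Namespace(ld_dir="LD_cache")

for label, ns in [("standalone", standalone_args),
                  ("polyfun", polyfun_args),
                  ("neither", neither_args)]:
    print(label, uses_ukb_ld(ns))
```

The trade-off is that the function then silently tolerates a missing attribute, so a misspelled flag name would go unnoticed. Giving both parsers the same attribute name would be the stricter alternative.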

jdblischak commented 3 years ago

Update: I was able to run this step using the code from commit 5ce89e09e242fc436fb4215ab02975c40a282110.

omerwe commented 3 years ago

Thanks John, great catch! (and crazy coincidence...). I fixed the bug, can you please confirm that it's fixed (and reopen if not)?

jdblischak commented 3 years ago

Confirmed. Thanks for the quick fix!

jdblischak commented 3 years ago

For future reference, the bug was fixed in commit d85bd8b26e2fd5b9172b71cfe46f87cd8b1c0ae3.