Cloufield / gwaslab

A Python package for handling and visualizing GWAS summary statistics. https://cloufield.github.io/gwaslab/
GNU General Public License v3.0

Error with .plot_daf() function #49

Closed. swvanderlaan closed this issue 1 year ago

swvanderlaan commented 1 year ago

I first ran the .check_af() function:

sumstats_sample.check_af(
    ref_infer=gl.get_path("1kg_eur_hg19"),
    ref_alt_freq="AF",
    n_cores=8,
)

The log is like this:

Fri Aug 11 16:02:04 2023 Start to check the difference between EAF and refence vcf alt frequency ...
Fri Aug 11 16:02:04 2023  -Current Dataframe shape : 99998  x  23
Fri Aug 11 16:02:04 2023  -Reference vcf file: /Users/username/.gwaslab/EUR.ALL.split_norm_af.1kgp3v5.hg19.vcf.gz
Fri Aug 11 16:02:04 2023  -CPU Cores to use : 8
Fri Aug 11 16:02:04 2023  -Checking prefix for chromosomes in vcf files...
Fri Aug 11 16:02:05 2023  -No prefix for chromosomes in the VCF files.
Fri Aug 11 16:02:05 2023  -Alternative allele frequency in INFO: AF
Fri Aug 11 16:02:05 2023  -Checking variants: 99998
Fri Aug 11 16:02:28 2023  - DAF min: 0.7962059881538153
Fri Aug 11 16:02:28 2023  - DAF max: -0.8034360408782959
Fri Aug 11 16:02:28 2023  - DAF sd: 0.06905778226683758
Fri Aug 11 16:02:28 2023  - abs(DAF) min: 3.948807716369629e-07
Fri Aug 11 16:02:28 2023  - abs(DAF) max: 0.8034360408782959
Fri Aug 11 16:02:28 2023  - abs(DAF) sd: 0.05518578187132032
Fri Aug 11 16:02:28 2023 Finished allele frequency checking!
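For reference, the summary lines in this log can be recomputed by hand from the DAF column. The sketch below assumes DAF is the sumstats EAF minus the reference ALT frequency, as the first log line suggests, and that NaN marks a variant with no usable reference frequency. `daf_summary` is a hypothetical helper, not part of gwaslab, and the values are the DAF entries visible in the data preview in this post. Note that the values printed under "DAF min" and "DAF max" in the log above look swapped.

```python
import math
import statistics

def daf_summary(daf_values):
    """Summarise DAF values (hypothetical helper, not a gwaslab function).

    NaN entries are skipped; they are assumed to be variants for which
    no reference allele frequency was available.
    """
    valid = [d for d in daf_values if not math.isnan(d)]
    abs_valid = [abs(d) for d in valid]
    return {
        "min": min(valid),
        "max": max(valid),
        # Sample standard deviation; the library may use another convention.
        "sd": statistics.stdev(valid),
        "abs_min": min(abs_valid),
        "abs_max": max(abs_valid),
    }

# DAF values copied from the rows shown in this post.
daf = [0.168286, 0.030057, -0.122710, 0.022689, 0.178724, float("nan")]
summary = daf_summary(daf)
print(summary["min"], summary["max"])
```

With a correctly labelled summary, the minimum is the most negative difference and the maximum the most positive one.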

My data now looks like this:

    SNPID   rsID    CHR POS EA  NEA EAF BETA    SE  P   ... MinFreq MaxFreq HetISq  HetChiSq    HetDf   HetPval ncohorts    AAsum   EAsum   DAF
0   1:745021    rs2427898   1   745021  T   G   0.2160  -0.1511 0.0916  0.09897 ... 0.0181  0.3244  31.0    8.691   6   0.19170 7   4   3   0.168286
1   1:854250    rs7537756   1   854250  G   A   0.2229  -0.0631 0.0397  0.11190 ... 0.7379  0.7980  25.3    5.357   4   0.25260 5   3   2   0.030057
2   1:887560    rs3748595   1   887560  C   A   0.8256  0.0263  0.0468  0.57400 ... 0.1289  0.2192  0.0 5.052   9   0.82980 10  5   5   -0.122710
3   1:907609    rs79890672  1   907609  C   T   0.0376  0.0588  0.1206  0.62590 ... 0.9401  0.9847  19.2    6.190   5   0.28810 6   4   2   0.022689
4   1:909326    rs61573829  1   909326  T   C   0.1827  0.1060  0.0796  0.18280 ... 0.0114  0.1919  60.5    10.135  4   0.03821 5   4   1   0.178724
... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ... ...
99993   22:51098793 rs145709293 22  51098793    T   G   0.0332  0.0606  0.0505  0.23040 ... 0.0154  0.0451  0.0 14.397  20  0.80980 21  7   14  -0.004573
99994   22:51120558 rs138163275 22  51120558    G   A   0.1836  -0.0256 0.0318  0.42180 ... 0.7208  0.9283  2.2 17.375  17  0.42920 18  7   11  0.108053
99995   22:51134387 rs77452243  22  51134387    A   G   0.0366  -0.1097 0.0711  0.12290 ... 0.0104  0.0451  0.0 8.469   12  0.74750 13  2   11  -0.004155
99996   22:51156666 rs9628187   22  51156666    T   C   0.2008  -0.0454 0.0258  0.07862 ... 0.1077  0.2160  0.0 13.017  16  0.67150 17  6   11  0.000005
99997   22:51187063 rs75503428  22  51187063    T   A   0.0934  0.2468  0.1254  0.04906 ... 0.8604  0.9415  0.0 4.328   5   0.50330 6   2   4   NaN

Next, I tried to call .plot_daf():

sumstats_sample.plot_daf(threshold=0.12)
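As a plotting-free cross-check, the same cut-off can be applied directly to the DAF column. This is a minimal sketch that assumes `threshold` is meant to flag variants whose absolute DAF exceeds it; that reading of the parameter is an assumption, and `flag_large_daf` is a hypothetical helper rather than a gwaslab function. The rows are copied from the data preview above.

```python
import math

def flag_large_daf(rows, threshold):
    """Return rows whose |DAF| exceeds the threshold; NaN DAF values are skipped."""
    return [
        r for r in rows
        if not math.isnan(r["DAF"]) and abs(r["DAF"]) > threshold
    ]

# A few rows from the preview (only the columns needed here).
rows = [
    {"rsID": "rs2427898", "EAF": 0.2160, "DAF": 0.168286},
    {"rsID": "rs7537756", "EAF": 0.2229, "DAF": 0.030057},
    {"rsID": "rs3748595", "EAF": 0.8256, "DAF": -0.122710},
    {"rsID": "rs79890672", "EAF": 0.0376, "DAF": 0.022689},
    {"rsID": "rs61573829", "EAF": 0.1827, "DAF": 0.178724},
    {"rsID": "rs75503428", "EAF": 0.0934, "DAF": float("nan")},
]

flagged = flag_large_daf(rows, threshold=0.12)
print([r["rsID"] for r in flagged])
```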

But I get this error:

---------------------------------------------------------------------------
UnboundLocalError                         Traceback (most recent call last)
Cell In[152], line 1
----> 1 sumstats_sample.plot_daf(threshold=0.12)
...
---> 56 if saveargs is None:
     57     if save_args is None:
     58         saveargs = save_args = {}

UnboundLocalError: local variable 'saveargs' referenced before assignment

Any quick fix to solve this?

Cloufield commented 1 year ago

Hi, sorry for the error. I think I have fixed this and you can try the latest version 3.4.22.
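The actual change is not shown in this thread. As a minimal sketch of how this kind of error arises in general: Python treats a name as local to a function if it is assigned anywhere in that function, so reading it before the assignment raises `UnboundLocalError`. The functions `plot_broken` and `plot_fixed` below are hypothetical and only mirror the three lines in the traceback; they are not the gwaslab source.

```python
def plot_broken(save_args=None):
    # 'saveargs' is assigned further down, so Python treats it as a local
    # variable; reading it here, before any assignment, raises
    # UnboundLocalError.
    if saveargs is None:
        if save_args is None:
            saveargs = save_args = {}
    return save_args


def plot_fixed(save_args=None):
    # Use a single parameter name and give it a default inside the function.
    if save_args is None:
        save_args = {}
    return save_args


try:
    plot_broken()
    error_name = None
except UnboundLocalError as exc:
    error_name = type(exc).__name__

print(error_name)      # UnboundLocalError
print(plot_fixed())    # {}
```

If an older spelling of the argument has to remain supported, it would need to be an explicit parameter of the function so that it is always bound before it is read.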

swvanderlaan commented 1 year ago

I will check this out and report back.

swvanderlaan commented 1 year ago

This works now (am on 3.4.24 currently)