Joyvalley / GWAS_Flow

GPU accelerated GWAS framework based on TensorFlow
MIT License

Error with GWAS_Flow example files #26

Closed vinod1981 closed 2 years ago

vinod1981 commented 2 years ago

Hi, I've tried to run the example files from GWAS_Flow, but it throws an error:

python /vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas.py -x gwas_sample_data/my_plink.plink -y gwas_sample_data/pheno2.csv -k gwas_sample_data/kinship_ibs_binary_mac5.h5py -o results_test.csv
parsed commandline arguments
Mapping files: 100%|##########| 3/3 [00:00<00:00,  9.54it/s]
WARNING: accessions in X do not overlap with accessions in Y
Accessions X:
['9381' '9380' '9378' ... '9384' '9383' '9382']
Accessions Y:
[  5837   6008   6009   6016   6040   6042   6043   6046   6064   6074
   6243   6709   6897   6898   6899   6900   6901   6903   6904   6905
   6906   6907   6908   6909   6910   6911   6913   6914   6915   6916
   6917   6918   6919   6920   6921   6922   6923   6924   6926   6927
   6928   6929   6930   6931   6932   6933   6936   6937   6939   6940
   6942   6943   6944   6945   6946   6951   6956   6958   6959   6960
   6961   6962   6963   6964   6965   6966   6967   6968   6969   6970
   6971   6972   6973   6974   6975   6976   6977   6978   6979   6980
   6981   6982   6983   6984   6985   6988   7000   7014   7033   7062
   7064   7081   7094   7123   7147   7163   7231   7255   7275   7282
   7296   7306   7323   7346   7418   7424   7438   7460   7461   7477
   7514   7515   7516   7517   7518   7519   7520   7521   7522   7523
   7524   7525   7526   8213   8214   8215   8222   8230   8231   8233
   8235   8236   8237   8239   8240   8241   8242   8243   8245   8247
   8249   8254   8256   8258   8259   8264   8265   8266   8270   8271
   8274   8275   8283   8284   8285   8290   8296   8297   8300   8306
   8310   8311   8312   8313   8314   8323   8325   8326   8329   8334
   8335   8337   8343   8351   8353   8354   8365   8369   8374   8376
   8378   8387   8388   8389   8395   8420   8422   8423   8424   8426
   9057   9058 100000]
WARNING: not all accessions are in the kinship matrix
Accessions X/Y:
[]
Accessions K:
[  88  108  139 ... 7462 6401 1864]
data has been imported
Begin performing GWAS on  gwas_sample_data/pheno2.csv
/vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas_flow/main.py:200: RuntimeWarning: divide by zero encountered in double_scalars
  * kin_vr) * kin_vr
Traceback (most recent call last):
  File "/vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas.py", line 94, in <module>
    output = main.gwas(X, K, Y_, BATCH_SIZE, COF)
  File "/vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas_flow/main.py", line 287, in gwas
    v_g, delta, v_e = get_herit(y_phe, k_stand)
  File "/vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas_flow/main.py", line 205, in get_herit
    return estimate_variance_components(y_phe, "normal", k_stand, verbose=False)
  File "/vol/cluster-data/vkumar/miniconda3/bin/GWAS_Flow/gwas_flow/herit.py", line 24, in estimate_variance_components
    q_s = economic_qs(kin)
  File "/vol/cluster-data/vkumar/miniconda3/envs/gwas_flow_test5/lib/python3.7/site-packages/numpy_sugar/linalg/qs.py", line 24, in economic_qs
    nok = abs(max(Q[0].min(), Q[0].max(), key=abs)) < epsilon
IndexError: index 0 is out of bounds for axis 0 with size 0

Please look into it. Also, this is a GEMMA-based tool, yet it does not accept the kinship matrix generated by GEMMA? I tried it in various ways but could not get it to run. Thanks, Vinod

iimog commented 2 years ago

Hi Vinod, sorry again for the delayed response. The error with the example data occurs because the plink input and the pheno2.csv sample data don't belong together. As the warnings show, they have no accessions in common. After restricting to accessions that have both a genotype and a phenotype, no accessions are left, which triggers the divide-by-zero warning and ultimately the IndexError in economic_qs. As mentioned in #16, a complete rewrite of GWAS_Flow is being developed, so development of GWAS_Flow itself is now limited to fixing bugs. We hope to provide even better support for different file formats with the new tool.
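
For anyone hitting the same traceback, a quick way to confirm the diagnosis is to check the accession overlap yourself before starting the run. The snippet below is only a minimal sketch and not part of GWAS_Flow: `shared_accessions` is a hypothetical helper, and the IDs are a handful of toy values copied from the warnings in the log. It compares IDs as strings, on the assumption that the genotype IDs may be read as strings and the phenotype IDs as integers, as the printed arrays suggest.

```python
import numpy as np


def shared_accessions(x_ids, y_ids, k_ids=None):
    """Return the sorted accession IDs present in every input.

    Hypothetical helper (not GWAS_Flow code). IDs are compared as strings
    so that '9381' and 9381 count as the same accession.
    """
    common = set(map(str, x_ids)) & set(map(str, y_ids))
    if k_ids is not None:
        common &= set(map(str, k_ids))
    return sorted(common)


# Toy values taken from the warnings in the log: genotype (X), phenotype (Y), kinship (K)
x_ids = np.array(['9381', '9380', '9378', '9384', '9383', '9382'])
y_ids = np.array([5837, 6008, 6009, 9057, 9058, 100000])
k_ids = np.array([88, 108, 139, 7462, 6401, 1864])

common = shared_accessions(x_ids, y_ids, k_ids)
if not common:
    print("No shared accessions: the genotype and phenotype files do not belong together.")
else:
    print(f"{len(common)} shared accessions, e.g. {common[:5]}")
```

If this check finds no shared accessions, the inputs need to be replaced with files that describe the same accessions before the GWAS can run.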