AstraZeneca / chemicalx

A PyTorch and TorchDrug based deep learning library for drug pair scoring. (KDD 2022)
https://chemicalx.readthedocs.io
Apache License 2.0
700 stars, 89 forks

KeyError: 'node_feature' from running deepdds_example.py #100

Closed. cshukai closed this issue 1 year ago

cshukai commented 2 years ago

Please see the following log.

  File "deepdds.py", line 27, in <module>
    main()
  File "deepdds.py", line 14, in main
    results = pipeline(
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/pipeline.py", line 155, in pipeline
    prediction = model(*model.unpack(batch))
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 176, in forward
    features_left = self._forward_molecules(molecules_left)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 158, in _forward_molecules
    features = self.drug_conv(molecules, molecules.data_dict["node_feature"])["node_feature"]
KeyError: 'node_feature'

kajocina commented 2 years ago

Hi, I just re-ran the example code with chemicalx 0.1.0 and had no such issue.

Could you provide a bit more details? How are you running the example?

cshukai commented 2 years ago

I copied and pasted the Python script and ran it with python deepdds.py. I didn't run into this issue when I ran the deepsynergy example. The version of chemicalx is also 0.1.0. I then cloned the repository and ran the deepdds example from there, but I still get the same issue.

python deepdds_example.py
  0%|                                                                    | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "deepdds_example.py", line 27, in <module>
    main()
  File "deepdds_example.py", line 14, in main
    results = pipeline(
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/pipeline.py", line 155, in pipeline
    prediction = model(*model.unpack(batch))
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 176, in forward
    features_left = self._forward_molecules(molecules_left)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 158, in _forward_molecules
    features = self.drug_conv(molecules, molecules.data_dict["node_feature"])["node_feature"]
KeyError: 'node_feature'

cshukai commented 2 years ago

Thanks so much for the help! I added the following code block at the top of the example script for DeepDDS:

    import torch
    print(f"running with device: {torch.cuda.get_device_name(torch.cuda.current_device())}")

But I still get the same issue:

running with device: Tesla V100S-PCIE-32GB
  0%|                                                                                                                            | 0/10 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "deepdds_example.py", line 30, in <module>
    main()
  File "deepdds_example.py", line 17, in main
    results = pipeline(
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/pipeline.py", line 155, in pipeline
    prediction = model(*model.unpack(batch))
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(*input, **kwargs)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 176, in forward
    features_left = self._forward_molecules(molecules_left)
  File "/storage/htc/nih-tcga/sc724/conda/synergy/lib/python3.8/site-packages/chemicalx/models/deepdds.py", line 158, in _forward_molecules
    features = self.drug_conv(molecules, molecules.data_dict["node_feature"])["node_feature"]
KeyError: 'node_feature'

But I can run the deepsynergy example without this issue:

python deepsynergy_example.py
100% ████████████████████████████████████████ 100/100 [01:42<00:00, 1.02s/it]
Metric    Value
roc_auc   0.834909

kajocina commented 2 years ago

It's not surprising that deepsynergy worked for you, given that it has a much simpler architecture and doesn't use the same layer as deepdds.

Could you add the details of your environment? Which package versions do you have in your env? Which operating system is it exactly?
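For anyone answering the same question, a short script can collect the relevant versions in one go. This is a minimal sketch using only the standard library; `report_versions` is a hypothetical helper name, and the distribution names passed in at the bottom are an assumption about how the packages are registered in a given environment (conda and pip sometimes use different names).

```python
from importlib import metadata  # standard library since Python 3.8


def report_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not registered under this name in the env.
            versions[name] = None
    return versions


if __name__ == "__main__":
    # Assumed distribution names; adjust them to match your environment.
    for name, version in report_versions(["chemicalx", "torchdrug", "torch"]).items():
        print(f"{name}: {version or 'not installed'}")
```

Pasting that output together with the operating system details is usually enough to spot a dependency version mismatch.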

cshukai commented 2 years ago

Operating system:

cat /etc/*-release
CentOS Linux release 7.9.2009 (Core)
NAME="CentOS Linux"
VERSION="7 (Core)"
ID="centos"
ID_LIKE="rhel fedora"
VERSION_ID="7"
PRETTY_NAME="CentOS Linux 7 (Core)"
ANSI_COLOR="0;31"
CPE_NAME="cpe:/o:centos:centos:7"
HOME_URL="https://www.centos.org/"
BUG_REPORT_URL="https://bugs.centos.org/"

CENTOS_MANTISBT_PROJECT="CentOS-7"
CENTOS_MANTISBT_PROJECT_VERSION="7"
REDHAT_SUPPORT_PRODUCT="centos"
REDHAT_SUPPORT_PRODUCT_VERSION="7"

CentOS Linux release 7.9.2009 (Core)

Package versions (I am using miniconda):

conda list -n synergy
packages in environment at /storage/htc/nih-tcga/sc724/conda/synergy:

Name Version Build Channel
_libgcc_mutex 0.1 conda_forge conda-forge
_openmp_mutex 4.5 2_gnu conda-forge
alsa-lib 1.2.6.1 h7f98852_0 conda-forge
attr 2.5.1 h166bdaf_0 conda-forge
blas 1.0 mkl
boost 1.74.0 py38h2b96118_5 conda-forge
boost-cpp 1.74.0 h75c5d50_8 conda-forge
brotli 1.0.9 h166bdaf_7 conda-forge
brotli-bin 1.0.9 h166bdaf_7 conda-forge
brotlipy 0.7.0 py38h0a891b7_1004 conda-forge
bzip2 1.0.8 h7f98852_4 conda-forge
ca-certificates 2022.6.15 ha878542_0 conda-forge
cairo 1.16.0 ha61ee94_1011 conda-forge
certifi 2022.6.15 py38h578d9bd_0 conda-forge
cffi 1.15.0 py38h3931269_0 conda-forge
charset-normalizer 2.0.12 pyhd8ed1ab_0 conda-forge
chemicalx 0.1.0 pypi_0 pypi
class-resolver 0.3.10 pypi_0 pypi
click 8.1.3 pypi_0 pypi
colorama 0.4.4 pyh9f0ad1d_0 conda-forge
cryptography 37.0.1 py38h9ce1e76_0
cudatoolkit 10.2.89 h713d32c_10 conda-forge
cycler 0.11.0 pyhd8ed1ab_0 conda-forge
dbus 1.13.6 h5008d03_3 conda-forge
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
expat 2.4.8 h27087fc_0 conda-forge
ffmpeg 4.3 hf484d3e_0 pytorch
fftw 3.3.10 nompi_h77c792f_102 conda-forge
font-ttf-dejavu-sans-mono 2.37 hab24e00_0 conda-forge
font-ttf-inconsolata 3.000 h77eed37_0 conda-forge
font-ttf-source-code-pro 2.038 h77eed37_0 conda-forge
font-ttf-ubuntu 0.83 hab24e00_0 conda-forge
fontconfig 2.14.0 h8e229c2_0 conda-forge
fonts-conda-ecosystem 1 0 conda-forge
fonts-conda-forge 1 0 conda-forge
fonttools 4.33.3 py38h0a891b7_0 conda-forge
freetype 2.10.4 h0708190_1 conda-forge
fuzzywuzzy 0.18.0 pypi_0 pypi
gettext 0.19.8.1 h73d1719_1008 conda-forge
giflib 5.2.1 h36c2ea0_2 conda-forge
gmp 6.2.1 h58526e2_0 conda-forge
gnutls 3.6.13 h85f3911_1 conda-forge
greenlet 1.1.2 py38hfa26641_2 conda-forge
gst-plugins-base 1.20.2 hf6a322e_1 conda-forge
gstreamer 1.20.2 hd4edc92_1 conda-forge
icu 70.1 h27087fc_0 conda-forge
idna 3.3 pyhd8ed1ab_0 conda-forge
intel-openmp 2021.4.0 h06a4308_3561
jack 1.9.18 h8c3723f_1002 conda-forge
jinja2 3.1.2 pyhd8ed1ab_1 conda-forge
joblib 1.1.0 pypi_0 pypi
jpeg 9e h166bdaf_1 conda-forge
keras 2.9.0 pyhd8ed1ab_0 conda-forge
keyutils 1.6.1 h166bdaf_0 conda-forge
kiwisolver 1.4.3 py38h43d8883_0 conda-forge
krb5 1.19.3 h3790be6_0 conda-forge
lame 3.100 h7f98852_1001 conda-forge
lcms2 2.12 hddcbb42_0 conda-forge
ld_impl_linux-64 2.36.1 hea4e1c9_2 conda-forge
lerc 3.0 h9c3ff4c_0 conda-forge
libbrotlicommon 1.0.9 h166bdaf_7 conda-forge
libbrotlidec 1.0.9 h166bdaf_7 conda-forge
libbrotlienc 1.0.9 h166bdaf_7 conda-forge
libcap 2.64 ha37c62d_0 conda-forge
libclang 14.0.5 default_h2e3cab8_0 conda-forge
libclang13 14.0.5 default_h3a83d3e_0 conda-forge
libcups 2.3.3 hf5a7f15_1 conda-forge
libdb 6.2.32 h9c3ff4c_0 conda-forge
libdeflate 1.12 h166bdaf_0 conda-forge
libedit 3.1.20191231 he28a2e2_2 conda-forge
libevent 2.1.10 h9b69904_4 conda-forge
libffi 3.4.2 h7f98852_5 conda-forge
libflac 1.3.4 h27087fc_0 conda-forge
libgcc-ng 12.1.0 h8d9b700_16 conda-forge
libgfortran-ng 12.1.0 h69a702a_16 conda-forge
libgfortran5 12.1.0 hdcd56e2_16 conda-forge
libglib 2.70.2 h174f98d_4 conda-forge
libgomp 12.1.0 h8d9b700_16 conda-forge
libiconv 1.16 h516909a_0 conda-forge
libllvm14 14.0.5 he0ac6c6_0 conda-forge
libnsl 2.0.0 h7f98852_0 conda-forge
libogg 1.3.4 h7f98852_1 conda-forge
libopus 1.3.1 h7f98852_1 conda-forge
libpng 1.6.37 h21135ba_2 conda-forge
libpq 14.3 hd77ab85_0 conda-forge
libsndfile 1.0.31 h9c3ff4c_1 conda-forge
libstdcxx-ng 12.1.0 ha89aaad_16 conda-forge
libtiff 4.4.0 hc85c160_1 conda-forge
libtool 2.4.6 h9c3ff4c_1008 conda-forge
libudev1 249 h166bdaf_4 conda-forge
libuuid 2.32.1 h7f98852_1000 conda-forge
libuv 1.43.0 h7f98852_0 conda-forge
libvorbis 1.3.7 h9c3ff4c_0 conda-forge
libwebp 1.2.2 h3452ae3_0 conda-forge
libwebp-base 1.2.2 h7f98852_1 conda-forge
libxcb 1.13 h7f98852_1004 conda-forge
libxkbcommon 1.0.3 he3ba5ed_0 conda-forge
libxml2 2.9.14 h22db469_0 conda-forge
libzlib 1.2.12 h166bdaf_0 conda-forge
lz4-c 1.9.3 h9c3ff4c_1 conda-forge
markupsafe 2.1.1 py38h0a891b7_1 conda-forge
matplotlib 3.5.2 py38h578d9bd_0 conda-forge
matplotlib-base 3.5.2 py38h826bfd8_0 conda-forge
mkl 2021.4.0 h06a4308_640
mkl-service 2.4.0 py38h95df7f1_0 conda-forge
mkl_fft 1.3.1 py38h8666266_1 conda-forge
mkl_random 1.2.2 py38h1abd341_0 conda-forge
more-itertools 8.13.0 pypi_0 pypi
munkres 1.1.4 pyh9f0ad1d_0 conda-forge
mysql-common 8.0.29 haf5c9bc_1 conda-forge
mysql-libs 8.0.29 h28c427c_1 conda-forge
ncurses 6.3 h27087fc_1 conda-forge
nettle 3.6 he412f7d_0 conda-forge
networkx 2.8.3 pyhd8ed1ab_0 conda-forge
ninja 1.10.2.3 pypi_0 pypi
nspr 4.32 h9c3ff4c_1 conda-forge
nss 3.78 h2350873_0 conda-forge
numpy 1.22.3 py38he7a7128_0
numpy-base 1.22.3 py38hf524024_0
openh264 2.1.1 h780b84a_0 conda-forge
openjpeg 2.4.0 hb52868f_1 conda-forge
openssl 1.1.1o h166bdaf_0 conda-forge
packaging 21.3 pyhd8ed1ab_0 conda-forge
pandas 1.3.5 pypi_0 pypi
pcre 8.45 h9c3ff4c_0 conda-forge
pillow 9.1.1 py38h0ee0e06_1 conda-forge
pip 22.1.2 pyhd8ed1ab_0 conda-forge
pixman 0.40.0 h36c2ea0_0 conda-forge
portaudio 19.6.0 h57a0ea0_5 conda-forge
pthread-stubs 0.4 h36c2ea0_1001 conda-forge
pulseaudio 14.0 h7f54b18_8 conda-forge
pycairo 1.21.0 py38h9c00e7a_1 conda-forge
pycparser 2.21 pyhd8ed1ab_0 conda-forge
pyopenssl 22.0.0 pyhd8ed1ab_0 conda-forge
pyparsing 3.0.9 pyhd8ed1ab_0 conda-forge
pyqt 5.15.4 py38h7492b6b_1 conda-forge
pyqt5-sip 12.9.0 py38hfa26641_1 conda-forge
pysocks 1.7.1 py38h578d9bd_5 conda-forge
pystow 0.4.4 pypi_0 pypi
pytdc 0.3.6 pypi_0 pypi
python 3.8.13 h582c2e5_0_cpython conda-forge
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
python_abi 3.8 2_cp38 conda-forge
pytorch 1.11.0 py3.8_cuda10.2_cudnn7.6.5_0 pytorch
pytorch-mutex 1.0 cuda pytorch
pytorch-scatter 2.0.9 py38_torch_1.11.0_cu102 pyg
pytz 2022.1 pyhd8ed1ab_0 conda-forge
qt-main 5.15.4 ha5833f6_2 conda-forge
rdkit 2022.03.3 py38ha829ea6_0 conda-forge
rdkit-pypi 2022.3.3 pypi_0 pypi
readline 8.1.2 h0f457ee_0 conda-forge
reportlab 3.5.68 py38hadf75a6_1 conda-forge
requests 2.28.0 pyhd8ed1ab_0 conda-forge
scikit-learn 1.1.1 pypi_0 pypi
scipy 1.8.1 pypi_0 pypi
seaborn 0.11.2 pypi_0 pypi
setuptools 62.3.4 py38h578d9bd_0 conda-forge
sip 6.5.1 py38h709712a_2 conda-forge
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlalchemy 1.4.37 py38h0a891b7_0 conda-forge
sqlite 3.38.5 h4ff8645_0 conda-forge
tabulate 0.8.9 pypi_0 pypi
threadpoolctl 3.1.0 pypi_0 pypi
tk 8.6.12 h27826a3_0 conda-forge
toml 0.10.2 pyhd8ed1ab_0 conda-forge
torch-geometric 2.0.4 pypi_0 pypi
torch-sparse 0.6.13 pypi_0 pypi
torchaudio 0.11.0 py38_cu102 pytorch
torchdrug 0.1.3 he73d2c9 milagraph
torchvision 0.12.0 py38_cu102 pytorch
tornado 6.1 py38h0a891b7_3 conda-forge
tqdm 4.64.0 pyhd8ed1ab_0 conda-forge
typing_extensions 4.2.0 pyha770c72_1 conda-forge
unicodedata2 14.0.0 py38h0a891b7_1 conda-forge
urllib3 1.26.9 pyhd8ed1ab_0 conda-forge
wheel 0.37.1 pyhd8ed1ab_0 conda-forge
xcb-util 0.4.0 h166bdaf_0 conda-forge
xcb-util-image 0.4.0 h166bdaf_0 conda-forge
xcb-util-keysyms 0.4.0 h166bdaf_0 conda-forge
xcb-util-renderutil 0.3.9 h166bdaf_0 conda-forge
xcb-util-wm 0.4.1 h166bdaf_0 conda-forge
xorg-kbproto 1.0.7 h7f98852_1002 conda-forge
xorg-libice 1.0.10 h7f98852_0 conda-forge
xorg-libsm 1.2.3 hd9c2040_1000 conda-forge
xorg-libx11 1.7.2 h7f98852_0 conda-forge
xorg-libxau 1.0.9 h7f98852_0 conda-forge
xorg-libxdmcp 1.1.3 h7f98852_0 conda-forge
xorg-libxext 1.3.4 h7f98852_1 conda-forge
xorg-libxrender 0.9.10 h7f98852_1003 conda-forge
xorg-renderproto 0.11.1 h7f98852_1002 conda-forge
xorg-xextproto 7.3.0 h7f98852_1002 conda-forge
xorg-xproto 7.0.31 h7f98852_1007 conda-forge
xz 5.2.5 h516909a_1 conda-forge
zlib 1.2.12 h166bdaf_0 conda-forge
zstd 1.5.2 h8a70e8d_1 conda-forge

kajocina commented 2 years ago

I think I found the root of the problem.

It seems that the torchdrug update from 0.1.2 to 0.1.3 changed the key name, as mentioned here:

"Argument node_feature and edge_feature are renamed to atom_feature and bond_feature in data.Molecule.from_smiles and data.Molecule.from_molecule. The old interface is still supported with deprecated warnings."

This would mean that, for this model to work with torchdrug 0.1.3, we'd have to either rename the "node_feature" key to "atom_feature" or pin torchdrug to 0.1.2.
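A third option, if both torchdrug versions need to be supported, is to look the feature up under either name. The following is only a sketch under the assumption that the diagnosis above is right and the features sit under "atom_feature" in 0.1.3; `pick_feature` and `FEATURE_KEYS` are hypothetical names, not part of chemicalx or torchdrug, and plain dictionaries stand in for `molecules.data_dict`.

```python
from typing import Any, Mapping, Sequence

# Key names taken from this thread: "node_feature" is what deepdds.py reads,
# "atom_feature" is the name suspected for torchdrug 0.1.3 (an assumption).
FEATURE_KEYS = ("node_feature", "atom_feature")


def pick_feature(data_dict: Mapping[str, Any], keys: Sequence[str] = FEATURE_KEYS) -> Any:
    """Return the value stored under the first candidate key that is present."""
    for key in keys:
        if key in data_dict:
            return data_dict[key]
    # Listing the available keys makes a future rename easier to diagnose.
    raise KeyError(f"none of {list(keys)} found; available keys: {sorted(data_dict)}")


# Plain dicts standing in for molecules.data_dict under each torchdrug version.
old_style = {"node_feature": [0.1, 0.2]}
new_style = {"atom_feature": [0.1, 0.2]}
assert pick_feature(old_style) == pick_feature(new_style)
```

Compared with a hard rename, the lookup keeps the model usable on either version, at the cost of hiding the dependency on torchdrug's internal naming.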

@cthoyt @benedekrozemberczki opinions?

cthoyt commented 2 years ago

If AstraZeneca wants to pay for some consulting work on this, I think the best strategy would be to divest from torchdrug completely. The package looked enticing initially, but now that I've become more familiar with its development, I'm increasingly worried that issues like this will arise more frequently.