Graylab / IgFold

Fast, accurate antibody structure prediction from deep learning on massive set of natural antibodies

functional_datapipe #3

Closed anar-rzayev closed 2 years ago

anar-rzayev commented 2 years ago

When I run antibody structure prediction using IgFoldRunner().fold(...), I get the error "cannot import name 'functional_datapipe' from 'torch.utils.data'". Do you know why this occurs? I installed every required package and followed your instructions step by step, but the error persists. I would appreciate any help. Thanks for the paper as well, it is awesome work!

jeffreyruffolo commented 2 years ago

Can you share more details about your Python environment? I am wondering if this is an issue with particular Python versions.

jeffreyruffolo commented 2 years ago

Providing a package list (by running pip list) might also be helpful.
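A shorter report than the full pip list is often enough to compare versions. The following is a minimal sketch that uses only the standard library (importlib.metadata, so it assumes Python 3.8 or newer). The function name describe_environment and the default package selection are illustrative choices, not part of IgFold.

```python
import platform
import sys
from importlib import metadata


def describe_environment(packages=("torch", "torchtext", "torchvision", "torchaudio", "igfold")):
    """Return the Python version and the installed version of each package.

    Packages that are not installed are reported as None.
    """
    report = {"python": platform.python_version(), "executable": sys.executable}
    for name in packages:
        try:
            report[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = None
    return report


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```

Running it with the same interpreter that runs IgFold matters, since a system Python and a conda Python can have different packages installed.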

anar-rzayev commented 2 years ago

I did not download PyRosetta, so I followed the instructions for installing OpenMM instead, and thanks to your updated command with -c, I managed to resolve the issues with the initial frozen environment. However, even after pip install -r requirements and installing all the provided packages, running Antibody Structure Prediction still gives the same functional_datapipe error from torch.utils.data. Do you think this is due to the Python version? By the way, I am fairly sure that StreamWrapper would raise a similar error for no apparent reason. The imported packages are provided below.

anar-rzayev commented 2 years ago

The following is the result from pip list:

absl-py 0.13.0 aiohttp 3.8.1 aiosignal 1.2.0 alabaster 0.7.12 anaconda-client 1.7.2 anaconda-navigator 1.9.12 anaconda-project 0.8.3 antiberty 0.0.5 antlr4-python3-runtime 4.8 anyascii 0.3.1 applaunchservices 0.2.1 appnope 0.1.0 appscript 1.1.1 argh 0.26.2 asgiref 3.4.1 asn1crypto 1.3.0 astroid 2.4.2 astropy 4.0.1.post1 astunparse 1.6.3 async-timeout 4.0.2 atomicwrites 1.4.0 attrs 20.2.0 autopep8 1.5.3 Babel 2.8.0 backcall 0.2.0 backports.entry-points-selectable 1.1.1 backports.functools-lru-cache 1.6.4 backports.shutil-get-terminal-size 1.0.0 backports.tempfile 1.0 backports.weakref 1.0.post1 beautifulsoup4 4.9.1 biopython 1.79 bitarray 1.4.0 bkcharts 0.2 bleach 3.1.5 blis 0.7.7 bokeh 2.1.1 boto 2.49.0 boto3 1.21.34 botocore 1.24.34 Bottleneck 1.3.2 brotlipy 0.7.0 cachetools 4.1.1 catalogue 2.0.7 certifi 2021.10.8 cffi 1.14.0 chardet 3.0.4 charset-normalizer 2.0.4 click 7.1.2 cloudpickle 1.5.0 clyent 1.2.2 colorama 0.4.3 coloredlogs 15.0.1 colorthief 0.2.1 conda 4.12.0 conda-build 3.21.8 conda-pack 0.6.0 conda-package-handling 1.7.2 conda-verify 3.4.2 configparser 5.0.2 console-progressbar 1.1.2 contextlib2 0.6.0.post1 contractions 0.0.52 cryptography 2.9.2 cycler 0.10.0 cymem 2.0.6 Cython 0.29.21 cytoolz 0.10.1 dask 2.20.0 dataclasses 0.6 datasets 2.0.0 decorator 4.4.2 defusedxml 0.6.0 diff-match-patch 20200713 dill 0.3.4 distlib 0.3.4 distributed 2.20.0 distro 1.5.0 Django 3.2.5 django-model-utils 4.1.1 docformatter 1.4 docker-pycreds 0.4.0 docutils 0.15.2 einops 0.3.0 emoji 1.7.0 entrypoints 0.3 et-xmlfile 1.0.1 fairseq 0.10.2 fastcache 1.1.0 fasttext 0.9.2 filelock 3.4.0 flake8 3.8.3 flake8-bugbear 22.3.23 Flask 1.1.2 flatbuffers 1.12 frozenlist 1.3.0 fsspec 2022.3.0 ftfy 5.9 future 0.18.2 fvcore 0.1.5.post20220414 gast 0.4.0 gevent 20.6.2 gitdb 4.0.7 gitdb2 4.0.2 GitPython 3.1.18 glob2 0.7 gmpy2 2.0.8 google-api-core 2.7.1 google-auth 2.6.2 google-auth-oauthlib 0.4.1 google-cloud-core 2.2.3 google-cloud-storage 2.2.1 google-crc32c 1.3.0 google-pasta 0.2.0 
google-resumable-media 2.3.2 googleapis-common-protos 1.56.0 greenlet 0.4.16 grpcio 1.34.1 h5py 3.1.0 HeapDict 1.0.1 html5lib 1.1 huggingface-hub 0.5.1 humanfriendly 10.0 hydra-core 1.1.1 idna 2.10 igfold 0.0.6 imageio 2.9.0 imagesize 1.2.0 importlib-metadata 4.2.0 importlib-resources 5.6.0 inflect 5.3.0 intervaltree 3.0.2 invariant-point-attention 0.1.4 iopath 0.1.9 ipykernel 5.3.2 ipython 7.16.1 ipython_genutils 0.2.0 ipywidgets 7.5.1 isort 4.3.21 itsdangerous 1.1.0 jdcal 1.4.1 jedi 0.17.1 Jinja2 2.11.2 jmespath 1.0.0 joblib 0.16.0 json5 0.9.5 jsonlines 3.0.0 jsonschema 3.2.0 jupyter 1.0.0 jupyter-client 6.1.6 jupyter-console 6.1.0 jupyter-core 4.6.3 jupyterlab 2.1.5 jupyterlab-server 1.2.0 keras 2.7.0 keras-nightly 2.5.0.dev2021032900 Keras-Preprocessing 1.1.2 keyring 21.2.1 kiwisolver 1.2.0 langcodes 3.3.0 lazy-object-proxy 1.4.3 libarchive-c 2.9 libclang 12.0.0 lightgbm 3.2.1 llvmlite 0.33.0+1.g022ab0f locket 0.2.0 lxml 4.5.2 Markdown 3.2.2 markdown-it-py 0.5.8 MarkupSafe 1.1.1 matplotlib 3.4.3 mccabe 0.6.1 minecart 0.3.0 mistune 0.8.4 mkl-fft 1.1.0 mkl-random 1.1.1 mkl-service 2.3.0 mock 4.0.2 more-itertools 8.4.0 mpmath 1.1.0 msgpack 1.0.0 multidict 6.0.2 multipledispatch 0.6.0 multiprocess 0.70.12.2 murmurhash 1.0.6 myst-parser 0.12.10 navigator-updater 0.2.1 nbconvert 5.6.1 nbformat 5.0.7 nbzip 0.1.0 networkx 2.4 nltk 3.5 nose 1.3.7 notebook 6.0.3 numba 0.50.1 numexpr 2.7.1 numpy 1.21.2 numpydoc 1.1.0 oauthlib 3.1.0 olefile 0.46 omegaconf 2.1.1 opencv-python 4.5.3.56 OpenMM 7.6.0 openpyxl 3.0.4 opt-einsum 3.3.0 packaging 21.3 pandas 1.0.5 pandocfilters 1.4.2 parso 0.7.0 partd 1.1.0 path 13.1.0 pathlib2 2.3.5 pathtools 0.1.2 pathy 0.6.1 patsy 0.5.1 pdbfixer 1.8.1 pdfminer3k 1.3.4 pep8 1.7.1 pexpect 4.8.0 pickleshare 0.7.5 Pillow 9.1.0 pip 22.0.4 pkginfo 1.5.0.1 platformdirs 2.4.0 pluggy 0.13.1 ply 3.11 portalocker 2.4.0 preshed 3.0.6 prometheus-client 0.8.0 promise 2.3 prompt-toolkit 3.0.5 protobuf 3.12.2 psutil 5.7.0 ptyprocess 0.6.0 py 1.9.0 py-gfm 1.0.2 
py-rouge 1.1 py3Dmol 1.8.0 pyahocorasick 1.4.4 pyarrow 7.0.0 pyasn1 0.4.8 pyasn1-modules 0.2.8 pybind11 2.9.2 pycodestyle 2.6.0 pycosat 0.6.3 pycparser 2.20 pycurl 7.43.0.5 pydantic 1.8.2 pyDeprecate 0.3.1 pydocstyle 5.0.2 pydot 1.4.2 pyflakes 2.2.0 Pygments 2.6.1 pylint 2.5.3 PyMuPDF 1.18.15 pyodbc 4.0.0-unsupported pyOpenSSL 19.1.0 pyparsing 2.4.7 PyPDF2 1.26.0 pyrsistent 0.16.0 PySocks 1.7.1 pytesseract 0.3.8 pytest 5.4.3 pytest-datadir 1.3.1 pytest-regressions 2.3.1 python-dateutil 2.8.1 python-jsonrpc-server 0.3.4 python-language-server 0.34.1 pytorch-lightning 1.5.10 pytorch-pretrained-bert 0.5.1 pytorch-ranger 0.1.1 pytorch3d 0.6.1 pytz 2020.1 PyWavelets 1.1.1 PyYAML 6.0 pyzmq 19.0.1 QDarkStyle 2.8.1 QtAwesome 0.7.2 qtconsole 4.7.5 QtPy 1.9.0 readme-renderer 29.0 regex 2022.3.15 requests 2.26.0 requests-mock 1.9.3 requests-oauthlib 1.3.0 requests-toolbelt 0.9.1 responses 0.18.0 rfc3986 1.5.0 rope 0.17.0 rsa 4.6 Rtree 0.9.4 ruamel_yaml 0.15.87 s3transfer 0.5.2 sacrebleu 2.0.0 sacremoses 0.0.49 scikit-image 0.16.2 scikit-learn 1.0.2 scipy 1.4.1 seaborn 0.11.2 Send2Trash 1.5.0 sentry-sdk 1.3.1 setuptools 59.5.0 sh 1.14.2 shortuuid 1.0.1 simplegeneric 0.8.1 singledispatch 3.4.0.3 six 1.15.0 smart-open 5.2.1 smmap 4.0.0 snowballstemmer 2.0.0 sortedcollections 1.2.1 sortedcontainers 2.2.2 soupsieve 2.0.1 spacy 3.1.1 spacy-legacy 3.0.9 spacy-loggers 1.0.2 Sphinx 2.2.2 sphinx-autodoc-typehints 1.10.3 sphinx-rtd-theme 1.0.0 sphinxcontrib-applehelp 1.0.2 sphinxcontrib-devhelp 1.0.2 sphinxcontrib-htmlhelp 1.0.3 sphinxcontrib-jsmath 1.0.1 sphinxcontrib-qthelp 1.0.3 sphinxcontrib-serializinghtml 1.1.4 sphinxcontrib-websupport 1.2.3 spyder 4.1.4 spyder-kernels 1.9.2 SQLAlchemy 1.3.18 sqlparse 0.4.1 srsly 2.4.3 statsmodels 0.11.1 subprocess32 3.5.4 subword-nmt 0.3.8 sympy 1.6.1 tables 3.6.1 tabula-py 2.2.0 tabulate 0.8.9 tblib 1.6.0 tensorboard 2.7.0 tensorboard-data-server 0.6.1 tensorboard-plugin-wit 1.7.0 tensorboardX 2.5 tensorflow 2.7.0 tensorflow-estimator 2.7.0 
tensorflow-io-gcs-filesystem 0.22.0 tensorflow-object-detection-api 0.1.1 termcolor 1.1.0 terminado 0.8.3 testpath 0.4.4 textsearch 0.0.21 Theano 1.0.4 thinc 8.0.15 thop 0.0.31.post2005241907 threadpoolctl 2.1.0 tokenizers 0.11.6 toml 0.10.1 tomli 1.2.3 toolz 0.10.0 torch 1.7.1 torch-optimizer 0.3.0 torch-tb-profiler 0.4.0 torchaudio 0.11.0 torchmetrics 0.8.0 torchtext 0.12.0 torchvision 0.12.0 tornado 6.0.4 tqdm 4.62.1 traitlets 4.3.3 transformers 4.18.0 twine 3.4.1 typer 0.3.2 typing_extensions 4.1.1 ujson 1.35 unicodecsv 0.14.1 Unidecode 1.3.4 untokenize 0.1.1 urllib3 1.26.6 vaderSentiment 3.3.2 virtualenv 20.10.0 Wand 0.6.6 wandb 0.11.2 wasabi 0.9.1 watchdog 0.10.3 wcwidth 0.2.5 webencodings 0.5.1 websocket-client 1.3.2 websocket-server 0.6.4 Werkzeug 1.0.1 wheel 0.36.2 widgetsnbextension 3.5.1 wrapt 1.12.1 wurlitzer 2.0.1 wxPython 4.1.1 xgboost 1.5.0 xlrd 1.2.0 XlsxWriter 1.2.9 xlwings 0.19.5 xlwt 1.3.0 xmltodict 0.12.0 xxhash 3.0.0 yacs 0.1.8 yapf 0.31.0 yarl 1.7.2 zict 2.0.0 zipp 3.1.0 zope.event 4.4 zope.interface 4.7.1

jeffreyruffolo commented 2 years ago

Thanks, at first glance I don't see an issue with the dependencies. Can you try updating to igfold==0.0.8 and running the code again? I tried a simple fix and removed the unused torch data import that may be causing the issue.
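The error message names a symbol that the installed torch.utils.data does not export, and this can be checked directly without running IgFold. Below is a minimal sketch. The helper name has_attribute is hypothetical. The version remark is an assumption as well: the package list above shows torch 1.7.1 alongside torchtext 0.12.0, torchvision 0.12.0, and torchaudio 0.11.0, which may expect a newer torch, but that was not confirmed in this thread.

```python
import importlib


def has_attribute(module_name, attribute):
    """Return True if module_name can be imported and exposes attribute."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Covers ModuleNotFoundError too, e.g. when torch is not installed.
        return False
    return hasattr(module, attribute)


if __name__ == "__main__":
    # False here would be consistent with the ImportError reported above.
    print(has_attribute("torch.utils.data", "functional_datapipe"))
```

If this prints False, any package that imports functional_datapipe from torch.utils.data will fail in the same way, whether or not IgFold itself uses the import.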

anar-rzayev commented 2 years ago

Yes, upgrading to igfold==0.0.8 and deleting the torch.utils.data import seems to have resolved the StreamWrapper and other torchtext-related issues. However, even after checking your YouTube video on running the demos and the instructions here, there is still a problem with the checkpoints that I could not solve. As you can see in the attachment, IgFoldRunner() freezes indefinitely while downloading the checkpoint files, so I cannot predict any structures or sequence embeddings. This happens both on my server and on my local machine.

[Attached image: Screen Shot 2022-04-28 at 15 31 01]

anar-rzayev commented 2 years ago

As far as I understand from the IgFoldRunner code, it downloads 2 checkpoint files but somehow cannot run them, and I could not find the reason. It downloads the tar.gz file successfully, but due to an unzipping issue or some other problem it cannot use the files, even when they are manually downloaded and placed in the training_models directory.

jeffreyruffolo commented 2 years ago

Can you verify the size of the IgFold.tar.gz you are able to download? This file should be ~438MB if the download is successful. After unzipping there should be four checkpoint files.

If the download is failing for some reason, you can also download directly from https://data.graylab.jhu.edu/IgFold.tar.gz. Then that file should be unzipped into igfold/trained_models in order for IgFoldRunner to discover the checkpoint files.
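The size check and the manual extraction can be scripted with the standard library. This is a minimal sketch. The helper names are hypothetical, and the paths in the usage block are assumptions: the archive is assumed to sit in the current directory, and igfold/trained_models is assumed to be relative to wherever the igfold package is installed. The internal layout of the archive is not described in this thread, so the files may land in a subdirectory.

```python
import os
import tarfile


def archive_size_mb(path):
    """Return the file size in megabytes (1 MB = 10**6 bytes)."""
    return os.path.getsize(path) / 1e6


def extract_archive(archive_path, dest_dir):
    """Extract a .tar.gz archive into dest_dir.

    Returns the sorted names of the regular files in the archive.
    Only use this on archives from a source you trust.
    """
    os.makedirs(dest_dir, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as tar:
        names = [member.name for member in tar.getmembers() if member.isfile()]
        tar.extractall(dest_dir)
    return sorted(names)


if __name__ == "__main__":
    archive = "IgFold.tar.gz"  # assumed location of the downloaded file
    dest = os.path.join("igfold", "trained_models")  # assumed package-relative path
    if os.path.exists(archive):
        print(f"{archive}: {archive_size_mb(archive):.0f} MB (expected roughly 438 MB)")
        files = extract_archive(archive, dest)
        print(f"{len(files)} files extracted: {files}")
    else:
        print(f"{archive} not found")
```

A size far below 438 MB suggests an interrupted download, and a file count other than four suggests an incomplete archive.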

Can you give that a try? Sorry for the trouble!

jeffreyruffolo commented 2 years ago

Closing now but please reopen (or create a new issue) if you are still having issues with loading the models.