Closed narayanamayya closed 5 months ago
Hi, I think it might be due to some package incompatibility issues. Could you please provide:
I'm trying to run the second Colab example you provided, "Use a pretrained model in python script Colab". For rdkit, I ran !pip install rdkit. Thank you.
On an Ubuntu machine:
_libgcc_mutex 0.1 main
_openmp_mutex 5.1 1_gnu
asttokens 2.4.1 pyhd8ed1ab_0 conda-forge
bzip2 1.0.8 h7b6447c_0
ca-certificates 2024.2.2 hbcca054_0 conda-forge
certifi 2024.2.2 pypi_0 pypi
charset-normalizer 3.3.2 pypi_0 pypi
comm 0.2.1 pyhd8ed1ab_0 conda-forge
debugpy 1.6.7 py312h6a678d5_0
decorator 5.1.1 pyhd8ed1ab_0 conda-forge
exceptiongroup 1.2.0 pyhd8ed1ab_2 conda-forge
executing 2.0.1 pyhd8ed1ab_0 conda-forge
expat 2.5.0 h6a678d5_0
filelock 3.13.1 pypi_0 pypi
fsspec 2024.2.0 pypi_0 pypi
huggingface-hub 0.20.3 pypi_0 pypi
idna 3.6 pypi_0 pypi
importlib-metadata 7.0.1 pyha770c72_0 conda-forge
importlib_metadata 7.0.1 hd8ed1ab_0 conda-forge
ipykernel 6.29.2 pyhd33586a_0 conda-forge
ipython 8.21.0 pyh707e725_0 conda-forge
jedi 0.19.1 pyhd8ed1ab_0 conda-forge
jinja2 3.1.3 pypi_0 pypi
jupyter_client 8.6.0 pyhd8ed1ab_0 conda-forge
jupyter_core 5.5.0 py312h06a4308_0
ld_impl_linux-64 2.38 h1181459_1
libffi 3.4.4 h6a678d5_0
libgcc-ng 11.2.0 h1234567_1
libgomp 11.2.0 h1234567_1
libsodium 1.0.18 h36c2ea0_1 conda-forge
libstdcxx-ng 11.2.0 h1234567_1
libuuid 1.41.5 h5eee18b_0
markupsafe 2.1.5 pypi_0 pypi
matplotlib-inline 0.1.6 pyhd8ed1ab_0 conda-forge
mpmath 1.3.0 pypi_0 pypi
ncurses 6.4 h6a678d5_0
nest-asyncio 1.6.0 pyhd8ed1ab_0 conda-forge
networkx 3.2.1 pypi_0 pypi
numpy 1.26.4 pypi_0 pypi
nvidia-cublas-cu12 12.1.3.1 pypi_0 pypi
nvidia-cuda-cupti-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-nvrtc-cu12 12.1.105 pypi_0 pypi
nvidia-cuda-runtime-cu12 12.1.105 pypi_0 pypi
nvidia-cudnn-cu12 8.9.2.26 pypi_0 pypi
nvidia-cufft-cu12 11.0.2.54 pypi_0 pypi
nvidia-curand-cu12 10.3.2.106 pypi_0 pypi
nvidia-cusolver-cu12 11.4.5.107 pypi_0 pypi
nvidia-cusparse-cu12 12.1.0.106 pypi_0 pypi
nvidia-nccl-cu12 2.19.3 pypi_0 pypi
nvidia-nvjitlink-cu12 12.3.101 pypi_0 pypi
nvidia-nvtx-cu12 12.1.105 pypi_0 pypi
openssl 3.0.13 h7f8727e_0
packaging 23.2 pyhd8ed1ab_0 conda-forge
parso 0.8.3 pyhd8ed1ab_0 conda-forge
pexpect 4.9.0 pyhd8ed1ab_0 conda-forge
pickleshare 0.7.5 py_1003 conda-forge
pillow 10.2.0 pypi_0 pypi
pip 24.0 pypi_0 pypi
platformdirs 4.2.0 pyhd8ed1ab_0 conda-forge
prompt-toolkit 3.0.42 pyha770c72_0 conda-forge
psutil 5.9.0 py312h5eee18b_0
ptyprocess 0.7.0 pyhd3deb0d_0 conda-forge
pure_eval 0.2.2 pyhd8ed1ab_0 conda-forge
pygments 2.17.2 pyhd8ed1ab_0 conda-forge
python 3.12.1 h996f2a0_0
python-dateutil 2.8.2 pyhd8ed1ab_0 conda-forge
pyyaml 6.0.1 pypi_0 pypi
pyzmq 25.1.2 py312h6a678d5_0
rdkit 2023.9.4 pypi_0 pypi
readline 8.2 h5eee18b_0
regex 2023.12.25 pypi_0 pypi
requests 2.31.0 pypi_0 pypi
safetensors 0.4.2 pypi_0 pypi
setuptools 68.2.2 py312h06a4308_0
six 1.16.0 pyh6c4a22f_0 conda-forge
sqlite 3.41.2 h5eee18b_0
stack_data 0.6.2 pyhd8ed1ab_0 conda-forge
sympy 1.12 pypi_0 pypi
t5chem 1.0.0 pypi_0 pypi
tk 8.6.12 h1ccaba5_0
tokenizers 0.15.1 pypi_0 pypi
torch 2.2.0 pypi_0 pypi
torchdata 0.7.1 pypi_0 pypi
torchtext 0.16.2 pypi_0 pypi
tornado 6.3.3 py312h5eee18b_0
tqdm 4.66.1 pypi_0 pypi
traitlets 5.14.1 pyhd8ed1ab_0 conda-forge
transformers 4.37.2 pypi_0 pypi
triton 2.2.0 pypi_0 pypi
typing_extensions 4.9.0 pyha770c72_0 conda-forge
tzdata 2023d h04d1e81_0
urllib3 2.2.0 pypi_0 pypi
wcwidth 0.2.13 pyhd8ed1ab_0 conda-forge
wheel 0.41.2 py312h06a4308_0
xz 5.4.5 h5eee18b_0
zeromq 4.3.5 h6a678d5_0
zipp 3.17.0 pyhd8ed1ab_0 conda-forge
zlib 1.2.13 h5eee18b_0
Hi,
For the first Colab issue, I'll address the installation problems when I have time. Thank you for reporting this. In the meantime, would you consider trying the Docker container? It's convenient and contains all the necessary dependencies.
Regarding the second issue, it appears that your transformers and torchtext packages are not compatible with each other. Both packages have introduced backward-incompatible changes (see here). You may need to install matching versions for the model to work properly.
I suggest trying the docker image, as all the packages are pre-installed. It's been a while since the publication of this model, so the dependencies are somewhat out-of-date ;P. I would greatly appreciate any pull requests that help address these compatibility issues.
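As a quick way to see what is actually installed, here is a minimal sketch that prints the versions of the packages discussed above. The "known good" pins are only the versions from the working conda environment posted further down in this thread (transformers 4.10.2, pytorch 1.12.0); they are not an official compatibility matrix, and the helper names `installed_versions` and `report` are made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version

# Assumption: pins copied from the working environment posted in this thread.
# They are a starting point, not an official list of supported versions.
KNOWN_GOOD = {"transformers": "4.10.2", "torch": "1.12.0"}


def installed_versions(packages):
    """Return {package name: installed version, or None if not installed}."""
    found = {}
    for name in packages:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


def report(packages=("transformers", "torch", "torchtext", "rdkit")):
    """Print one line per package and flag versions that differ from the pins."""
    versions = installed_versions(packages)
    for name, have in versions.items():
        want = KNOWN_GOOD.get(name)
        # startswith() tolerates local build suffixes such as "1.12.0+cu116".
        matches = want is None or (have is not None and have.startswith(want))
        note = "" if matches else f"  (thread used {want})"
        print(f"{name:<14}{have or 'not installed'}{note}")
    return versions


if __name__ == "__main__":
    report()
```

Running this inside the same environment (or Colab runtime) that raises the error shows at a glance whether transformers and torch are far from the versions that were reported to work.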
I'm not able to run the Docker image; it gives the error below. The Docker image is trying to fetch models from the Hugging Face repository. Thanks.
404 Client Error: Not Found for url: https://huggingface.co//work/models/pretrain/simple//resolve/main/config.json
Traceback (most recent call last):
  File "/usr/local/lib/python3.8/dist-packages/transformers-4.10.2-py3.8.egg/transformers/configuration_utils.py", line 524, in get_config_dict
    resolved_config_file = cached_path(
Hi,
I think huggingface is trying to load the model from an online resource. However, the model you used here is not available online, because I did not deploy it to the Hugging Face repository. The pretrained simple model (as well as the other models) and the datasets need to be downloaded separately.
For step-by-step directions on running the model with Docker, please check the instructions here:
https://hub.docker.com/repository/docker/hellojocelynlu/t5chem/general
Please also make sure that your --data_dir and --pretrain arguments point to the correct paths. Otherwise, huggingface will try to look the models up online, which is not the desired behavior.
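A minimal sketch of a pre-flight check for the --pretrain path, assuming the behavior suggested by the 404 above: when the given path is not an existing local directory, transformers treats the string as a Hub model id and requests config.json from huggingface.co. The function name `check_local_model_dir` is hypothetical, and the check only looks for config.json because that is the file named in the error URL.

```python
import os


def check_local_model_dir(path):
    """Return a list of problems with a local pretrained-model directory.

    An empty list means the directory exists and contains config.json.
    """
    problems = []
    if not os.path.isdir(path):
        problems.append(f"{path!r} is not an existing directory")
    elif not os.path.isfile(os.path.join(path, "config.json")):
        problems.append(f"{path!r} has no config.json")
    return problems


if __name__ == "__main__":
    # Path taken from the error URL in this thread; replace with your own.
    for problem in check_local_model_dir("/work/models/pretrain/simple/"):
        print("problem:", problem)
```

If this reports a problem inside the container, the model files were most likely not downloaded or not mounted at the expected location.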
I managed to build the env from scratch and use the develop branch, so that it can be run in python xxx.py style.
The following is my process:
setup env
conda create -n t5chem python==3.8
conda activate t5chem
conda install mkl==2023.0
conda install pytorch==1.12.0 torchvision==0.13.0 torchaudio==0.12.0 cudatoolkit=11.6 -c pytorch -c conda-forge
conda install transformers==4.10.2
conda install scikit-learn==0.24.1
conda install scipy==1.6.0
pip install rdkit==2023.9.5
conda install tensorboard
conda install pandas
conda install -c pytorch torchtext
setup t5chem code (use develop branch instead of main to get rid of import error as in #1)
git clone https://github.com/HelloJocelynLu/t5chem.git
cd t5chem
git checkout develop
cd ..
Now you should be able to use it like this:
# show version
python t5chem/t5chem/__main__.py -v
# train
python t5chem/t5chem/__main__.py train -h
# predict
python t5chem/t5chem/__main__.py predict -h
Thank you PoloWitty for the information!
Hi narayanamayya, Thomas recently assisted me in updating the t5chem codebase to ensure compatibility with the newer dependencies. Please feel free to test it out! https://github.com/tkella47/t5chem I have not personally tested it yet, but it is worth a try.
pip install git+https://github.com/tkella47/t5chem
Closing the issue due to inactivity.
----> 3 from t5chem import SimpleTokenizer
      4 model = T5ForConditionalGeneration.from_pretrained(model_path)
      5 tokenizer = SimpleTokenizer(vocab_file='t5chem/models/USPTO_500_MT/vocab.pt')
ImportError: cannot import name 'SimpleTokenizer' from 't5chem' (unknown location)
AttributeError: 'SimpleTokenizer' object has no attribute 'vocab'
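The "(unknown location)" suffix in the ImportError is what Python prints when the imported module has no file of its own. That typically happens with a namespace package, for example a directory named t5chem in the working directory (such as a fresh git clone) that has no __init__.py at its top level and shadows the installed package. Whether that is the cause here is an assumption; the sketch below only shows how to check what a given import name resolves to. The function name `describe_import` is made up for this example.

```python
import importlib.util


def describe_import(name):
    """Describe what the top-level import name currently resolves to."""
    spec = importlib.util.find_spec(name)
    if spec is None:
        return f"{name}: not importable"
    if spec.origin is None or spec.origin == "namespace":
        # Namespace package: a bare directory without __init__.py.
        locations = list(spec.submodule_search_locations or [])
        return f"{name}: namespace package at {locations}"
    return f"{name}: regular module at {spec.origin}"


if __name__ == "__main__":
    print(describe_import("t5chem"))
```

If the output says "namespace package" and points at a cloned repository folder, running the script from a different directory or installing the package from that folder with pip are two things worth trying.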