rxn4chemistry / rxnmapper

RXNMapper: Unsupervised attention-guided atom-mapping. Code complementing our Science Advances publication on "Extraction of organic chemistry grammar from unsupervised learning of chemical reactions" (https://advances.sciencemag.org/content/7/15/eabe4166).
http://rxnmapper.ai
MIT License

Error when installing from PIP #10

Closed villalbamartin closed 3 years ago

villalbamartin commented 3 years ago

I have followed the installation instructions using pip, with some minimal changes:

# Small PIP downgrade to solve this bug: https://stackoverflow.com/a/26372051
conda create -n rxnmapper python=3.6 pip=20.2.1
conda activate rxnmapper
conda install -c rdkit rdkit=2020.03.3.0
pip install rxnmapper
# Specific version of Torch for my GPU
conda install pytorch torchvision cudatoolkit=10.1 -c pytorch

I then ran the sample code provided in the "Basic Usage" section, saved in a file called test.py:

from rxnmapper import RXNMapper
rxn_mapper = RXNMapper()
rxns = ['CC(C)S.CN(C)C=O.Fc1cccnc1F.O=C([O-])[O-].[K+].[K+]>>CC(C)Sc1ncccc1F', 'C1COCCO1.CC(C)(C)OC(=O)CONC(=O)NCc1cccc2ccccc12.Cl>>O=C(O)CONC(=O)NCc1cccc2ccccc12']
results = rxn_mapper.get_attention_guided_atom_maps(rxns)

But this yielded the following error:

Traceback (most recent call last):
  File "test.py", line 2, in <module>
    rxn_mapper = RXNMapper()
  File "/home/villalbamartin/.conda/envs/rxnmapper/lib/python3.6/site-packages/rxnmapper/core.py", line 65, in __init__
    self.model, self.tokenizer = self._load_model_and_tokenizer()
  File "/home/villalbamartin/.conda/envs/rxnmapper/lib/python3.6/site-packages/rxnmapper/core.py", line 92, in _load_model_and_tokenizer
    vocab_path, max_len=model.config.max_position_embeddings
  File "/home/villalbamartin/.conda/envs/rxnmapper/lib/python3.6/site-packages/rxnmapper/tokenization_smiles.py", line 45, in __init__
    self.max_len_single_sentence = self.max_len - 2
AttributeError: 'SmilesTokenizer' object has no attribute 'max_len'

I believe this error is due to changes in the transformers library. pip freeze shows that the installed version is 4.1.1, but the max_len property was only present in tokenizers up to version 3.5.1 - here's a link to the deprecation notice. For backwards-compatibility reasons, some traces of it remain in the current transformers library.

The reason I emphasize "believe" is that installing from GitHub succeeded, even though it pins transformers to 4.0.0, which should exhibit the same issue but, in my case, doesn't. Perhaps a pre-trained model with the old property is being loaded in one situation and not in the other, but I haven't dug deep enough to be certain.
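For anyone hitting this before a fixed release lands, the incompatibility can be worked around by resolving the length attribute defensively. The sketch below is a hedged illustration of that fallback pattern, not RXNMapper's actual fix; the two tokenizer classes are hypothetical stand-ins for transformers before and after the rename:

```python
class LegacyTokenizer:
    """Stand-in for a transformers < 4.0 tokenizer, which exposed `max_len`."""
    max_len = 512


class ModernTokenizer:
    """Stand-in for a transformers >= 4.0 tokenizer, where the attribute
    was replaced by `model_max_length`."""
    model_max_length = 512


def resolve_max_len(tokenizer):
    """Return the maximum sequence length regardless of transformers version.

    Prefers the deprecated `max_len` attribute and falls back to the newer
    `model_max_length` name, so the same code runs against both APIs.
    """
    length = getattr(tokenizer, "max_len", None)
    if length is None:
        length = tokenizer.model_max_length
    return length


print(resolve_max_len(LegacyTokenizer()))   # -> 512
print(resolve_max_len(ModernTokenizer()))   # -> 512
```

With a shim like this in place of the direct `self.max_len` access in tokenization_smiles.py, the same tokenizer code would work against either transformers version.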

bhoov commented 3 years ago

Thanks for bringing this up! It seems that we forgot to push the latest changes up to PyPI. Should work now!

queliyong commented 3 years ago

> Thanks for bringing this up! It seems that we forgot to push the latest changes up to PyPI. Should work now!

No, it does not work yet.

bhoov commented 3 years ago

Can you check which version of RXNMapper was installed through pip (pip freeze | grep rxnmapper)?

queliyong commented 3 years ago

> Can you check what version of RXNMapper was installed through pip? (pip freeze | grep rxnmapper)

pip install works when I switch to Linux instead of Windows 10.

bhoov commented 3 years ago

Interesting. If you open a new issue documenting the bugs you are experiencing on Windows 10 (we have not tested on that OS), we can discuss them there.