frnsys / retrosynthesis_planner

Retrosynthesis planner
GNU General Public License v3.0

Can't find eMolecules data 2018 #5

Open likholat opened 3 years ago

likholat commented 3 years ago

Hello!

I got an error when running the download.sh script. The files it tries to download no longer exist: https://github.com/frnsys/retrosynthesis_planner/blob/1715cb23987bf8a6a0daa626deddfd112a1d6b87/download.sh#L14-L15

If I try to use the data for 2020, I get the error:

Traceback (most recent call last):
  File "plan.py", line 24, in <module>
    smi = molvs.standardize_smiles(smi)
  File "/home/user/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/molvs/standardize.py", line 305, in standardize_smiles
    mol = Standardizer().standardize(mol)
  File "/home/user/anaconda3/envs/my-rdkit-env/lib/python3.6/site-packages/molvs/standardize.py", line 95, in standardize
    Chem.SanitizeMol(mol)
ValueError: Sanitization error: Explicit valence for atom # 1 Br, 2, is greater than permitted

Can I use the 2020 data, or is it not suitable? Do I need to change the script to work with the new data?

Links for the new data: https://downloads.emolecules.com/free/2020-12-01/version.smi.gz https://downloads.emolecules.com/orderbb/2020-12-01/version.smi.gz

quantaosun commented 3 years ago

For me, it worked to run download.sh line by line, after changing the molecule links to the newer ones.
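
For anyone hitting the same traceback with a newer snapshot, here is a rough sketch of one possible workaround, not something confirmed by the repo author. It builds the two download URLs from the pattern in the 2020 links above and skips any SMILES whose standardization raises `ValueError`, instead of letting one bad record stop the run. The names `snapshot_urls` and `standardize_valid` are hypothetical helpers, not part of plan.py. I am also assuming that dropping unsanitizable molecules is acceptable for your use case, and that the failure always surfaces as a `ValueError`, as it does in the traceback.

```python
from typing import Callable, Iterable, List, Tuple

BASE_URL = "https://downloads.emolecules.com"


def snapshot_urls(date: str) -> List[str]:
    """Build the two snapshot URLs for a date such as '2020-12-01'.

    Assumes other snapshots follow the same layout as the 2020 links
    quoted in this thread; that is not verified for every date.
    """
    return [f"{BASE_URL}/{kind}/{date}/version.smi.gz" for kind in ("free", "orderbb")]


def standardize_valid(
    smiles: Iterable[str],
    standardize: Callable[[str], str],
) -> Tuple[List[str], List[Tuple[str, str]]]:
    """Standardize each SMILES, collecting failures instead of raising.

    `standardize` is passed in by the caller (for example
    molvs.standardize_smiles), so this helper has no hard dependency.
    Returns (standardized SMILES, [(original SMILES, error message), ...]).
    """
    good: List[str] = []
    bad: List[Tuple[str, str]] = []
    for smi in smiles:
        try:
            good.append(standardize(smi))
        except ValueError as err:
            # The traceback above shows sanitization failures surfacing
            # as ValueError; record the molecule and keep going.
            bad.append((smi, str(err)))
    return good, bad
```

With molvs installed, this would be called as `standardize_valid(smiles_list, molvs.standardize_smiles)`. Checking the returned failure list shows how many molecules were dropped, which helps in judging whether the newer data is usable.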