sustainable-processes / pura

Clean chemical data quickly
MIT License

idea: add IUPAC name support via pybacting and OPSIN #44

Closed by egonw 1 year ago

egonw commented 1 year ago

With pybacting you have access to OPSIN which can convert many IUPAC names to structures:

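from pybacting import cdk
from pybacting import opsin

# OPSIN parses the IUPAC name into a molecule object
mol = opsin.parseIUPACName("butane")
# The CDK then converts that molecule to a SMILES string
smiles = cdk.calculateSMILES(mol)

marcosfelt commented 1 year ago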

Thanks for this! We actually already have OPSIN support implemented: https://github.com/sustainable-processes/pura/blob/main/pura/services/opsin.py

We just need better docs.
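
For readers wondering what a web-service-based name lookup can look like, here is a minimal sketch using only the Python standard library. It is not pura's actual implementation. The endpoint URL pattern and the `status` and `smiles` JSON fields are assumptions about the public OPSIN web service, and the function names are hypothetical.

```python
import json
from typing import Optional
from urllib.error import HTTPError
from urllib.parse import quote
from urllib.request import urlopen

# Assumed base URL of the public OPSIN web service.
OPSIN_URL = "https://opsin.ch.cam.ac.uk/opsin/"


def opsin_url(name: str) -> str:
    """Build the (assumed) JSON endpoint URL for a chemical name."""
    # Percent-encode everything, including spaces and slashes.
    return f"{OPSIN_URL}{quote(name, safe='')}.json"


def smiles_from_payload(payload: str) -> Optional[str]:
    """Extract the SMILES string from a JSON response body.

    Assumes the response carries "status" and "smiles" keys.
    Returns None when the name was not parsed successfully.
    """
    data = json.loads(payload)
    if data.get("status") != "SUCCESS":
        return None
    return data.get("smiles")


def name_to_smiles(name: str, timeout: float = 10.0) -> Optional[str]:
    """Look up a name over the network; None if the service rejects it."""
    try:
        with urlopen(opsin_url(name), timeout=timeout) as resp:
            return smiles_from_payload(resp.read().decode("utf-8"))
    except HTTPError:
        # Treat an HTTP error status as "name could not be resolved".
        return None
```

Calling `name_to_smiles("butane")` needs network access to the service, whereas the pybacting snippet above runs locally.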

egonw commented 1 year ago

Note that the above code runs OPSIN locally, so it does not require the OPSIN website.

marcosfelt commented 1 year ago

Very cool! I'm hesitant to add something with Java dependencies. I'd like to keep the number of dependencies to a minimum.

egonw commented 1 year ago

Yes, understood. JFTR, pybacting seems to run fine even on Google Colab. It seems that @cthoyt got it to work quite stably. But indeed, it adds more MBs to download.