sustainable-processes / pura

Clean chemical data quickly
MIT License

Create functions to load reactions #36

Open marcosfelt opened 1 year ago

marcosfelt commented 1 year ago

We need a way for users to load in reaction data. Everything will be stored in inputs and product for actual use in transforms.

On reaction identifier formats: I think it's best to only support SMILES and RXN block, as these are what RDKit can support. Those are also supported by ChemDraw. I looked at RInChI, but the software that supports it has terrible documentation, and I'm not sure anyone uses it.

So given that, we'll create some tools for standard databases and then functions for generic supported formats:

On reaction SMILES (from @ad1arsh): There are 2 possible formats that I'm aware of: 

So, if we have a SMILES in the second format, we can allow the user to pass a list of reagents, catalysts and solvents; otherwise, everything in the e and f position gets classified as an agent. This means we need to add an extra class for Agent with options for null, catalyst, reagent, and solvent.
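
A rough sketch of what the Agent class and the classification step could look like. This is illustrative only and not pura's actual API: `AgentRole`, `Agent`, `split_reaction_smiles` and `classify_agents` are hypothetical names, and the sketch assumes the standard three-part reaction SMILES layout `reactants>agents>products`, which may or may not correspond to the formats mentioned above. It also matches user-supplied SMILES by exact string comparison; a real implementation would probably canonicalize first (for example with RDKit).

```python
from dataclasses import dataclass
from enum import Enum
from typing import Iterable, List, Tuple


class AgentRole(Enum):
    """Hypothetical role options for an agent; UNSPECIFIED is the 'null' case."""
    UNSPECIFIED = None
    CATALYST = "catalyst"
    REAGENT = "reagent"
    SOLVENT = "solvent"


@dataclass
class Agent:
    """Hypothetical container for one agent and its role."""
    smiles: str
    role: AgentRole = AgentRole.UNSPECIFIED


def split_reaction_smiles(rxn_smiles: str) -> Tuple[List[str], List[str], List[str]]:
    """Split 'reactants>agents>products' into three lists of molecule SMILES.

    Assumption: molecules are separated by '.', so multi-component species
    such as salts are split into their fragments.
    """
    parts = rxn_smiles.strip().split(">")
    if len(parts) != 3:
        raise ValueError(f"Expected exactly two '>' separators, got: {rxn_smiles!r}")
    groups = [[s for s in part.split(".") if s] for part in parts]
    return groups[0], groups[1], groups[2]


def classify_agents(
    agent_smiles: Iterable[str],
    catalysts: Iterable[str] = (),
    reagents: Iterable[str] = (),
    solvents: Iterable[str] = (),
) -> List[Agent]:
    """Assign a role to each agent from the optional user-supplied lists.

    Agents that appear in none of the lists keep the UNSPECIFIED (null) role.
    """
    lookup = {}
    for role, names in (
        (AgentRole.CATALYST, catalysts),
        (AgentRole.REAGENT, reagents),
        (AgentRole.SOLVENT, solvents),
    ):
        for name in names:
            lookup[name] = role
    return [Agent(s, lookup.get(s, AgentRole.UNSPECIFIED)) for s in agent_smiles]


# Example: esterification with sulfuric acid and THF written in the agent slot.
reactants, agents, products = split_reaction_smiles(
    "CC(=O)O.CCO>OS(=O)(=O)O.C1CCOC1>CCOC(C)=O"
)
classified = classify_agents(agents, solvents=["C1CCOC1"])
```

Here THF is tagged as a solvent because the user listed it, while sulfuric acid stays a generic agent with the null role since no list mentions it.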