open-reaction-database / ord-data

Official data repository for the Open Reaction Database
https://open-reaction-database.org
Creative Commons Attribution Share Alike 4.0 International

add amidation dataset #175

Closed by beef-broccoli 1 year ago

beef-broccoli commented 1 year ago

Dataset of an amide coupling reaction with various activators, bases, and solvents.

Associated with the publication: see Fig. 5 and the related text in the manuscript. Experimental details can be found in SI Section 10.

The reaction template, result spreadsheet, and dataset pbtxt files are also included.

submission 08-21-23.zip

connorcoley commented 1 year ago

Thanks!!

It is not essential to make this change, but the pre-stirring for 30 minutes before addition of amine could be represented by the addition_time field for the amine input. The addition_order is the typical thing to specify, as you've done, but there's even more detail that could be included with addition_time. That would let us set conditions_are_dynamic back to False.

The SI specifies "overnight", which in the template is 16 hours, while Figure 5 shows 24 hours -- do you know which it should be? Everything else looks perfect to me in terms of procedure. I only spot-checked a few structures.
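The addition_time suggestion above can be sketched in Python as follows. This is only an illustration, not the change made to the dataset: the helper name record_prestir and the input key "amine" are hypothetical, and the sketch assumes the reaction is an ord_schema reaction_pb2.Reaction message whose amine input exposes addition_time (a Time with value and units) and whose conditions expose conditions_are_dynamic.

```python
def record_prestir(reaction, input_key, minutes, minute_unit):
    """Record a delayed addition on one reaction input (hypothetical helper).

    `reaction` is assumed to be an ord_schema `reaction_pb2.Reaction` message,
    and `minute_unit` is assumed to be `reaction_pb2.Time.MINUTE`.
    """
    reaction_input = reaction.inputs[input_key]
    # Time after the start of the reaction at which this input is added.
    reaction_input.addition_time.value = minutes
    reaction_input.addition_time.units = minute_unit
    # With the timing captured on the input itself, the conditions no longer
    # need to be flagged as dynamic.
    reaction.conditions.conditions_are_dynamic = False
    return reaction


# Hypothetical usage, assuming ord_schema is installed and the amine input
# is stored under the key "amine":
#   from ord_schema.proto import reaction_pb2
#   reaction = reaction_pb2.Reaction()
#   record_prestir(reaction, "amine", 30, reaction_pb2.Time.MINUTE)
```

The unit is passed in by the caller rather than imported inside the helper, so the function itself does not depend on how ord_schema is installed.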

beef-broccoli commented 1 year ago

Thanks for the feedback! Addition time is changed for the amine, and reaction time is corrected to 24 hours (I guess our procedure was also not clear on reaction time).

One potential issue I spotted: for some of the products, maybe because of the conversion from InChI to SMILES, the amide structure sometimes shows up weird, with the double bond placed on the C-N bond. Only some of the structures have this issue. Is this something we want to fix? I can try to clean it up.

connorcoley commented 1 year ago

Oh that's a good call. Roundtripping an iminol through RDKit won't automatically convert it to the main amide tautomer. It might be nice to do that conversion now, when we know it's "safe", as opposed to expecting downstream users to. It will also help with search/etc.

You should be able to do this with...

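from rdkit import Chem
from rdkit.Chem.MolStandardize import rdMolStandardize

# smi is a product SMILES string that may be in the iminol form
mol = Chem.MolFromSmiles(smi)
# Convert to RDKit's canonical tautomer, then write it back out as SMILES
new_mol = rdMolStandardize.CanonicalTautomer(mol)
new_smi = Chem.MolToSmiles(new_mol)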

(assuming there are no other ambiguous/nonstandard substructures in the products)
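To apply the same conversion across a whole list of product SMILES and see which entries it touches, something like the sketch below might work. The function names are hypothetical and not part of RDKit or ord_schema; the RDKit import is done lazily so a different canonicalizer can be swapped in. Note that a changed string can also reflect plain SMILES canonicalization rather than a tautomer change, so the reported entries are worth reviewing by eye.

```python
from typing import Callable, Dict, Iterable


def rdkit_canonical_tautomer(smi: str) -> str:
    """Return the SMILES of RDKit's canonical tautomer for `smi`."""
    # Imported lazily so the rest of this sketch loads without RDKit.
    from rdkit import Chem
    from rdkit.Chem.MolStandardize import rdMolStandardize

    mol = Chem.MolFromSmiles(smi)
    if mol is None:
        raise ValueError(f"could not parse SMILES: {smi!r}")
    return Chem.MolToSmiles(rdMolStandardize.CanonicalTautomer(mol))


def changed_by_canonicalization(
    smiles: Iterable[str],
    canonicalize: Callable[[str], str] = rdkit_canonical_tautomer,
) -> Dict[str, str]:
    """Map each input SMILES to its converted form, keeping only those that differ."""
    changed = {}
    for smi in smiles:
        new_smi = canonicalize(smi)
        # A difference may be a tautomer change or just a rewritten SMILES string.
        if new_smi != smi:
            changed[smi] = new_smi
    return changed
```

Calling changed_by_canonicalization(product_smiles), where product_smiles is a hypothetical list of the dataset's product SMILES, returns only the entries whose SMILES string changed, which makes it easier to confirm that nothing other than the iminol-type structures was altered.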

beef-broccoli commented 1 year ago

All the amide structures are fixed! Thanks for the help

submission 08-31-23.zip