connorcoley / rdchiral

Wrapper for RDKit's RunReactants to improve stereochemistry handling
MIT License
151 stars, 50 forks

unexpected template extracted #15

Open acquaregia opened 4 years ago

acquaregia commented 4 years ago

Hi, I have a strange case where the template extractor gives unexpected results. I am cleaning up my reaction database and using the template extractor to spot low-quality data. In this case I cannot tell whether the problem lies in the reaction itself, in the atom mapper, or in the template extractor.

I start with this transformation (a double reduction):

ClC(=C1C2C=CC1C3C2C(=O)C=CC3=O)Cl>>ClC(=C1C2CCC1C3C2C(=O)CCC3=O)Cl

I mapped it with Indigo, because NameRxn fails to classify it and therefore gives no mapping:

[O:1]=[C:2]1[CH:16]2CH:7[CH:11]2[CH:10]=[CH:9]3)C:5[CH:4]=[CH:3]1>>[O:1]=[C:2]1[CH:16]2CH:7[CH:11]2[CH2:10][CH2:9]3)C:5[CH2:4][CH2:3]1

From it I obtain an unexpected template (forward direction):

[C:1]=[C:2]1-[C:3]-[CH;D2;+0:4]=[CH;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH;D2;+0:9]=[CH;D2;+0:10]-[C:11]=[O;D1;H0:12]>>[C:1]=[C:2]1-[C:3]-[CH2;D2;+0:4]-[CH2;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH2;D2;+0:9]-[CH2;D2;+0:10]-[C:11]=[O;D1;H0:12]

The template has two reactants and two products. Is the mapping wrong, or am I just missing H-H as a reactant? Thanks a lot for your help. Marco

connorcoley commented 4 years ago

That mapped reaction isn't a valid SMILES string -- are you sure that's what Indigo output? Using a correctly mapped version of the reaction:

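import template_extractor

# Atom-mapped reaction SMILES: reduction of cyclohex-2-en-1-one to cyclohexanone
smi = '[O:1]=[C:7]1[CH2:2][CH2:3][CH2:4][CH:5]=[CH:6]1>>[O:1]=[C:7]1[CH2:2][CH2:3][CH2:4][CH2:5][CH2:6]1'
# Reaction dict passed to the extractor: mapped reactants, mapped products, and an id
reaction = {
    'reactants': smi.split('>')[0],
    'products': smi.split('>')[-1],
    '_id': None,
}
template = template_extractor.extract_from_reaction(reaction)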

gives

{'products': '[C:1]-[CH2;D2;+0:2]-[CH2;D2;+0:3]-[C:4]=[O;D1;H0:5]', 'reactants': '[C:1]-[CH;D2;+0:2]=[CH;D2;+0:3]-[C:4]=[O;D1;H0:5]', 'reaction_smarts': '[C:1]-[CH2;D2;+0:2]-[CH2;D2;+0:3]-[C:4]=[O;D1;H0:5]>>[C:1]-[CH;D2;+0:2]=[CH;D2;+0:3]-[C:4]=[O;D1;H0:5]', 'intra_only': True, 'dimer_only': False, 'reaction_id': None, 'necessary_reagent': ''}
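A note on reading this output: the `reaction_smarts` field is written product side first, that is, it equals `products + '>>' + reactants`, whereas the template quoted in the question is written in the forward direction. The sketch below checks this using only the dictionary printed above (three of its fields, copied verbatim). The helper names `is_retro_oriented` and `forward_smarts` are hypothetical and not part of rdchiral.

```python
# Fields copied from the output above; the other keys are not needed here.
template = {
    'products': '[C:1]-[CH2;D2;+0:2]-[CH2;D2;+0:3]-[C:4]=[O;D1;H0:5]',
    'reactants': '[C:1]-[CH;D2;+0:2]=[CH;D2;+0:3]-[C:4]=[O;D1;H0:5]',
    'reaction_smarts': '[C:1]-[CH2;D2;+0:2]-[CH2;D2;+0:3]-[C:4]=[O;D1;H0:5]'
                       '>>[C:1]-[CH;D2;+0:2]=[CH;D2;+0:3]-[C:4]=[O;D1;H0:5]',
}


def is_retro_oriented(tmpl):
    """Return True if reaction_smarts reads as products>>reactants."""
    left, right = tmpl['reaction_smarts'].split('>>')
    return left == tmpl['products'] and right == tmpl['reactants']


def forward_smarts(tmpl):
    """Rebuild the template in the forward (reactants>>products) direction."""
    return tmpl['reactants'] + '>>' + tmpl['products']


print(is_retro_oriented(template))
print(forward_smarts(template))
```

So when comparing an extracted template with a forward-written one, the two sides of `reaction_smarts` need to be swapped first.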

Using a mapped version of your double reduction...

smi2 = '[Cl:3]/[C:4]([Cl:16])=[C:5]1[CH:6]2[CH:7]=[CH:8][CH:9]\\1[CH:10]3[CH:11]2[C:12]([CH:13]=[CH:14][C:15]3=[O:1])=[O:2]>>[Cl:3]/[C:4]([Cl:16])=[C:5]4[CH:9]5[CH2:8][CH2:7][CH:6]\\4[CH:11]6[CH:10]5[C:15]([CH2:14][CH2:13][C:12]6=[O:2])=[O:1]'
# Same dict layout as before, now for the mapped double reduction
reaction2 = {
    'reactants': smi2.split('>')[0],
    'products': smi2.split('>')[-1],
    '_id': None,
}
template2 = template_extractor.extract_from_reaction(reaction2)

gives

{'products': '[C:1]=[C:2]1-[C:3]-[CH2;D2;+0:4]-[CH2;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH2;D2;+0:9]-[CH2;D2;+0:10]-[C:11]=[O;D1;H0:12]', 'reactants': '[C:1]=[C:2]1-[C:3]-[CH;D2;+0:4]=[CH;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH;D2;+0:9]=[CH;D2;+0:10]-[C:11]=[O;D1;H0:12]', 'reaction_smarts': '[C:1]=[C:2]1-[C:3]-[CH2;D2;+0:4]-[CH2;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH2;D2;+0:9]-[CH2;D2;+0:10]-[C:11]=[O;D1;H0:12]>>[C:1]=[C:2]1-[C:3]-[CH;D2;+0:4]=[CH;D2;+0:5]-[C:6]-1.[O;D1;H0:7]=[C:8]-[CH;D2;+0:9]=[CH;D2;+0:10]-[C:11]=[O;D1;H0:12]', 'intra_only': True, 'dimer_only': False, 'reaction_id': None, 'necessary_reagent': ''}
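On the "two reactants and two products" question, one plausible reading is that the `.` in each side of the template separates disconnected pattern fragments, not separate molecules: the two C=C bonds being reduced are not adjacent, so each reaction center contributes its own fragment, and `'intra_only': True` in the output is consistent with both fragments belonging to one molecule. The sketch below only counts the fragments in the strings printed above. `count_fragments` is a hypothetical helper, and splitting on `.` outside atom brackets is an assumption that holds for these templates, not a general SMARTS parser.

```python
def count_fragments(smarts):
    """Count '.'-separated fragments, ignoring any '.' inside [...] atom brackets."""
    in_bracket = False
    count = 1
    for ch in smarts:
        if ch == '[':
            in_bracket = True
        elif ch == ']':
            in_bracket = False
        elif ch == '.' and not in_bracket:
            count += 1
    return count


# 'reactants' and 'products' copied from the output above.
reactants2 = ('[C:1]=[C:2]1-[C:3]-[CH;D2;+0:4]=[CH;D2;+0:5]-[C:6]-1'
              '.[O;D1;H0:7]=[C:8]-[CH;D2;+0:9]=[CH;D2;+0:10]-[C:11]=[O;D1;H0:12]')
products2 = ('[C:1]=[C:2]1-[C:3]-[CH2;D2;+0:4]-[CH2;D2;+0:5]-[C:6]-1'
             '.[O;D1;H0:7]=[C:8]-[CH2;D2;+0:9]-[CH2;D2;+0:10]-[C:11]=[O;D1;H0:12]')

# Single-reduction template from the first example, for comparison.
reactants1 = '[C:1]-[CH;D2;+0:2]=[CH;D2;+0:3]-[C:4]=[O;D1;H0:5]'

print(count_fragments(reactants1), count_fragments(reactants2), count_fragments(products2))
```

The correctly mapped reaction gives the same two-fragment template as the one in the question, which suggests the template itself is not the symptom of a mapping error here.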

acquaregia commented 4 years ago

Thanks, I will double-check the output of Indigo.

cheers, m
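For that kind of double check, a rough pre-filter can catch strings like the mapped reaction quoted in the question before they reach the extractor. The sketch below is stdlib-only and purely heuristic: it looks for unbalanced parentheses, unbalanced atom brackets, and a `:` outside an atom bracket, which in a Kekulé-style mapped SMILES usually means a bracket was lost (in general `:` is also a valid aromatic bond symbol). `smiles_syntax_problems` is a hypothetical helper; parsing with RDKit (`Chem.MolFromSmiles` returns `None` on failure) remains the real test.

```python
def smiles_syntax_problems(smiles):
    """Return a list of heuristic syntax problems found in one SMILES string."""
    problems = []
    paren_depth = 0
    in_bracket = False
    for i, ch in enumerate(smiles):
        if ch == '[':
            if in_bracket:
                problems.append(f"nested '[' at position {i}")
            in_bracket = True
        elif ch == ']':
            if not in_bracket:
                problems.append(f"unmatched ']' at position {i}")
            in_bracket = False
        elif in_bracket:
            continue  # anything else inside an atom bracket is not checked here
        elif ch == '(':
            paren_depth += 1
        elif ch == ')':
            if paren_depth == 0:
                problems.append(f"unmatched ')' at position {i}")
            else:
                paren_depth -= 1
        elif ch == ':':
            problems.append(f"':' outside an atom bracket at position {i}")
    if in_bracket:
        problems.append("unclosed '['")
    if paren_depth:
        problems.append(f"{paren_depth} unclosed '('")
    return problems


# Reactant side of the mapped string quoted in the question, copied as posted.
bad = '[O:1]=[C:2]1[CH:16]2CH:7[CH:11]2[CH:10]=[CH:9]3)C:5[CH:4]=[CH:3]1'
# Reactant side of the correctly mapped cyclohexenone example.
good = '[O:1]=[C:7]1[CH2:2][CH2:3][CH2:4][CH:5]=[CH:6]1'

for problem in smiles_syntax_problems(bad):
    print(problem)
print(smiles_syntax_problems(good))
```

A string that passes this filter can still be chemically invalid, so it only serves to flag obviously damaged records early.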