connorcoley / rdchiral

Wrapper for RDKit's RunReactants to improve stereochemistry handling
MIT License
151 stars, 50 forks

False template extracted #30

Closed. queliyong closed this issue 2 years ago

queliyong commented 2 years ago

Hi, I extracted this reaction's retro template, as follows: '[OH;D1;+0:2]-[c:1].[OH;D1;+0:4]-[c:3]>>[c:1]-[O;H0;D2;+0:2]-C-c1:c:c:c:c:c:1.[c:3]-[O;H0;D2;+0:4]-C-c1:c:c:c:c:c:1'

This is clearly not the template I want. I want a retro template with only one reactant and one product. How can I get it? Thank you very much.

The reaction smiles is 'CC(C)(C)C1=CC(C2=C(CCCC3)C3=CC4=C2CCCC4)=C(OCC5=CC=CC=C5)C(C6=CC(C(C)(C)C)=CC=C6OC[C@@H]7CCCC[C@@H]7COC8=CC=C(C(C)(C)C)C=C8C9=CC(C(C)(C)C)=CC(C%10=C(CCCC%11)C%11=CC%12=C%10CCCC%12)=C9OCC%13=CC=CC=C%13)=C1>>OC1=C(C2=C(CCCC3)C3=CC4=C2CCCC4)C=C(C(C)(C)C)C=C1C5=CC(C(C)(C)C)=CC=C5OC[C@@H]6CCCC[C@@H]6COC7=CC=C(C(C)(C)C)C=C7C8=CC(C(C)(C)C)=CC(C9=C(CCCC%10)C%10=CC%11=C9CCCC%11)=C8O'

connorcoley commented 2 years ago

Hi @queliyong ,

This is the desired behavior of RDChiral in this case, because the reaction involves two simultaneous reaction steps. However, it might not be appropriate for your use case. If you're comfortable with Python, I would recommend defining a post-processing step that checks whether a template consists of multiple fragments on the product side (by looking for the period used to concatenate fragments). If it does, split the product side of the template and try to match the atom map numbers of each fragment with fragments on the reactant side. If the fragments can be decoupled, you will end up with two templates from the one reaction SMILES rather than just one. If necessary, you can then compare your final set of templates with the atom mapping removed to de-duplicate them.
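A minimal sketch of such a post-processing step, written with plain string handling rather than any RDChiral or RDKit API. The function names (`split_fragments`, `decouple_template`, `strip_maps`, `deduplicate`) are hypothetical and introduced only for illustration. It assumes fragments are separated by a top-level '.', that atom map numbers appear as ':N]' at the end of a bracket atom, and that a purely textual comparison of unmapped templates is good enough for de-duplication (equivalent templates written differently would not be merged).

```python
import re

# Atom map numbers look like ':12]' at the end of a bracket atom.
# Aromatic bond colons such as 'c:c' are never followed by digits and ']'.
MAP_RE = re.compile(r":(\d+)\]")


def split_fragments(side):
    """Split one side of a template on '.' that is not nested in () or []."""
    frags, depth, start = [], 0, 0
    for i, ch in enumerate(side):
        if ch in "([":
            depth += 1
        elif ch in ")]":
            depth -= 1
        elif ch == "." and depth == 0:
            frags.append(side[start:i])
            start = i + 1
    frags.append(side[start:])
    return frags


def decouple_template(template):
    """Group fragments that share atom map numbers; one template per group.

    Returns [template] unchanged if the fragments cannot be decoupled.
    """
    lhs, rhs = template.split(">>")
    groups = []  # groups always have pairwise disjoint map-number sets
    for side, text in (("lhs", lhs), ("rhs", rhs)):
        for frag in split_fragments(text):
            new = {"lhs": [], "rhs": [], "maps": set(MAP_RE.findall(frag))}
            new[side].append(frag)
            keep = []
            for g in groups:
                if g["maps"] & new["maps"]:
                    # Shared map numbers: merge the older group into this one.
                    new = {
                        "lhs": g["lhs"] + new["lhs"],
                        "rhs": g["rhs"] + new["rhs"],
                        "maps": g["maps"] | new["maps"],
                    }
                else:
                    keep.append(g)
            groups = keep + [new]
    # Every group needs at least one fragment on each side to be a template.
    if len(groups) < 2 or any(not g["lhs"] or not g["rhs"] for g in groups):
        return [template]
    return [".".join(g["lhs"]) + ">>" + ".".join(g["rhs"]) for g in groups]


def strip_maps(template):
    """Remove atom map numbers so templates can be compared as text."""
    return MAP_RE.sub("]", template)


def deduplicate(templates):
    """Keep the first template for each distinct unmapped string."""
    seen, unique = set(), []
    for t in templates:
        key = strip_maps(t)
        if key not in seen:
            seen.add(key)
            unique.append(t)
    return unique


template = (
    "[OH;D1;+0:2]-[c:1].[OH;D1;+0:4]-[c:3]>>"
    "[c:1]-[O;H0;D2;+0:2]-C-c1:c:c:c:c:c:1."
    "[c:3]-[O;H0;D2;+0:4]-C-c1:c:c:c:c:c:1"
)
for t in deduplicate(decouple_template(template)):
    print(t)
```

For the template in this issue, the two halves share no map numbers, so they separate into two single-step templates that are identical once the map numbers are stripped, leaving one template after de-duplication. A template whose fragments are linked through shared map numbers, or that contains an unmapped fragment on only one side, is returned unchanged.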

queliyong commented 2 years ago

Thanks a lot for your advice. I will try it.