MolecularAI / aizynthfinder

A tool for retrosynthetic planning
https://molecularai.github.io/aizynthfinder/
MIT License

Ringbreaker templates: both _apply_with_rdchiral and _apply_with_rdkit return an empty (null) reactants list #152

Closed (yangxiaofei77 closed this issue 3 months ago)

yangxiaofei77 commented 3 months ago

Hello @SGenheden, nice to meet you again.

When I try to use aizynthfinder to predict reactants for ringbreaker reactions, I run into the following issue.

1) I use aizynthtrain to prepare training data for ringbreaker reactions. Here, I set expand_ring=true and expand_hetero=true when generating templates for the ringbreaker reactions:

    def generate_templates(
        data: pd.DataFrame,
        radius: int,
        expand_ring: bool,
        expand_hetero: bool,
        ringbreaker_column: str,
        smiles_column: str,
    )

2) After training on the ringbreaker templates, I get a ringbreaker model. The retro_template values look like the following:

example 1: [C;H0;D4;+0:1]1-[NH;D2;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[O;H0;D2;+0:5]-1>>O=[C;H0;D3;+0:1].[NH2;D1;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[OH;D1;+0:5]

example 2: [CH2;D2;+0:1]1-[O;H0;D2;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[c;H0;D3;+0:5]:[c;H0;D3;+0:6]-1>>O=[CH2;D1;+0:1].[OH;D1;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[c;H0;D3;+0:5]:[cH;D2;+0:6]
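Each bracketed atom on the product side (left of `>>`) carries a hydrogen count (`H`) and a degree (`D`), and a template can only match a target whose ring atoms satisfy all of them. The following is a minimal sketch for listing those constraints so they can be compared with a target by hand. It is a simplified parser for the template style shown above, not a general SMARTS parser, and the function name `product_atom_constraints` is made up for illustration.

```python
import re
from typing import Dict, List

# Matches one bracketed atom, e.g. "[C;H0;D4;+0:1]".
ATOM_RE = re.compile(r"\[([^\]]+)\]")
# Leading primitive: element symbol plus optional inline H count ("C", "NH", "CH2", "cH").
HEAD_RE = re.compile(r"^([A-Z][a-z]?|[a-z])(?:H(\d*))?$")


def product_atom_constraints(retro_template: str) -> List[Dict[str, object]]:
    """List element, H count, and degree for each mapped atom on the product side."""
    product_side = retro_template.split(">>")[0]
    atoms = []
    for token in ATOM_RE.findall(product_side):
        body, sep, map_num = token.rpartition(":")
        if not sep or not map_num.isdigit():
            continue  # unmapped atom: skipped in this simplified sketch
        fields = body.split(";")
        head = HEAD_RE.match(fields[0])
        info = {
            "map": int(map_num),
            "element": head.group(1) if head else fields[0],
            "hydrogens": None,
            "degree": None,
        }
        # An inline "H" without a digit means exactly one hydrogen.
        if head and head.group(2) is not None:
            info["hydrogens"] = int(head.group(2) or 1)
        for field in fields[1:]:
            if field[:1] == "H" and field[1:].isdigit():
                info["hydrogens"] = int(field[1:])
            elif field[:1] == "D" and field[1:].isdigit():
                info["degree"] = int(field[1:])
        atoms.append(info)
    return atoms


EXAMPLE_1 = (
    "[C;H0;D4;+0:1]1-[NH;D2;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[O;H0;D2;+0:5]-1"
    ">>O=[C;H0;D3;+0:1].[NH2;D1;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[OH;D1;+0:5]"
)
constraints = product_atom_constraints(EXAMPLE_1)
for atom in constraints:
    print(atom)
```

For example 1, atom 1 comes out as a carbon with zero hydrogens and degree 4, so the template can only apply to a ring carbon that has two further heavy-atom neighbours besides the ring N and O.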

3) When predicting for a product containing a ring with the ringbreaker model, both _apply_with_rdchiral() and _apply_with_rdkit() return an empty (null) reactants list, so the prediction fails for products containing a ring.

4) I double-checked the possible_actions list. It contains more than 30 retro actions predicted by the model, and all of them have the same problem described in step 3: they return an empty reactants list.

@SGenheden, I have some questions regarding this issue:

1) What are the possible reasons for an empty reactants list in ringbreaker predictions? Could you give me some clues?

2) I suspect that two parameters used when preparing the ringbreaker templates may have an influence: expand_ring=true and expand_hetero=true. Should I try setting them to false?

3) Another reason, I guess, might be that the original ringbreaker reactions are of poor quality, which leads to this issue.

4) Also, one possible reason might be that the number of ringbreaker templates in my model is limited. As a result, although the model predicts some retro reactions, none of them can be applied to the provided target molecule.

@SGenheden (Samuel Genheden), any suggestions are appreciated. Many thanks.

Philip Yang

yangxiaofei77 commented 3 months ago

@SGenheden, one note: the trained ringbreaker model does work for some target molecules. For example, if I provide a target molecule with the SMILES CC1NCCO1, the model predicts a reactant list using the following template:

template: [CH3:1][CH:2]1[NH:3][CH2:4][CH2:5][O:6]1>>[CH3:1][CH:2]=[O:7].[NH2:3][CH2:4][CH2:5][OH:6]

and returns the predicted reactants, for example as shown below. In this case, what could be the reason why the ringbreaker model doesn't work for some other target molecules?

Thanks, Philip Yang

    ...
        "template": "[CH2;D2;+0:3]1-[CH2;D2;+0:4]-[O;H0;D2;+0:5]-[CH;D3;+0:1]-[NH;D2;+0:2]-1>>O=[CH;D2;+0:1].[NH2;D1;+0:2]-[CH2;D2;+0:3]-[CH2;D2;+0:4]-[OH;D1;+0:5]",
        "mapped_reaction_smiles": "[CH3:1][CH:2]1[NH:3][CH2:4][CH2:5][O:6]1>>[CH3:1][CH:2]=[O:7].[NH2:3][CH2:4][CH2:5][OH:6]"
    },
    "children": [
        {
            "type": "mol",
            "hide": false,
            "smiles": "CC=O",
            "is_chemical": true,
            "in_stock": false
        },
        {
            "type": "mol",
            "hide": false,
            "smiles": "NCCO",
            "is_chemical": true,
            "in_stock": true
        }
    ]
    ...
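To read the precursors and their stock status out of a node like this, something along the following lines may be enough. This is only a sketch: the excerpt is truncated, so the enclosing `"type": "reaction"` and `"metadata"` keys used here are assumptions, and the helper names `precursors` and `missing_from_stock` are hypothetical.

```python
import json
from typing import List, Tuple

# Reaction node rebuilt from the excerpt above. The outer "type" and
# "metadata" keys are ASSUMED; the original output is truncated at that point.
NODE_JSON = """
{
  "type": "reaction",
  "metadata": {
    "mapped_reaction_smiles": "[CH3:1][CH:2]1[NH:3][CH2:4][CH2:5][O:6]1>>[CH3:1][CH:2]=[O:7].[NH2:3][CH2:4][CH2:5][OH:6]"
  },
  "children": [
    {"type": "mol", "hide": false, "smiles": "CC=O", "is_chemical": true, "in_stock": false},
    {"type": "mol", "hide": false, "smiles": "NCCO", "is_chemical": true, "in_stock": true}
  ]
}
"""


def precursors(node: dict) -> List[Tuple[str, bool]]:
    """Return (SMILES, in_stock) for every molecule child of a reaction node."""
    return [
        (child["smiles"], bool(child.get("in_stock", False)))
        for child in node.get("children", [])
        if child.get("type") == "mol"
    ]


def missing_from_stock(node: dict) -> List[str]:
    """Return the SMILES of precursors that are not marked as in stock."""
    return [smiles for smiles, in_stock in precursors(node) if not in_stock]


node = json.loads(NODE_JSON)
print(precursors(node))
print(missing_from_stock(node))
```

In this excerpt the template did fire, but one of the two precursors (CC=O) is marked as not in stock.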

SGenheden commented 3 months ago

Hello, thanks for your question.

The expand_ring and expand_hetero arguments turn on the new logic for extracting ringbreaking templates that was introduced in the AiZynthTrain paper (https://pubs.acs.org/doi/10.1021/acs.jcim.2c01486). They are not strictly required, but they are warmly recommended: we believe this logic is essential for the improved performance of the ringbreaker model.

I am unsure whether you mean that the code raises an exception while applying your templates, or that you simply don't get any reactants. The latter can of course happen if none of the top-ranked templates is actually applicable to the product. That could indicate that the model hasn't been trained sufficiently, but you could also just be unlucky.

The templates you provided look alright at first glance, but to dig further I would need a product SMILES for which the templates do not produce reactants although you believe they should.

Hope this helps to straighten out some question marks.

yangxiaofei77 commented 3 months ago

@SGenheden, thanks for your answer. This helps. I will keep trying to improve the performance of the ringbreaker model.

Regards, Philip Yang