Louchaofeng / IDL-PPBopt

Code for "IDL-PPBopt: A Strategy for Prediction and Optimization of Human Plasma Protein Binding of Compounds via an Interpretable Deep Learning Method"
GNU General Public License v3.0

Not predict the PPB fraction for some molecules #1

Open carcablop opened 1 year ago

carcablop commented 1 year ago

Hello. I am trying to use this work to predict PPB fractions. My input is the file "eml_canonicals.csv" (eml_canonical.csv), which contains the SMILES for which I want to calculate PPB fractions. However, the model does not return a PPB fraction for some of the SMILES. They are the following:

[CaH2]
[F-]
[I]
O
[Cl-].[K+]
[I-].[K+]
S
N.N.[Ag+].[F-]
[Cl-].[Na+]

I would like to ask: why is it not possible to calculate PPB values for these molecules, and why are they left out of the final output? If you like, you can take a look at the work we are doing at Ersilia to incorporate this model: https://github.com/ersilia-os/ersilia/issues/590. This comment in that issue is where we are following up on this model: https://github.com/ersilia-os/ersilia/issues/590#issuecomment-1433231033

Thanks in advance. I would appreciate an answer, so that we have a clear justification and can consider what alternatives exist for handling these molecules.

Louchaofeng commented 1 year ago

Hello.

Thank you for your interest in my research. I will answer your questions from two perspectives:

1) From an algorithmic perspective, the initial bond vector could not be generated for the compounds that the AFP model failed to predict. More specifically, we call the GetBond() function from RDKit to calculate the bond features, but it failed to detect any bonds in these compounds. You can learn more about the AFP model from this paper (https://doi.org/10.1021/acs.jmedchem.9b00959).

2) From the perspective of the model's applicability domain, the AFP model was built using a training set free of inorganics and salts. Therefore, even if the model could produce predictions for these compounds, the results would be inaccurate. My study focuses on drug development and therefore does not cover the prediction of inorganics and salts.
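The two points above suggest a simple pre-check that can be run on an input file before prediction. The following is a minimal sketch, not code from this repository: the names `skip_reason`, `inspect_smiles`, and `report` are hypothetical, and the three rules (no bonds, more than one fragment, no carbon) are an assumed approximation of "no bond features" and "inorganics and salts", not the model's actual filter. It only uses RDKit calls that parse a SMILES string and count bonds, fragments, and atoms.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SmilesReport:
    smiles: str
    n_bonds: int       # bonds between explicit atoms in the parsed molecule
    n_fragments: int   # disconnected components, e.g. 2 for "[Cl-].[Na+]"
    has_carbon: bool


def skip_reason(n_bonds: int, n_fragments: int, has_carbon: bool) -> Optional[str]:
    """Return why a molecule is likely unsuitable, or None if it passes.

    These rules are assumptions for illustration, not the model's own logic.
    """
    if n_bonds == 0:
        return "no bonds, so no bond features can be built"
    if n_fragments > 1:
        return "multiple fragments (likely a salt or mixture)"
    if not has_carbon:
        return "no carbon atoms (likely inorganic)"
    return None


def inspect_smiles(smiles: str) -> Optional[SmilesReport]:
    """Parse a SMILES string with RDKit; return None if it cannot be parsed."""
    # Imported here so that skip_reason() stays usable without RDKit installed.
    from rdkit import Chem

    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    return SmilesReport(
        smiles=smiles,
        n_bonds=mol.GetNumBonds(),
        n_fragments=len(Chem.GetMolFrags(mol)),
        has_carbon=any(atom.GetAtomicNum() == 6 for atom in mol.GetAtoms()),
    )


def report(smiles_list: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Pair each SMILES with a skip reason (None means it passed the checks)."""
    results = []
    for smiles in smiles_list:
        info = inspect_smiles(smiles)
        if info is None:
            results.append((smiles, "could not be parsed"))
        else:
            reason = skip_reason(info.n_bonds, info.n_fragments, info.has_carbon)
            results.append((smiles, reason))
    return results
```

Calling `report(["[CaH2]", "O", "[Cl-].[Na+]"])` would flag entries like the ones listed in this issue, since single atoms and isolated ions contain no bonds. Note that the multi-fragment rule would also flag organic drug salts such as hydrochlorides, so stripping the counter-ion first may be preferable in that case.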

Hope this helps you.

Chaofeng Lou

------------------ Original Message ------------------ From: "Louchaofeng/IDL-PPBopt" @.>; Sent: Thursday, February 16, 2023, 11:39 PM @.>; @.***>; Subject: [Louchaofeng/IDL-PPBopt] Not predict the PPB fraction for some molecules (Issue #1)


carcablop commented 1 year ago

Hello @Louchaofeng.

Thanks a lot for the answer. The information is very clear and very useful to us. We will take it into account when interpreting the model output.

Thank you.

Carolina Caballero