dataprofessor / code

Compilation of R and Python programming codes on the Data Professor YouTube channel.
http://youtube.com/dataprofessor

Query regarding practical use of the Delaney regression model and how well it will predict solubility values for new drugs. #3

Open adithyaan-creator opened 3 years ago

adithyaan-creator commented 3 years ago

@dataprofessor How well will the current regression model perform on new drugs? From your perspective, on what kinds of data points (new chemicals) will the model perform badly?

dataprofessor commented 3 years ago

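Great question. To answer it, we need to perform an "applicability domain" analysis. One way to do this is with a PCA scores plot: project the training set compounds and the new compound onto the same principal components, then check whether the new compound falls within the boundaries of the training set. Predictions for compounds that fall outside those boundaries should be treated as less reliable.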
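As a rough illustration of that check, here is a minimal sketch in Python using only NumPy. It is not the code from the channel: the function names `fit_pca` and `in_domain` are hypothetical, the descriptor values are synthetic stand-ins for real molecular descriptors, and a bounding box in the space of the first two principal components is just one simple way to define "within the boundaries".

```python
import numpy as np

def fit_pca(X_train, n_components=2):
    """Fit PCA on standardized training descriptors using SVD."""
    mean = X_train.mean(axis=0)
    std = X_train.std(axis=0)
    std[std == 0] = 1.0  # guard against constant descriptor columns
    Z = (X_train - mean) / std
    # Rows of Vt are the principal axes, ordered by explained variance.
    _, _, Vt = np.linalg.svd(Z, full_matrices=False)
    components = Vt[:n_components]
    scores = Z @ components.T
    return mean, std, components, scores

def in_domain(x_new, mean, std, components, train_scores):
    """Return True if the new compound's PCA scores lie inside the
    bounding box spanned by the training set scores."""
    z = (np.asarray(x_new, dtype=float) - mean) / std
    s = z @ components.T
    lo = train_scores.min(axis=0)
    hi = train_scores.max(axis=0)
    return bool(np.all((s >= lo) & (s <= hi)))

# Synthetic, correlated "descriptors" (assumption: 60 compounds, 4 columns).
rng = np.random.default_rng(0)
latent = rng.normal(size=(60, 1))
X_train = latent * np.array([1.0, 2.0, 0.5, 1.5]) + 0.1 * rng.normal(size=(60, 4))

mean, std, components, scores = fit_pca(X_train)

typical = X_train.mean(axis=0)                        # centre of the training set
unusual = X_train.mean(axis=0) + 25 * X_train.std(axis=0)  # far from all training data

print("typical compound in domain:", in_domain(typical, mean, std, components, scores))
print("unusual compound in domain:", in_domain(unusual, mean, std, components, scores))
```

A bounding box is a crude boundary; plotting `scores` and the new compound's scores together gives the visual version of the same check.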

adithyaan-creator commented 3 years ago

Where else in the drug discovery pipeline can I use machine learning algorithms? I saw one example that uses an RNN to generate SMILES notation.

dataprofessor commented 3 years ago

That's a great question. There are many use cases, and generating SMILES notation is indeed one of them. One can also apply ML to explore the entire proteome and perform network analysis to visualize complex protein-protein interactions. Another use case is drug repurposing: finding existing drugs that could treat a new disease.