ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

Outreachy Documentation Project: Jaya Gupta #140

Closed Jaya3112 closed 2 years ago

Jaya3112 commented 2 years ago

Applicant: @Jaya3112

Welcome to the Ersilia Open Source Initiative. This issue will serve to track all your contributions for the project “Improve the documentation and outreach material of the Ersilia Model Hub”.

Please tick the tasks as you complete them. You do not need to complete every task to make a final application; only the Initial Steps and Community sections are REQUIRED. The tasks are not ordered by importance; they simply relate to different skills. Start where you feel most comfortable. This project can be adapted to the applicant's interests, so please focus on the type of tasks that you prefer, are most skilled at, or would like to work on as an intern.


Initial steps:

Jaya3112 commented 2 years ago

Why am I interested in Ersilia? Ersilia is an open-source initiative that aims to provide AI models so that scientists can make drug discoveries much faster. Its mission is to make scientific progress accessible to all, and it focuses on establishing collaborations in low-resourced settings where the costs of drug discovery are prohibitive. Millions of people suffer from neglected diseases and do not receive proper treatment because suitable drugs are unavailable.

Covid is itself an example. Many scientists had great ideas for tackling Covid, but because of limitations they were not able to carry them out. If they had been able to implement their ideas, we might not have lost our loved ones. Covid was a pressing concern, but there are many neglected diseases that also need to be treated properly.

Ersilia is concerned about these neglected diseases. I want to be part of this revolutionary mission as it can save millions of lives worldwide. It is an honor for me to work with this organization. I want to become a reason for smiling faces (people who get proper medications). I would love to be part of the initiative that will take research in the healthcare system and make new discoveries that ease the lives of millions of people.

@GemmaTuron and @miquelduranfrigola thank you so much for providing this opportunity to us.

Jaya3112 commented 2 years ago

Contribution:

Added context with image, which will help new contributors. https://github.com/ersilia-os/ersilia/pull/111#issue-1191177937

Jaya3112 commented 2 years ago

I have created a README.md file. Please give your suggestions @GemmaTuron and @miquelduranfrigola. Here is the link: https://github.com/ersilia-os/ersilia/pull/209#issue-1198765585

Jaya3112 commented 2 years ago

Hi @GemmaTuron, I hope you are doing well. Here is my blog post on the Strategic plan 2021-2023. Please have a look and let me know what needs to be improved. https://docs.google.com/document/d/1Cp6HXsvKcKdW7OWwMBY79DypS9Hv-sEZ-YpKOlCEf2Q/edit?usp=sharing

GemmaTuron commented 2 years ago

Hi @Jaya3112

Good work on the blogpost, no major comments! Also, the contribution guidelines with images were helpful; they have been merged into the Contribution.md file. The readme needs to be in the /documentation folder before I merge it!

I would suggest you choose one task from the list and focus on it, then you can prepare your final application, thanks!

Jaya3112 commented 2 years ago

Hello @GemmaTuron, thank you for your suggestion. I hope you are doing well. I have added the README.md file to the documentation folder; you can check it there. Thanks!

Jaya3112 commented 2 years ago

Hi @GemmaTuron. I hope you are doing great. I have created a slide to explain Ersilia’s mission and vision. Here is the link: https://docs.google.com/presentation/d/1yJmpK1ifw-Gt4qnOlOPEWaI_P6mWGAoPdRxwa3kV8SI/edit#slide=id.p Please give your suggestions. Thanks!

Jaya3112 commented 2 years ago

Hello @GemmaTuron, Here is my Twitter post template. Please give your suggestions.

We have a "NEW MODEL IN THE ERSILIA TOWN..... " 🚀🚀

Hello everyone, we have a new model for our Hub! 🤩
Introducing a model. Excited to know what this model does? Follow the thread...
Design of the model (a brief narration about the model).
Something more is waiting for you...
Usage of this model (describing the function and specification of the model).
Eager to know more about this model? Click here (website).

The Ersilia Open Source Initiative "Welcomes & Values Every Contribution".

loweyvana commented 2 years ago

Love this Jaya!

ifeoluwafavour commented 2 years ago

Lovely work, Jaya. Well done!!!

Jaya3112 commented 2 years ago

Thank you so much, @loweyvana & @ifeoluwafavour.

Jaya3112 commented 2 years ago

Hello @GemmaTuron. Here is my newsletter. Please give your suggestions. https://docs.google.com/document/d/1qbAGtI_QQTcx67EnNCdfxhopy0IIzAIuqW5RlGLFIEo/edit?usp=sharing

Eagerly waiting for your reply. Thanks!

loweyvana commented 2 years ago

Lovely work Jaya.

Jaya3112 commented 2 years ago

Thank you, @loweyvana.

Jaya3112 commented 2 years ago

Hello @GemmaTuron. Hope you are doing great. Here is my technical card. Please give your suggestion. Thanks!

MODEL TECHNICAL CARD

IDENTIFICATION

Model ID: eos2gth
Model name: maip-malaria
Model description: The malaria inhibitor prediction (MAIP) platform aims to develop a consensus model for predicting blood stage malaria inhibition.

GENERAL INFORMATION

Algorithm used for the model: t-distributed Stochastic Neighbor Embedding (t-SNE) is an algorithm performing a nonlinear dimensionality reduction and designed for data visualization. The resulting sparse matrix was used as input for scikit-learn’s implementation of the t-SNE algorithm using a perplexity value of 500.

Data used for the model: The Evotec, Johns Hopkins, MRCT, MMV - St. Jude, AZ, GSK, and St. Jude Vendor Library datasets were used. The Medicines for Malaria Venture (MMV) partner provided three additional datasets for training models (MMV5, MMV6, MMV7), and Novartis models were also used. Three datasets were used for model validation (the MMV test set, the PubChem dataset, and the St. Jude Screening Set).

Result: Our first goal was to assess the ability of our new software methods and code to reproduce the previous study.

Input: compound
Output: Antibiotic activity
Mode: Retrained

RESOURCES

Repository: https://github.com/ersilia-os/eos2gth
Source code: https://www.ebi.ac.uk/chembl/maip/
Publication: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2
License: MIT
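
If the card needs to be reused across models, its fields could be kept in a small structure and rendered as text. The following is only a minimal sketch under assumptions: the `ModelCard` class, its field names, and the `missing_fields` and `render` helpers are hypothetical and are not part of the Ersilia codebase. The values are copied verbatim from the card above.

```python
from dataclasses import dataclass, field, fields


@dataclass
class ModelCard:
    """Hypothetical container for the fields of the technical card above."""

    model_id: str
    name: str
    description: str
    input: str
    output: str
    mode: str
    resources: dict = field(default_factory=dict)  # label -> URL or license name

    def missing_fields(self):
        """Return the names of text fields that are empty or whitespace-only."""
        return [
            f.name
            for f in fields(self)
            if isinstance(getattr(self, f.name), str)
            and not getattr(self, f.name).strip()
        ]

    def render(self):
        """Render the card as plain text, one 'Label: value' pair per line."""
        lines = [
            "MODEL TECHNICAL CARD",
            f"Model ID: {self.model_id}",
            f"Model name: {self.name}",
            f"Model description: {self.description}",
            f"Input: {self.input}",
            f"Output: {self.output}",
            f"Mode: {self.mode}",
        ]
        lines += [f"{label}: {value}" for label, value in self.resources.items()]
        return "\n".join(lines)


# Values copied from the card above; nothing here queries the Ersilia Model Hub.
card = ModelCard(
    model_id="eos2gth",
    name="maip-malaria",
    description="Consensus model for predicting blood stage malaria inhibition.",
    input="compound",
    output="Antibiotic activity",
    mode="Retrained",
    resources={
        "Repository": "https://github.com/ersilia-os/eos2gth",
        "Source code": "https://www.ebi.ac.uk/chembl/maip/",
        "Publication": "https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00487-2",
        "License": "MIT",
    },
)

print(card.render())
print("Missing fields:", card.missing_fields())
```

Checking `missing_fields()` before publishing a card would flag any entry that was left blank, which may help keep cards for different models consistent.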

Elizabeth-Joseph-Mawutin commented 2 years ago

Hello Jaya, great work you have done so far 👍.

Jaya3112 commented 2 years ago

Thank you, @ElizabethMawutin.

Jaya3112 commented 2 years ago

Hello @GemmaTuron. I know you are busy with a lot of important work. Please give your view on my work. I am eagerly waiting for your suggestions and want to do more for this community. I want to be a part of this wonderful community, help more people, and save millions of lives. It would be an honor for me. Thanks!

Jaya3112 commented 2 years ago

Hello @GemmaTuron. I hope you are doing well. Please give your view. Thanks! After searching the scientific literature, I suggest the following three new models that would be relevant to incorporate into the Hub:

DeepNeuralNetQSAR: Python-based system driven by computational tools that aid detection of the molecular activity of compounds. https://github.com/Merck/DeepNeuralNet-QSAR

ORGANIC: A molecular generation tool that helps to create molecules with desired properties. https://github.com/aspuru-guzik-group/ORGANIC

Lipophilicity in drug discovery: The role of lipophilicity in determining the overall quality of candidate drug molecules is of paramount importance, including in determining pre-clinical ADMET (absorption, distribution, metabolism, elimination, and toxicology) properties. https://github.com/VEK239/StructGNN-lipophilicity https://github.com/awslabs/dgl-lifesci/blob/master/python/dgllife/data/lipophilicity.py

Jaya3112 commented 2 years ago

Hello @GemmaTuron. I hope you are doing well. Here is the link to my second blog post, on the topic "AI for biomedical research": https://docs.google.com/document/d/1PfVffSxo-a7SOJn-jNRdHAGIMovuf8ObdXeHCQ-EI74/edit?usp=sharing

Please review it. I am eagerly waiting for my documentation to be reviewed and for your suggestions. Thanks!

loweyvana commented 2 years ago

Jaya, you are doing amazing work. Well done!

Jaya3112 commented 2 years ago

Thank you so much @loweyvana.

GemmaTuron commented 2 years ago

Hi @Jaya3112 Thanks for all the work.

The model cards are looking good, with all the necessary links on them! For the models suggested, the last one is actually in our list already! The molecule generator is cool, and we do use these kinds of models, but for the moment they are not the focus of the Hub. In the lipophilicity model I fail to see one suggestion; it opens a Google search, can you check that? I also like the historic references in the blogpost, good job! Please now focus on preparing your final application.

Jaya3112 commented 2 years ago

Thank you @GemmaTuron for your suggestions. I will consider this and do the needful. I will make the suggested model more appropriate. Yes, I will surely focus on the final application. Thanks!

Jaya3112 commented 2 years ago

Hi @GemmaTuron, I have modified the suggested model. Please review it. Thanks for your support and for making us a part of this revolutionary change.

Jaya3112 commented 2 years ago

Hello @GemmaTuron. I hope you are doing well. Please give your view. Thanks! After searching the scientific literature, I suggest the following three new models that would be relevant to incorporate into the Hub:

DeepNeuralNetQSAR: Python-based system driven by computational tools that aid detection of the molecular activity of compounds. https://github.com/Merck/DeepNeuralNet-QSAR

ORGANIC: A molecular generation tool that helps to create molecules with desired properties. https://github.com/aspuru-guzik-group/ORGANIC

Lipophilicity in drug discovery: The role of lipophilicity in determining the overall quality of candidate drug molecules is of paramount importance, including in determining pre-clinical ADMET (absorption, distribution, metabolism, elimination, and toxicology) properties. https://github.com/VEK239/StructGNN-lipophilicity https://github.com/awslabs/dgl-lifesci/blob/master/python/dgllife/data/lipophilicity.py

I have modified it. Please review it and give your suggestions.

Jaya3112 commented 2 years ago

Thank you @GemmaTuron and @miquelduranfrigola for this amazing opportunity. Hope we can do more work together and fulfill the mission.