ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

Outreachy Documentation Project: Ifeoluwa Favour Ojumoro #97

Closed ifeoluwafavour closed 2 years ago

ifeoluwafavour commented 2 years ago

Applicant: @ifeoluwafavour

Welcome to the Ersilia Open Source Initiative. This issue will serve to track all your contributions for the project “Improve the documentation and outreach material of the Ersilia Model Hub”.

Please tick the tasks as you complete them. You do not need to complete all tasks to make a final application; only the Initial Steps and Community sections are REQUIRED. The tasks are not ordered by importance; they are simply related to different skills. Start where you feel most comfortable. This project can be adapted to the applicant's interests, so please focus on the type of tasks that you prefer, have better skills for, or would like to work on as an intern.


Initial steps:

ifeoluwafavour commented 2 years ago

Why I am interested in Ersilia

Ersilia's ultimate goal is to effectively lower the barrier to drug discovery, encouraging academic groups and enterprises to pursue the development of new medicines, and this excites me.

I'm an Industrial Chemistry undergraduate, and I attended a public lecture at my university here in Nigeria. At that event, lecturers and research scientists complained about the limitations they faced in making substantial progress in their research, work that would help them solve pressing issues in the healthcare system.

I have also listened to students who have ideas that can greatly assist in solving these problems but these limitations hold the students back.

However, if they are presented with a tool like Ersilia that makes pre-trained machine learning models available to assist in drug discovery and the development of new medicines, more than half of these scientists' problems would be solved.

This is why I'm interested in Ersilia. I would love to be part of the initiative that will take healthcare research, especially in low- and middle-income countries, to a new level of making discoveries with ease.

ifeoluwafavour commented 2 years ago

Initial contributions:

  1. I corrected a typo at Line 47 in the README file https://github.com/ersilia-os/ersilia/pull/18#issue-1181659850

  2. I created an issue template that contributors can use to suggest third-party models for incorporation into Ersilia's Model Hub. https://github.com/ersilia-os/ersilia/issues/23

  3. I created an issue to include a Windows Installation guide on Ersilia's Model Hub page. https://github.com/ersilia-os/ersilia/issues/26

  4. I created a pull request where I listed the steps to install Ersilia on the Windows operating system through Windows Subsystem for Linux or through a virtual machine. In the pull request description, I also described the problems I faced while trying to install Ersilia on my system. This pull request resolves the issue I raised about including a Windows installation guide on Ersilia's Model Hub page. https://github.com/ersilia-os/ersilia/pull/67

GemmaTuron commented 2 years ago

Hi @ifeoluwafavour ,

Great work so far. I will incorporate the template for model requests once the discussion is closed. For the Windows installation, the only thing missing is a couple of sentences explaining what a virtual machine is. For example, users might be confused that they have to install Python or conda again even if they already have it on their computers. Can you add one line to explain that a virtual machine is like a separate computer?

Next you can continue working on any of the tasks listed above!

ifeoluwafavour commented 2 years ago

Okay. Thank you very much for your feedback. I will go ahead and start making the changes now, and also start working on a task.

ifeoluwafavour commented 2 years ago

I have created a pull request where I updated the installation steps file explaining why users still need to download Conda, Python and Git for Linux.

https://github.com/ersilia-os/ersilia/pull/116

GemmaTuron commented 2 years ago

Thanks @ifeoluwafavour, I have merged your changes! I think the WSL installation is much clearer now. I will let @victorabba finish his work, and then we can add your work to the official installation guide; I will let you know the next steps in this regard. Meanwhile, please check the list of tasks and select one that you find interesting to continue contributing!

ifeoluwafavour commented 2 years ago

Thank you very much @GemmaTuron. I'm currently working on one of the tasks.

ifeoluwafavour commented 2 years ago

[Image: Adobe_Post_20220405_1016170.03516428464077359-2.png]

ifeoluwafavour commented 2 years ago

Docstring for the ErsiliaModel class

class ErsiliaModel:
    """Fetch a model from the Ersilia Model Hub.

    Attributes:
        model_identifier (str): Model identifier in eos0abc format.
    """

    def __init__(self, model_identifier):
        # Store the identifier so the other methods know which model to use.
        self.model_identifier = model_identifier

    def serve(self):
        """Provide the URL, process ID and scl of the model session."""

    def predict(self, molecule_name):
        """Run the model on the given input.

        Args:
            molecule_name (str): A molecule in SMILES format, or a .csv file.

        Returns:
            predictions: Outcomes could be antibiotic activity, target, vector,
                physicochemical property, ADME, CYP450 or toxicity.
        """

    def close(self):
        """End the model session."""
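
To see how these three methods fit together, here is a minimal sketch of the serve, predict, close lifecycle. `StubErsiliaModel` and `served` are hypothetical names made up for illustration; the stub only records calls and does not talk to the Ersilia Model Hub, so the real class may behave differently.

```python
from contextlib import contextmanager


class StubErsiliaModel:
    """Hypothetical stand-in for the documented class; it only records calls."""

    def __init__(self, model_identifier):
        self.model_identifier = model_identifier
        self.calls = []

    def serve(self):
        self.calls.append("serve")

    def predict(self, molecule_name):
        self.calls.append("predict")
        # Placeholder result; a real model would return actual predictions.
        return {"input": molecule_name, "model": self.model_identifier}

    def close(self):
        self.calls.append("close")


@contextmanager
def served(model):
    """Serve a model and always close the session, even if predict raises."""
    model.serve()
    try:
        yield model
    finally:
        model.close()


model = StubErsiliaModel("eos0abc")
with served(model) as session:
    result = session.predict("CCO")  # "CCO" is the SMILES string for ethanol
```

Wrapping the session in a context manager is one way to make sure `close()` runs even when a prediction fails, which might be worth mentioning in the docstrings.
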
GemmaTuron commented 2 years ago

Hi @ifeoluwafavour !

I am tagging @miquelduranfrigola for feedback on the docstrings. Meanwhile, continue working on some other task!

ifeoluwafavour commented 2 years ago

Okay. Thank you for your feedback. I will move on to another task

ifeoluwafavour commented 2 years ago

I have written a summary of EOSI's strategic plan in the form of a blog post.

Here's the link to the document: https://docs.google.com/document/d/18sPRsVA7GmRMQdR5DELKpcy6Jn1pmqvfF_7_s2ZNZeo/edit?usp=drivesdk

ifeoluwafavour commented 2 years ago

Hi @GemmaTuron I didn't see the need to rewrite everything. I just highlighted the main points from the main strategic plan page.

dchidindu5 commented 2 years ago

Hey @ifeoluwafavour, you need to grant access to third-party users for your blog post. Here is what it says: [image: ife1]

ifeoluwafavour commented 2 years ago

Oh I thought I was to give access to the mentor only.

I'll change it to be publicly accessible now. Is this okay with you @GemmaTuron ?

dchidindu5 commented 2 years ago

That's why it's an open-source project. Everything should be transparent.

ifeoluwafavour commented 2 years ago

It's publicly accessible now @dchidindu5

ifeoluwafavour commented 2 years ago

Twitter Template for The Release of New Models

We have a new model! 💃🕺

Ever wondered how a molecule moves? 🤔

With you can see how the molecule of interest moves and also, with what it moves! 😃

Learn more: *picture/video of the model being used


The code is of course open sourced:

Source paper:

Work by: <contributors' or developers' names>


Ersilia Open Source Initiative welcomes model contributions.

You can check out our README in our repository for more information to get started as a contributor.

victorabba commented 2 years ago

Hello @ifeoluwafavour that's a nice write-up, but don't you think you can make it more catchy by adding a suitable title?

GemmaTuron commented 2 years ago

Hi @ifeoluwafavour,

Thanks for making the blog post public. You can choose to make it readable only to the mentor, but of course we encourage sharing the information publicly. I have added a few comments in it; in particular, I think you should focus on summarizing the lists into one or two sentences. You do not need to include all the information, as it is a summary, and it is nicer for readers if blog posts do not have too many bullet points.

Thanks for the comments @victorabba and @dchidindu5. Giving it a title as Victor suggests would be nice as well! Hope you like my comments, and if you need further clarification, tag me!

GemmaTuron commented 2 years ago

Twitter Template for The Release of New Models

Introducing , A model that predicts activity in molecules with 70% accuracy. (Brief information about the model)

Learn more: *picture/video of the model being used 1/3

The code is of course open sourced:

Source paper: online article or publication that inspired the development of the model.

Work by: <contributors' or developers' names> 2/3

Ersilia Open Source Initiative welcomes model contributions.

You can check out our README in our repository for more information to get started as a contributor.

3/3

Thanks for starting off on this task! Good first ideas. As a general suggestion, keep in mind that people scroll through Twitter and only half-read things, so it has to be very catchy. Let's try to use a very catchy title; "Introducing XXX model" might not catch the attention of people scrolling through Twitter. Something more like "We have a new model...."

I would not mention accuracy or other numbers in a Twitter post, as many people may not understand what we are referring to, nor videos, as they won't have time to check them.

Referring to the authors etc. is good!

ifeoluwafavour commented 2 years ago

Yes I've seen your comments and I like them too 😄 I will make the changes as soon as possible. I will also go ahead to add the title and make the blogpost more engaging.

ifeoluwafavour commented 2 years ago

Alright! Noted. I'll make everything more catchy and make the other changes. Thank you for your feedback.

loweyvana commented 2 years ago

Hi @ifeoluwafavour . Thanks for making your work readable. Great job!

loweyvana commented 2 years ago

Hi @ifeoluwafavour, I saw you ticked the community task. Could you please help me with a clarification? When they say we should look at other projects, do they mean Ersilia's other projects on Outreachy, or Ersilia's other projects in general on their webpage?

ifeoluwafavour commented 2 years ago

It means you have to check other contributors' projects. Just as you've commented on mine

loweyvana commented 2 years ago

Oh ok! Great !!! I was thinking far😅.

Thank youuuu. Let me catch some sleep now.

Kcfreshly commented 2 years ago

@ifeoluwafavour great work.

ifeoluwafavour commented 2 years ago

😂😂 okay.

ifeoluwafavour commented 2 years ago

Thank you 😁

ifeoluwafavour commented 2 years ago

Hi @GemmaTuron, I have edited the Twitter post template

ifeoluwafavour commented 2 years ago

@GemmaTuron I also incorporated the corrections you made in the blog post

Jaya3112 commented 2 years ago

Great work @ifeoluwafavour

loweyvana commented 2 years ago

Great work @ifeoluwafavour , absolutely love it.

ifeoluwafavour commented 2 years ago

Thank you so much @Jaya3112 @loweyvana

ifeoluwafavour commented 2 years ago

I wrote a blog post on artificial intelligence in biomedical research. Here's the link. https://docs.google.com/document/d/13jQqMVx85TuVAiE00flmGymiC3_pUdJSF_WMdF3fu08/edit?usp=drivesdk

GemmaTuron commented 2 years ago

Hi @ifeoluwafavour

Thanks for incorporating our comments and being so active in the community on both GitHub and Slack; your work is really appreciated. The AI post is well thought out, and it is good that you introduce the risk of bias in AI, which is a topic we are really interested in. The installation guidelines for Ersilia on Windows are great, and we will incorporate them in our guide. I will work with @victorabba to make a section on virtual machines.

I think you are ready to work towards your final application! Perhaps one last task would be to choose a model and suggest a technical card for it, if you want to work further on the project

ifeoluwafavour commented 2 years ago

Okay! 😃

Thank you very much for your kind words @GemmaTuron

I will start working on the technical card.

ifeoluwafavour commented 2 years ago

MODEL TECHNICAL CARD

IDENTIFICATION
Model ID: eos1vms
Model name: chembl-multitask-descriptor
Model description: the model predicts the main target of a small molecule based on ChEMBL data.

GENERAL INFORMATION
ML algorithm used to train model: Naive Bayes (specifically the Multinomial Naive Bayes algorithm).
Data used to train model: a subset of CHEMBL_18 data containing pairs of ligand compounds and single-protein targets.
Input: compound
Output: target and vector

RESOURCES
Repository: https://github.com/ersilia-os/eos1vms
Source code: https://github.com/chembl/target_predictions
Publication: https://chembl.github.io/ligand-based-target-predictions-in/
License: Apache
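
A card like this could also be kept as structured data so that every model exposes the same fields. The sketch below is one possible layout in Python, not something that exists in Ersilia; the `TechnicalCard` class, its field names and the optional `training_set_size` field are all assumptions made for illustration.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass(frozen=True)
class TechnicalCard:
    """Hypothetical container for the fields of a model technical card."""

    model_id: str
    model_name: str
    description: str
    algorithm: str
    training_data: str
    model_input: str
    model_output: str
    repository: str
    source_code: str
    publication: str
    license_name: str
    # Assumed optional field: the dataset size is not always available.
    training_set_size: Optional[int] = None

    def render(self) -> str:
        """Return one 'Label: value' line per field, in declaration order."""
        lines = []
        for f in fields(self):
            value = getattr(self, f.name)
            label = f.name.replace("_", " ").capitalize()
            lines.append(f"{label}: {'not available' if value is None else value}")
        return "\n".join(lines)


# Values copied from the card above.
card = TechnicalCard(
    model_id="eos1vms",
    model_name="chembl-multitask-descriptor",
    description="Predicts the main target of a small molecule based on ChEMBL data.",
    algorithm="Multinomial Naive Bayes",
    training_data="Subset of CHEMBL_18: pairs of ligand compounds and single-protein targets",
    model_input="compound",
    model_output="target and vector",
    repository="https://github.com/ersilia-os/eos1vms",
    source_code="https://github.com/chembl/target_predictions",
    publication="https://chembl.github.io/ligand-based-target-predictions-in/",
    license_name="Apache",
)
print(card.render())
```

Because the dataclass lists every field explicitly, a missing value shows up as "not available" in the rendered card instead of being silently left out.
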

ifeoluwafavour commented 2 years ago

Hi @GemmaTuron

I have written the technical card for a model.

Elizabeth-Joseph-Mawutin commented 2 years ago

Hello @ifeoluwafavour. Nice work

ifeoluwafavour commented 2 years ago

Thank you Elizabeth 😊

julietugo commented 2 years ago

Great work @ifeoluwafavour

ifeoluwafavour commented 2 years ago

Thank you @julietugo ☺

GemmaTuron commented 2 years ago

Hi @ifeoluwafavour. If anything, the card is only missing the number of molecules in the dataset (sometimes this is not available, so don't worry). The rest is good, thanks!

ifeoluwafavour commented 2 years ago

Okay. Thank you very much for everything @GemmaTuron. Should I close the issue now?