ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

✍️ Contribution period: Bisola-Ibiwoye #990

Closed: BisolaIbiwoye closed this issue 3 months ago

BisolaIbiwoye commented 4 months ago

Week 1 - Get to know the community

Week 2 - Get Familiar with Machine Learning for Chemistry

Week 3 - Validate a Model in the Wild

Week 4 - Prepare your final application

BisolaIbiwoye commented 4 months ago

MOTIVATION STATEMENT FOR INTERNSHIP WITH ERSILIA.

I am incredibly excited to write about my interest in joining the Outreachy program and contributing to the impactful Ersilia project. As a recent immigrant and aspiring data scientist, Outreachy presents an invaluable opportunity to hone my skills and gain valuable experience.

My tech journey began with two Udacity certifications, in Data Analytics and in AI Programming with Python. My eagerness to apply these new skills, and my struggle to land a job or internship as an entry-level immigrant, led me to Outreachy, which offers an ideal environment for collaborative learning and remote mentorship in open-source software development. My first week contributing to Ersilia opened my eyes to the vast and exciting world of open-source collaboration on GitHub. I learned how to install Ersilia using WSL on Windows, how to use Ersilia with the Docker Desktop app, and how to use Ersilia’s models to make predictions. This experience, coupled with my recent completion of the Udacity course on AI Programming with Python, makes Ersilia's platform, featuring pre-trained AI/ML models for biomedical research, the perfect fit for me. Furthermore, Ersilia’s vision of a world with egalitarian access to healthcare is something I am passionate about as an African currently working in the healthcare sector. I want to use my skills to contribute to Ersilia's mission to equip laboratories in low- and middle-income countries with state-of-the-art AI/ML tools for infectious and neglected disease research. Having experienced some of those infectious diseases, which almost claimed my life, I see my involvement in this project as a way of giving back to society. I am particularly drawn to Ersilia’s scientific and biomedical approach to using artificial intelligence for the good of society. These approaches can be used to develop tools that speed up experiments and reduce the cost of developing new drugs.

The Udacity course I took provided a strong foundation in Python and in AI techniques, including machine learning algorithms, deep learning frameworks, and the use of pre-trained models. This knowledge allowed me to build an application with Python scripts for generating predictions (available on my GitHub: https://github.com/BisolaIbiwoye/Develop-Image-Classifier-for-Flowers-with-Deep-Learning ). I have a strong work ethic, I am a quick learner, and I thrive in collaborative environments. My dedication to continuous learning and my results-oriented mindset will make me a valuable asset to the Ersilia team.

Feedback from past job applications often pointed towards a lack of real-world experience. This internship will bridge that gap by providing hands-on experience with ML and AI on real-world data, propelling my data science career forward. My commitment extends beyond the internship as I plan to continue contributing to Ersilia's open-source project. To further amplify Ersilia's impact, I plan to actively involve myself in various projects and utilise my social media platforms to raise awareness, promoting the valuable resources offered to researchers and scientists.

My technical skills, combined with my teamwork, communication, and collaborative spirit, make me confident in my ability to succeed in the Outreachy program and positively impact the Ersilia project. Thank you for your time and consideration.

GemmaTuron commented 3 months ago

Hi @BisolaIbiwoye

We are in week 3 of the contribution period. Please let us know within the next 2 days whether you plan to continue with your contribution; otherwise we will close this issue so we can focus on the applicants who want to make a final application to Ersilia.

BisolaIbiwoye commented 3 months ago

This is the link to the GitHub repository created for Week 2. Thank you.

https://github.com/BisolaIbiwoye/Model-Validation-with-Ersilia-Model-Eos4tcc/tree/master

DhanshreeA commented 3 months ago

@BisolaIbiwoye have you registered your first contribution with the Outreachy website?

DhanshreeA commented 3 months ago

@BisolaIbiwoye Additionally, I do not see that the Week 2 tasks have been fully completed; you still have to work on reproducibility.

BisolaIbiwoye commented 3 months ago

Yes, I am working on it. I started late due to a medical reason. I am so sorry.

BisolaIbiwoye commented 3 months ago

@BisolaIbiwoye have you registered your first contribution with the Outreachy website?

Yes, I have.

BisolaIbiwoye commented 3 months ago

@BisolaIbiwoye Additionally, I do not see Week 2 tasks having been fully completed, you still have to work on reproducibility.

I have updated the repository with the model reproducibility task and necessary files. Please help review, thank you.

GemmaTuron commented 3 months ago

Hi @BisolaIbiwoye, thanks. We will provide feedback by the end of today.

GemmaTuron commented 3 months ago

@BisolaIbiwoye have you registered your first contribution with the Outreachy website?

I double checked @DhanshreeA and it is there :)

DhanshreeA commented 3 months ago

Yes, I am working on it. I started late due to a medical reason. I am so sorry.

No worries @BisolaIbiwoye , I hope you are feeling better now. :) 🤗

DhanshreeA commented 3 months ago

Hi @BisolaIbiwoye I think you have used the wrong dataset for the model reproducibility task. You should be using the datasets within the Finetuning folder here (https://github.com/GIST-CSBL/BayeshERG/tree/main/data/Finetuning), which are the actual test datasets. What you are using is the "External validation" dataset for the model. Please update to use both the test_rev and test_all datasets and then compare your plots and performance metrics against the ones in the paper.

BisolaIbiwoye commented 3 months ago

OK, I will work on it. I am confused here: am I to join test_all.csv and test_rev.csv together and run predictions on the result, or am I to run predictions on them separately?

GemmaTuron commented 3 months ago

Hi @BisolaIbiwoye I am not familiar with the repo; you can look at the README file and see if they provide information on the test datasets. In any case, you can try joining them to see whether they are actually replicates, and drop any duplicates before making the predictions. Let's aim to close this this week so we can focus on the final application.
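
For reference, the join-and-deduplicate check suggested above could be sketched roughly as follows. This is only an illustration using the Python standard library: the column names (`smiles`, `label`) and the sample rows are hypothetical stand-ins, not the real contents of test_all.csv and test_rev.csv, so the actual headers in the BayeshERG files would need to be checked first.

```python
import csv
import io

# Tiny in-memory stand-ins for test_all.csv and test_rev.csv.
# The column names ("smiles", "label") and the rows are hypothetical.
TEST_ALL = "smiles,label\nCCO,0\nc1ccccc1,1\nCCN,0\n"
TEST_REV = "smiles,label\nc1ccccc1,1\nCCCl,1\n"


def read_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def merge_unique(first, second, key="smiles"):
    """Concatenate two row lists, keeping the first row seen for each key."""
    seen = set()
    merged = []
    for row in first + second:
        if row[key] not in seen:
            seen.add(row[key])
            merged.append(row)
    return merged


all_rows = read_rows(TEST_ALL)
rev_rows = read_rows(TEST_REV)

# Molecules present in both files; a large overlap would suggest the
# files are replicates rather than independent test sets.
overlap = {r["smiles"] for r in all_rows} & {r["smiles"] for r in rev_rows}

merged = merge_unique(all_rows, rev_rows)
print(f"{len(all_rows)} + {len(rev_rows)} rows, "
      f"{len(overlap)} shared, {len(merged)} after de-duplication")
```

Note that this compares SMILES strings literally. The same molecule can be written as different SMILES strings, so a stricter check would canonicalize them before comparing.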