Closed Iqra350 closed 6 months ago
As an AI engineer, I am elated by the prospects Ersilia presents in the realm of data science. Hailing from a region where opportunities to engage with machine learning projects are scarce, I've often felt constrained by the limited avenues available to showcase and apply predictive models. My initial role as a data analyst left me yearning for deeper involvement in this dynamic and ever-evolving field, and the small number of roles in this growing area often led to a sense of frustration.
However, Ersilia stands as a beacon of opportunity, offering hands-on experience and a collaborative environment that fosters growth. Working alongside esteemed scientists and fellow enthusiasts reignites my passion for data science and reinforces my programming proficiency in handling vast datasets.
Throughout my academic journey in computer science, I gravitated towards the data science domain, delving into research papers and expanding my expertise in machine learning. Despite the well-documented gender disparities in AI/ML professions, where only 26% of professionals are women, I remain resolute in my determination to excel and contribute meaningfully. Adapting to challenges faced by women in the field, I've explored diverse roles, including frontend engineering. Even after relocating, navigating through job opportunities presented its own set of hurdles.
The ethos and objectives of Ersilia resonate deeply with me. The commitment to democratize AI/ML models and ensure inclusivity in accessing advancements aligns with my personal values. Being part of such a visionary environment promises not only to make a meaningful impact but also to propel me towards new heights of influence.
I am profoundly grateful for the opportunity to collaborate with esteemed scientists and benefit from the mentorship of experienced PhD professionals. This nurturing environment not only strengthens my foundational knowledge but also fosters a culture of compassion that enriches the learning experience.
In conclusion, the Ersilia Model Hub internship embodies a pivotal opportunity for me to fuel my passion for data science and contribute to transformative projects that hold the potential to shape the future.
Have not found any issue till now.
Well done, you've done great work so far. You have completed the week 1 tasks. I suggest going through the Ersilia Model Hub (https://www.ersilia.io/model-hub) to familiarize yourself with the models, since we are waiting on the team to finalize the week 2 and week 3 tasks.
yeah sure i am working on it. :)
Hi @Iqra350. I noticed you have yet to start the week 2 tasks. Are you facing any challenges? If so, you can state them here; I will be glad to help you through them.
Hi @Iqra350 please let us know if you want to continue your work during this contribution period. Otherwise we can close this issue and focus on other applicants.
@DhanshreeA Sorry for the inconvenience. I will update all tasks for weeks 2 and 3 within this week.
Hi @Iqra350 I do not see any updates yet. I am closing this issue because it will be too late to catch up with the other applicants at this point. Thank you for your time and efforts.
I am working on the notebook and will upload all the tasks together.
This is the proof.
Sorry for the late response; I thought I had to upload all the tasks at once.
I have uploaded the project for Week 2, and Week 3 will follow in one or two days, before the deadline.
You can access my progress at the link.
This repository contains the code and documentation for validating the "eos74bo" model from the Ersilia Model Hub, focusing on the tasks of the internship project, which include checking model bias.
The goal of this project is to validate the accuracy and reproducibility of the "eos74bo" model, which predicts ADME properties of small molecules. Task 1 involves checking for model bias by running predictions for a list of 1000 diverse molecules and plotting the results in a scatter plot. Task 2 covers reproducibility of the results, and Task 3 is external data validation using a dataset from a selected paper.
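The bias check in Task 1 can also be summarized numerically before plotting. The following is a minimal sketch, not the notebook's actual code: it assumes the model returns one score per molecule in the range [0, 1], and the function names, the 0.5 threshold, and the example scores are all hypothetical. A distribution piled up at one end of the range would suggest the model favors one outcome on the diverse molecule list.

```python
from collections import Counter


def prediction_histogram(scores, n_bins=10):
    """Count how many scores fall into each of n_bins equal-width bins on [0, 1]."""
    counts = Counter()
    for s in scores:
        if not 0.0 <= s <= 1.0:
            raise ValueError(f"score out of range: {s}")
        # A score of exactly 1.0 is placed in the last bin.
        idx = min(int(s * n_bins), n_bins - 1)
        counts[idx] += 1
    return [counts.get(i, 0) for i in range(n_bins)]


def positive_fraction(scores, threshold=0.5):
    """Fraction of molecules whose score is at or above the threshold."""
    if not scores:
        raise ValueError("no scores given")
    return sum(s >= threshold for s in scores) / len(scores)


# Hypothetical scores standing in for real model output (assumption).
example_scores = [0.05, 0.10, 0.15, 0.45, 0.55, 0.90, 0.95, 1.0]
hist = prediction_histogram(example_scores, n_bins=5)
frac = positive_fraction(example_scores)
print(hist, frac)
```

With real predictions, the same two summaries could be computed on the 1000-molecule list and compared with the class balance of the training data.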
Kinetic aqueous solubility (μg/mL) was experimentally determined using the same SOP in over 200 NCATS drug discovery projects. A final dataset of 11780 non-redundant molecules and their associated solubility was used to train an SVM classifier. Approximately half of the dataset has poor solubility (< 10 μg/mL), and two-thirds of these low-solubility molecules report values of < 1 μg/mL. A subset of the data used is available at PubChem (AID 1645848): https://pubchem.ncbi.nlm.nih.gov/bioassay/1645848#section=Result-Definitions.
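To make the 10 μg/mL cutoff concrete, here is a minimal sketch of turning measured solubility values into binary labels and checking the class balance. It assumes the classifier labels follow the "< 10 μg/mL is poor" rule quoted above; the function names and the example values are hypothetical and are not taken from the NCATS dataset.

```python
LOW_SOLUBILITY_CUTOFF = 10.0  # µg/mL, the "poor solubility" cutoff quoted above


def is_low_solubility(value_ug_per_ml, cutoff=LOW_SOLUBILITY_CUTOFF):
    """Return True when a measured solubility is strictly below the cutoff."""
    return value_ug_per_ml < cutoff


def class_balance(values, cutoff=LOW_SOLUBILITY_CUTOFF):
    """Count low and high solubility molecules and the fraction that is low."""
    if not values:
        raise ValueError("no values given")
    labels = [is_low_solubility(v, cutoff) for v in values]
    n_low = sum(labels)
    return {
        "low": n_low,
        "high": len(labels) - n_low,
        "low_fraction": n_low / len(labels),
    }


# Hypothetical measurements in µg/mL (assumption, for illustration only).
example_values = [0.5, 3.0, 9.9, 10.0, 25.0, 60.0]
balance = class_balance(example_values)
print(balance)
```

A value of exactly 10.0 is treated as not low here because the source states the cutoff as a strict inequality.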
README.md: This file, providing an overview of the project and instructions for reproducing the work.
src/: Directory containing source code.
data/: Directory containing datasets used for validation.
results/: Directory containing output plots and any other result files.
LICENSE: License file for the repository.
.gitignore: File specifying intentionally untracked files to ignore.
Select a paper, get the dataset used in that paper, and make predictions on it.
Selected paper: https://slas-discovery.org/action/showPdf?pii=S2472-5552%2822%2906765-X, which used the solubility dataset:
In this notebook, I load a list of molecules obtained from PubChem for a drug solubility check, and replicate the results both for the Ersilia Model Hub model mentioned in link and for the GitHub code of the external source mentioned in link, processing them to make sure I have:
Outcome:
PAMPA (parallel artificial membrane permeability assay) is a laboratory test used in drug development to assess how easily a drug can pass through cell membranes. It helps researchers understand how well a drug can be absorbed into the bloodstream, which is important for determining its effectiveness.
The code to download the dataset is below:
from tdc.single_pred import ADME
data = ADME(name = 'PAMPA_NCATS')
split = data.get_split()
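Once predictions are available for the held-out molecules, agreement with the experimental labels can be summarized with a confusion table. This is a minimal sketch rather than the notebook's code: it assumes both the experimental labels and the model predictions are binary (0 or 1), and the function names and example lists are hypothetical.

```python
def confusion_counts(y_true, y_pred):
    """Count true/false positives and negatives for binary labels (0 or 1)."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    counts = {"tp": 0, "tn": 0, "fp": 0, "fn": 0}
    for truth, pred in zip(y_true, y_pred):
        if truth == 1 and pred == 1:
            counts["tp"] += 1
        elif truth == 0 and pred == 0:
            counts["tn"] += 1
        elif truth == 0 and pred == 1:
            counts["fp"] += 1
        else:
            counts["fn"] += 1
    return counts


def accuracy(counts):
    """Share of molecules where prediction and experimental label agree."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no samples")
    return (counts["tp"] + counts["tn"]) / total


# Hypothetical labels and predictions (assumption, for illustration only).
y_true = [1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 1, 0]
counts = confusion_counts(y_true, y_pred)
print(counts, accuracy(counts))
```

Reporting the four counts alongside accuracy helps when the classes are imbalanced, since accuracy alone can hide a model that mostly predicts one class.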
The GitHub link for the entire code is: Notebooks for All Weeks
Week 1 - Get to know the community
Week 2 - Get Familiar with Machine Learning for Chemistry
Week 3 - Validate a Model in the Wild
Week 4 - Prepare your final application