Closed Tshifhumulo10 closed 11 months ago
Hi @Tshifhumulo10 can you confirm if you were able to run Ersilia with the simplest model and obtain the desired output as mentioned in the instructions?
Hi @DhanshreeA, thank you for checking up. I was able to run Ersilia with the simplest model and obtained the output described in the instructions. I managed to do so after reading the instructions given by @HellenNamulinda: https://github.com/ersilia-os/ersilia/issues/820#issuecomment-1744307345
Retrospection
I am using Windows operating system, so I downloaded Windows Subsystem for Linux (WSL) with an Ubuntu distribution as instructed
Installed the gcc compiler:
Installed Miniconda in Ubuntu:
Installed Git and the GitHub CLI:
Used the GitHub CLI to log in to GitHub:
Installed Git LFS from Conda:
Activated Git LFS:
Installed Isaura data lake:
Installed Docker:
Installed the Ersilia tool:
Installed the Ersilia Python package:
Fetched a model:
Served a model:
Ran the model:
Output:
Motivation statement to work at Ersilia
I am drawn to the Ersilia project for several compelling reasons. First and foremost, I believe in the transformative power of data. In the words of Clive Humby, "Data is the new oil," and I've witnessed firsthand how harnessing data can drive innovation and impact lives positively.
With a background in Biochemistry and Microbiology, I ventured into the world of data science through intensive coursework. As I navigated through various projects and opportunities, I was pleasantly surprised to stumble upon Ersilia. What immediately struck me was the project's harmonious blend of cutting-edge technology and domain expertise.
Having spent time working in a laboratory, I understand the tangible difference Ersilia's projects make in the scientific community and beyond. I am passionate about being part of a team that bridges the gap between data scientists and experimental researchers, making AI/ML expertise accessible to scientists worldwide. This aligns perfectly with my values of transparency, knowledge sharing, and collaboration.
Ersilia's dedication to advancing technology in machine learning, cloud computing, and data privacy promises to keep me at the forefront of technological progress—a prospect that excites me greatly. Furthermore, the project's unique fusion of chemistry, molecular biology, and computational pharmacology, combined with a deep commitment to global health, presents a compelling challenge and an opportunity to create meaningful change.
Lastly, the focus on open-source initiatives aimed at generating low-cost drugs is a testament to Ersilia's commitment to making a positive societal impact. I am honored and eager to contribute my skills and enthusiasm to this noble cause. I am deeply motivated to join the Ersilia project because it embodies the values and interests of innovation, collaboration, technology, and making a tangible difference in the world through data-driven solutions. I eagerly anticipate the opportunity to contribute to this remarkable endeavor.
Hi @Tshifhumulo10 thank you for the updates. Please record this as your first contribution on the Outreachy website, and proceed with the tasks from week 2.
Hi @DhanshreeA, I will do just that.
Hey @Tshifhumulo10 all good? Do you need any help?
Hey @DhanshreeA, thank you for checking up. I have been trying to use "NCATS Rat Liver Microsomal Stability", but it took forever to load, so I have switched to STOUT.
From the suggested list, I was drawn to "NCATS Rat Liver Microsomal Stability" because I have always been fascinated by how drugs operate. However, it was taking forever to install, so I switched to STOUT. Manually translating SMILES to their IUPAC names can be challenging, so STOUT is pivotal for that task. I chose the STOUT model because I understand the impact such models can have for chemists and researchers: they save time, reduce errors, and increase the credibility of their work, since these models have an accuracy of 90%.
First, I created a new Conda environment called STOUT with Python 3.8:
conda create --name STOUT python=3.8
I activated the environment so that the commands that follow run inside it:
conda activate STOUT
I installed the package from the Conda channel named "decimer":
conda install -c decimer stout-pypi
I also installed STOUT from its Git repository:
pip install git+https://github.com/Kohulan/Smiles-TO-iUpac-Translator.git
I imported translate_forward and translate_reverse from STOUT. translate_forward translates SMILES to IUPAC names, and translate_reverse translates IUPAC names to SMILES.
I translated a SMILES to its IUPAC name using the code below:
SMILES = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"
IUPAC_name = translate_forward(SMILES)
print("IUPAC name of "+SMILES+" is: "+IUPAC_name)
Output:
I translated the IUPAC name back to SMILES using the code below:
IUPAC_name = "1,3,7-trimethylpurine-2,6-dione"
SMILES = translate_reverse(IUPAC_name)
print("SMILES of "+IUPAC_name+" is: "+SMILES)
Output:
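The two translations above can be chained into a round-trip check. The sketch below is only an illustration: `round_trip` is a hypothetical helper that is not part of STOUT, and it takes the two translation functions as arguments so that `translate_forward` and `translate_reverse` can be passed in once STOUT is installed. It compares plain strings, so two different SMILES for the same molecule will be reported as a mismatch.

```python
from typing import Callable, Dict


def round_trip(
    smiles: str,
    forward: Callable[[str], str],
    reverse: Callable[[str], str],
) -> Dict[str, object]:
    """Translate SMILES -> IUPAC -> SMILES and report whether the strings match.

    `forward` and `reverse` are assumed to behave like STOUT's
    translate_forward and translate_reverse (string in, string out).
    """
    iupac = forward(smiles)
    recovered = reverse(iupac)
    return {
        "input_smiles": smiles,
        "iupac": iupac,
        "recovered_smiles": recovered,
        # Plain string comparison only: equivalent molecules can be
        # written with different SMILES strings.
        "exact_match": recovered == smiles,
    }
```

With STOUT installed, this could be called as `round_trip("CN1C=NC2=C1C(=O)N(C(=O)N2C)C", translate_forward, translate_reverse)`. A stricter comparison would canonicalize both SMILES with a cheminformatics toolkit first, which this sketch does not attempt.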
Downloaded the Essential Medicines List CSV provided by Ersilia and copied it to my working directory.
Imported pandas and STOUT:
import pandas as pd
from STOUT import translate_forward, translate_reverse

# Load the CSV and keep the first 10 SMILES
df = pd.read_csv('eml_canonical (3).csv')
name_smiles = list(df['smiles'].head(10))

# Translate each SMILES to its predicted IUPAC name
IUPAC_nm = []
for i in name_smiles:
    IUPAC_name = translate_forward(i)
    IUPAC_nm.append(IUPAC_name)

DF = pd.DataFrame({"SMILES": name_smiles, "Predicted IUPAC": IUPAC_nm})
for i in range(len(DF)):
    print("IUPAC name of " + DF.loc[i, "SMILES"] + " is: " + DF.loc[i, "Predicted IUPAC"])
Output:
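# Translate each predicted IUPAC name back to SMILES
Smiles_nm = []
for a in IUPAC_nm:
    SMILES = translate_reverse(a)  # use the loop variable, not the last IUPAC_name
    Smiles_nm.append(SMILES)

# Keep the predicted IUPAC names in the frame so the print below can read them
DF1 = pd.DataFrame({"SMILES": name_smiles, "Predicted IUPAC": IUPAC_nm, "Predicted Smiles": Smiles_nm})
for i in range(len(DF1)):
    print("SMILES of " + DF1.loc[i, "Predicted IUPAC"] + " is: " + DF1.loc[i, "Predicted Smiles"])
Output: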
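When translating a whole column, one failing molecule can stop the loop. A minimal sketch of a more defensive batch loop is shown below. `translate_batch` is a hypothetical helper, not a STOUT function, and it assumes the translator is any callable that takes a string and returns a string (for example `translate_forward`).

```python
from typing import Callable, Dict, Iterable, List, Optional


def translate_batch(
    items: Iterable[str],
    translate: Callable[[str], str],
) -> List[Dict[str, Optional[str]]]:
    """Apply `translate` to every item, recording failures instead of stopping."""
    rows: List[Dict[str, Optional[str]]] = []
    for item in items:
        try:
            rows.append({"input": item, "output": translate(item), "error": None})
        except Exception as exc:  # keep going and note which input failed
            rows.append({"input": item, "output": None, "error": str(exc)})
    return rows
```

The returned list of dictionaries can be passed straight to `pd.DataFrame(rows)`, which gives one row per input with an `error` column showing which translations did not succeed.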
I got the identifier of the model from its Ersilia Model Hub implementation:
eos4se9
I took the SMILES from the downloaded [Essential Medicines List](https://github.com/ersilia-os/ersilia/blob/master/notebooks/eml_canonical.csv).
I fetched the model
ersilia -v fetch eos4se9
I served the model
ersilia -v serve eos4se9
I ran the model on each SMILES:
ersilia -v api run -i 'Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1'
Output:
ersilia -v api run -i 'C[C@]12CC[C@H](O)CC1=CC[C@@H]3[C@@H]2CC[C@@]4(C)[C@H]3CC=C4c5cccnc5'
Output:
ersilia -v api run -i 'CC(=O)Nc1sc(nn1)[S](N)(=O)=O'
Output:
ersilia -v api run -i 'CC(O)=O'
Output:
ersilia -v api run -i 'CC(=O)N[C@@H](CS)C(O)=O'
Output:
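Typing one command per molecule gets tedious for longer lists. The sketch below wraps the same `ersilia -v api run -i` call shown above in a loop. The helper names `build_run_command` and `run_all` are hypothetical, and the sketch assumes the model has already been fetched and served in the current session. Passing the command as a list avoids shell quoting problems with characters such as `[`, `@`, and `(` in SMILES.

```python
import subprocess
from typing import Callable, List, Sequence, Tuple


def build_run_command(smiles: str) -> List[str]:
    """Build the argument list for the same CLI call used above."""
    return ["ersilia", "-v", "api", "run", "-i", smiles]


def run_all(
    smiles_list: Sequence[str],
    runner: Callable = subprocess.run,
) -> List[Tuple[str, str]]:
    """Run the served model on each SMILES and collect the captured stdout.

    `runner` defaults to subprocess.run; it is a parameter only so the
    loop can be exercised without Ersilia installed.
    """
    results = []
    for smiles in smiles_list:
        completed = runner(build_run_command(smiles), capture_output=True, text=True)
        results.append((smiles, completed.stdout))
    return results
```

For example, `run_all(['CC(O)=O', 'CC(=O)N[C@@H](CS)C(O)=O'])` would issue the last two commands above one after the other and return their captured output as text.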
Hi @Tshifhumulo10, thank you for the updates so far. Could you comment on how the results from the original implementation of STOUT compare with those from the Ersilia implementation? For example, do you notice any differences?
You can also move to week 3 tasks afterwards.
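One way to make such a comparison concrete is to line up the names predicted by both implementations for the same inputs and count how many agree. The sketch below is only an illustration: `normalize` and `compare_predictions` are hypothetical helpers, and the two lists are assumed to have been collected by hand from the outputs of each implementation, in the same input order.

```python
from typing import Dict, List


def normalize(name: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences are ignored."""
    return " ".join(name.lower().split())


def compare_predictions(original: List[str], ersilia: List[str]) -> Dict[str, object]:
    """Count matching predictions and list the positions where they differ."""
    if len(original) != len(ersilia):
        raise ValueError("both lists must have the same length")
    mismatches = [
        i
        for i, (a, b) in enumerate(zip(original, ersilia))
        if normalize(a) != normalize(b)
    ]
    total = len(original)
    return {
        "total": total,
        "matching": total - len(mismatches),
        "mismatch_indices": mismatches,
    }
```

The `mismatch_indices` entry points back to the rows worth inspecting by eye, since two names can differ in spelling and still describe the same compound.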
Hello,
Thanks for your work during the Outreachy contribution period, we hope you enjoyed it! We will now close this issue while we work on the selection of interns. Thanks again!
Week 1 - Get to know the community
Week 2 - Install and run an ML model
Week 3 - Propose new models
Week 4 - Prepare your final application