ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

✍️ Contribution period: Joyesh Banerjee #832

Closed joiboi08 closed 1 year ago

joiboi08 commented 1 year ago

Week 1 - Get to know the community

Week 2 - Install and run an ML model

Week 3 - Propose new models

Week 4 - Prepare your final application

joiboi08 commented 1 year ago

Hello everyone. Excited to start contributing. I am now starting with the installation of the Ersilia Model Hub! I will also work on a motivation letter alongside it and will update progress on both shortly. I am working on a Windows machine and following the instructions mentioned here!

joiboi08 commented 1 year ago

WEEK 1 - Updates

Task 1 - I joined Ersilia's Slack channel from their Outreachy landing page and was welcomed by a community of warm peers and team leads! It was reassuring and exciting to be part of something like this.

Task 2 - Opened this issue with success! :)

Task 3 - Since I am using a Windows platform, I installed WSL and an Ubuntu terminal environment as mentioned here. I faced a small issue where the Ubuntu terminal did not recognise WSL, so I had to manually enable it from Windows Features. It worked fine afterwards, and I continued through the mentioned steps.

I installed all prerequisites -

After the prerequisites, I installed the Ersilia tool! Here I ran into an issue: even after the installation was done and I had activated the conda environment, I could not run `ersilia --help` or `ersilia catalog`. I determined this was because my WSL version was outdated, so I updated it to WSL 2 and made sure Ubuntu was using WSL 2. I also configured Docker Desktop to use WSL 2 after the update. This fixed my problem and I was finally able to install Ersilia!!
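For anyone hitting the same wall, the WSL update can be done from PowerShell with commands along these lines (a sketch; the distro name may differ on your machine):

```
wsl --update                   # update the WSL kernel
wsl --set-version Ubuntu 2     # migrate an existing Ubuntu distro to WSL 2
wsl --set-default-version 2    # make WSL 2 the default for new distros
```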

The installation guide as well as my peers like @leilayesufu who shared a detailed documentation of their journey were of immense help wherever I got stuck and I am grateful to them.

Finally, onto testing!

joiboi08 commented 1 year ago

Task 3 cont.

The first few testing steps were calling a catalog function and running a simple model. Alas, I am facing issues when calling `ersilia catalog`: it reports an Errno 101 Network Unreachable error. I have attached a log file below. myfile.log `ersilia --help` works fine, so I moved on to `ersilia -v fetch eos3b5e`, which also fails, with a different error (log file below). fetchLog.log

joiboi08 commented 1 year ago

I have found a solution to this issue. Certain internet service providers in India block raw.githubusercontent.com, and that was causing the Errno 101. People can try switching to a different service provider or using a VPN, but a more feasible solution is changing your DNS to Cloudflare DNS: `1.1.1.1` and `1.0.0.1` for IPv4, and `2606:4700:4700::1111` and `2606:4700:4700::1001` for IPv6.

Changing to Google DNS is also working.
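For anyone on WSL who wants to apply this, a minimal sketch of the change (this assumes a WSL2 setup where /etc/resolv.conf is auto-generated; the values mirror the Cloudflare addresses above):

```
# /etc/wsl.conf - stop WSL from regenerating resolv.conf on startup
[network]
generateResolvConf = false

# /etc/resolv.conf - point DNS at Cloudflare (restart WSL with `wsl --shutdown` first)
nameserver 1.1.1.1
nameserver 1.0.0.1
```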

The DNS change has worked well for me, and I have successfully tested and used a simple model deployment of Ersilia, getting the desired result specified in the instructions. As instructed on Slack, I used Python version 3.10.12 and did not install isaura (thanks to @carcablop), hence I did not face this issue.

This was a great learning experience for me. I faced errors where I thought everything would go smoothly, whereas the places where I expected trouble flowed freely. A big bag of my gratitude to the supportive peers who took the time to document their experiences comprehensively, and to the community leaders for going through everyone's issues and giving prompt solutions here and on Slack.

I'm excited for what is to come next! I will add shortly my motivations for applying to Outreachy and what I aim to achieve.

I will now mark task 3 as completed!

joiboi08 commented 1 year ago

Motivation Statement - Task 4

My name is Joyesh Banerjee. I am an engineering graduate from India with a degree in electronics and communication engineering. I started out with C and Java but shifted to Python after a while because I was developing an interest in data analysis, and later data science in general. I did a few college projects in which we trained a prediction model on open-source datasets, and I found the work lively and rewarding. I come from a lower-middle-class family and we have always gotten by by the skin of our teeth, but thanks to God, we have come far. Every parent works themselves to the bone to give their children a better platform to grab opportunities than they had, and I believe this is the kind of platform they envisioned for us.

I came across Outreachy as a recommendation from a friend who had applied previously, and I was warmed by how much soul they had as an organization. They were incredibly inclusive and gave me a chance to really tell my story in my application, which I appreciated - as did many of my peers, I am sure. So I was very happy to be given a chance as a contributor here, because when I was going through the projects, I felt a similar warmth from Ersilia. This was corroborated by the cordial and supportive team I found when I joined their channels. They were prompt and provided quick resolutions for issues. The best part was that even when many of us were facing issues in some initial tasks, and a solution had not yet been found, the team stayed interactive and didn't keep us in the dark.

I already have some experience with training models, so I firmly believe my time in the internship will be the perfect incubator for my skills to grow rapidly and show themselves in the best of ways - helping people in need. For that I am primed and ready to learn and apply myself to the fullest. After the internship period has ended, I intend to keep applying my competence here by working with the community to plan meaningful contributions. I also want to challenge myself by learning new tech stacks like cloud (AWS) to make deployment of these models more efficient for users and further support Ersilia's growth. Eventually, I want to keep learning new skills to find ways to enhance current processes.

I have also seen my fair share of medical issues and what hurts most is medical incompetence. I lost my grandmother to undiscovered side effects during her treatment which blindsided the family and the doctors on her case. When I went through Ersilia's objectives and intentions with this technology, I couldn't help but think what could have happened if someone had thought of this a decade earlier.

As the saying goes - "The best time to plant a tree was 10 years ago. The second best time is now."

This community is working on a novel goal that will help countless people and I wish to become part of that effort to my utmost. It will swell my heart with joy if I can help further this idea to reality - so that a decade from now maybe one less child will wish someone had thought of something today.

joiboi08 commented 1 year ago

Submit your first contribution to Outreachy - Task 5

After providing a detailed motivational letter, I have submitted a contribution report through Outreachy and linked this issue as instructed! The contribution has been recorded successfully.

This marks the end of all Week 1 tasks, which are hence brought to closure!

DhanshreeA commented 1 year ago

Hello @joiboi08 thank you for the detailed updates. If you'd like, you can get started with the tasks from week 2 now. :)

joiboi08 commented 1 year ago

> Hello @joiboi08 thank you for the detailed updates. If you'd like, you can get started with the tasks from week 2 now. :)

Thank you! I'll update my progress in week 2 shortly.

joiboi08 commented 1 year ago

Week - 2

After completing the Week 1 tasks, attending a wonderful and informative session, and taking some personal time, I have begun work on Week 2!

This week's tasks bring us the closest to real internship work so far. A summary of the tasks as I understood them -

This is my interpretation of the Week 2 tasks. If I misinterpreted something, I am happy to be corrected.

Task 1 - Choose a model, explain why and run it locally

For this task, I was torn between the PPBopt model and the STOUT model. My understanding of the former is that it is a prediction engine that predicts how strongly a compound binds to proteins in the blood for transportation to target sites, finding important use cases in optimizing drug development costs and time. This model was interesting and is also the closest to my experience, as I have worked on prediction models before.

However, I wanted to try something different, and the STOUT model piqued my interest. My understanding is that it converts a compound's SMILES string (essentially an ASCII-symbol representation of its structure, making it machine readable) to its IUPAC-defined name and vice-versa. This model was also much better documented, and I felt I saw a more concrete roadmap here.

Hence, I opted to work with the STOUT model.
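To make the model's job concrete, here is a minimal usage sketch based on the API documented in the STOUT repo (the caffeine example is illustrative):

```python
from STOUT import translate_forward, translate_reverse

# SMILES -> IUPAC name (forward translation)
smiles = "CN1C=NC2=C1C(=O)N(C(=O)N2C)C"  # caffeine
print(translate_forward(smiles))

# IUPAC name -> SMILES (reverse translation)
iupac = "1,3,7-trimethylpurine-2,6-dione"
print(translate_reverse(iupac))
```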

joiboi08 commented 1 year ago

Task 2 - Install and run the STOUT model locally

To start with, I am following these steps for the STOUT model installation.

First, I open the local Linux environment that I set up in the Week 1 tasks. I run some --version checks to ensure I am using appropriate versions - Python 3.11.4, Conda 23.9.0, and WSL 2.
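For completeness, the checks looked roughly like this (the WSL check runs from PowerShell on the Windows side):

```
python3 --version    # Python 3.11.4
conda --version      # conda 23.9.0
# from PowerShell on Windows:
#   wsl --status     # confirms the default WSL version is 2
```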

All good! Onto the installation!

I made two attempts to fetch the repo data, but both failed. Log file for the failure - fail.log. It could not find the pystow package. I tried pip install pystow but it did not work - it gave the same error again.

So I tried the alternate method mentioned in the repo - `pip install STOUT-pypi`. It successfully installed all required packages! (worth ~624MB!)

Since the model was now installed, I was ready to test!

joiboi08 commented 1 year ago

While testing, I am running into some issues.

I needed a dataset for testing and found a demo dataset in the STOUT repo and tried to use it. However, it keeps giving me an ImportError - log attached below. logs.log

EDIT - This is a post-Task 2-completion edit. Since I got the model working and the test file ran fine, I wanted to test it with a more comprehensive test file like the demo file I mentioned here. However, when I ran it, just like with VSCode, the Ubuntu terminal also hung with no return and my CPU usage hit a constant 100% again. I'll try to wait an hour like this and update progress here.

joiboi08 commented 1 year ago

I was primarily running this on Ubuntu but I have since switched to VSCode (and its CLI) for better maneuverability. I got the same ImportError referenced above. I found a solution by changing the relative import in the stout.py file to an absolute import: `from .repack import helper` becomes `from repack import helper` (a relative import fails when the file is executed directly as a script rather than imported as part of the package).

This allowed me to move forward. After I ran the code again, it finally downloaded the model and gave me a success message (in the VSCode CLI) that the model was loaded. But when compiling the code, it flagged the IUPAC_names_test.txt file with a FileNotFoundError, so I switched the working directory with cd STOUT, which solved it.

However, after this I ran the code hoping no issues would persist. But it sat in a suspended state and gave no output in the CLI. I checked Task Manager for clues and it showed a constant 100% CPU usage the whole time I was watching.

I am going to try again.

After much trying - Success!!

(screenshots of the successful test run)

Tested successfully!!

This period is shaping up to be a concrete learning experience for me. It gets a little mental sometimes but it is always rewarding when I manage to pull through!

Task 2 - Complete.

joiboi08 commented 1 year ago

Task 3 - Running the model with EML as the input dataset

Now that we have set up our model and tested it once, we can use it in a pseudo real-world scenario.

```python
import csv
from STOUT import translate_forward

# intention is to convert the EML csv into a list version of EML
with open("eml_canonical.csv", newline='') as eml_csv:
    reader = csv.reader(eml_csv)   # returns each row of EML as a list
    eml_list = list(reader)        # list of each EML row as a list

can_smiles_list = []  # empty list that will hold canonical smiles
for name in eml_list[1:21]:  # first 20 SMILES rows, excluding the header
    can_smiles_list.append(name[2])  # a list of canonical smiles to be translated

iupac_ = []  # empty list that will hold translated iupac names
for name in can_smiles_list:
    result = translate_forward(name)
    iupac_.append(result)
```

🔴 This is where I am facing an issue. I am running this in VSCode and it does NOT recognise `STOUT` as a module in the `from STOUT import ...` line. I have made sure my working directory is in the conda environment and I run my code from there, but it is not recognising it. Any help is appreciated!

The problem is resolved! 🟢 🟢

After a discussion with my peer @PromiseFru, I concluded that the problem was that I had to separately activate the STOUT conda environment in the VSC CLI again. Since my working dir was in the conda env made during the installation of the model, I thought this wouldn't be an issue. To fix it I can do 2 things - install conda integration in VSC, or manually activate the environment in the VSC terminal each session.

I chose to do the latter as it saved time, but I will install conda on VSC for future work.

```python
for i in iupac_:
    print(i)
```

(screenshot of the printed output)

```python
# writes the list of translated iupac names to the file 'predicted_iupac.csv'
with open("predicted_iupac.csv", "w", newline='') as trans_iupac:
    writer = csv.writer(trans_iupac)
    for i in iupac_:
        writer.writerow([i])  # wrap in a list so the whole name stays in one cell
```

Working data and Result data CSV files :

Full code for your perusal

```python
# Since the EML file has canonical SMILES names,
# we import only translate_forward to translate from SMILES to IUPAC
import csv
from STOUT import translate_forward

#! CONVERTING EML CSV TO EML LIST OF LISTS
# intention is to convert the EML csv into a list version of EML
with open("eml_canonical.csv", newline='') as eml_csv:
    reader = csv.reader(eml_csv)   # returns each row of EML as a list
    eml_list = list(reader)        # list of each EML row as a list

#! EXTRACTING to-be-translated CANONICAL FORMS FROM SOURCE EML LIST
can_smiles_list = []  # empty list that will hold canonical smiles
for name in eml_list[1:21]:  # first 20 SMILES rows, excluding the header
    can_smiles_list.append(name[2])  # a list of canonical smiles to be translated

iupac_ = []  # empty list that will hold translated iupac names
for name in can_smiles_list:
    result = translate_forward(name)
    iupac_.append(result)

# writes the list of translated iupac names to the file 'predicted_iupac.csv'
with open("predicted_iupac.csv", "w", newline='') as trans_iupac:
    writer = csv.writer(trans_iupac)
    for i in iupac_:
        writer.writerow([i])  # one name per row, kept in a single cell
```
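One pitfall worth flagging in the writer step (and the reason for the one-element list above): `csv.writer.writerow` treats a bare string as a sequence of characters, so each letter would land in its own column. A tiny self-contained illustration:

```python
import csv, sys

w = csv.writer(sys.stdout)
w.writerow("abc")    # writes: a,b,c  (each character in its own column)
w.writerow(["abc"])  # writes: abc    (the whole string in one cell)
```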

Task 3 Completed!

joiboi08 commented 1 year ago

Task 4 - Docker Deployment of the Ersilia Hub implementation of the STOUT Model

Now, the model is ready to use!

The general command format is `ersilia api run -i <input_file.csv> -o <desired_output_file_name.csv>`, so I ran:

```
ersilia -v api run -i task3.csv -o result3.csv
```

Please advise @DhanshreeA @carcablop @HellenNamulinda

errorLog.log

leilayesufu commented 1 year ago

Have you tried giving it a single input as opposed to the entire EML file to test it?

joiboi08 commented 1 year ago

Hi @leilayesufu, I haven't processed the entire file yet. I was feeding it a modified dataset task3.csv of 2 inputs as a test before giving it the entire EML set.
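For anyone reproducing this, a small test file like that can be built in a few lines; a sketch (the header name and the two SMILES values are illustrative, not the exact contents of my task3.csv):

```python
import csv

rows = [
    ["smiles"],    # header row
    ["CCCC"],      # illustrative SMILES input
    ["CC(=O)O"],   # illustrative SMILES input
]

with open("task3.csv", "w", newline='') as f:
    csv.writer(f).writerows(rows)
```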

leilayesufu commented 1 year ago

Okay, try testing it with a single input directly though, not through the file: `ersilia -v api run -i "Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1"`

joiboi08 commented 1 year ago

I've run into a worse problem. I am unable to fetch or serve models; I keep getting the connection reset by peer error without fail. I have reinstalled the environment multiple times without this changing. ConnectionResetLog.log

leilayesufu commented 1 year ago

I'm going to try to do it, and I'll get back to you

joiboi08 commented 1 year ago

Thank you so much. I look forward to hearing about your experience. I am using Ubuntu 22.04

leilayesufu commented 1 year ago

Hi, so I fetched the model and served it as seen

fetch and serve

Then I ran `ersilia card eos4se9`, and the output showed as seen here

"Code": "$ ersilia serve smiles2iupac\n$ ersilia api -i 'CCCOCCC'\n$ ersilia close",

So to run predictions, I just did `ersilia api -i "CCCC"` and I got the output below

This was just a simple test, although @PromiseFru ran it with some inputs from the EML file and it gave him a null output

joiboi08 commented 1 year ago

Hi @leilayesufu Thank you for trying this out yourself. I reinstalled my environment and tried your steps to the letter, but I kept getting one of three errors when I tried to use `ersilia api -i "CCCC"`

leilayesufu commented 1 year ago

Hi, I'm thinking it could be your network then

joiboi08 commented 1 year ago

Hi @leilayesufu I have a strong connection, and I have also made sure I am not running into this error again, as I can view raw.githubusercontent.com files. I was previously able to fetch and run models but have only recently become unable to do so.

leilayesufu commented 1 year ago

Hi, I would suggest removing the entire environment and starting afresh, or you could wait for a mentor's opinion. @DhanshreeA

DhanshreeA commented 1 year ago

Hi @joiboi08 as discussed over Slack, let me look into this more. I will get back to you by tomorrow.

joiboi08 commented 1 year ago

Thank you @DhanshreeA and @leilayesufu. I am looking forward to the updates. Meanwhile, is it okay if I move on to the Week 3 tasks for now?

leilayesufu commented 1 year ago

Hi, since the problem is a geographical one, I'll suggest using a VPN and changing your location to complete your Week 2 tasks. Of course, you'll need the go-ahead from @DhanshreeA

joiboi08 commented 1 year ago

Hi @leilayesufu, thank you for your suggestion. I ran a VPN and did a fresh install of Ersilia, the conda environment, and the git packages. It can now successfully fetch and serve models, so I am a little relieved. I believe my peer @Ajoke23 also mentioned this on Slack, thank you as well. @DhanshreeA the VPN is working as an interim solution for the regional service outages.

Currently, I am facing this problem again - `TypeError: object of type 'NoneType' has no len()`. I am trying some solutions and will update here.

🟢 🟢 SOLUTION - On the advice of my peer @AlphonseBrandon, I added headers to my input file and it worked. Thank you so much.

* Input file
  [eml.csv](https://github.com/ersilia-os/ersilia/files/12895576/eml.csv)
  This file has the first 20 rows of canonical SMILES names (excluding the header row) to match the 20 rows of input used in the third-party STOUT model implementation.

Now, I feed this file into my fetched model eos4se9 using the command `$ ersilia -v api run -i 'eml.csv' -o 'result.csv'`

* Now, I get the result output file
  [result.csv](https://github.com/ersilia-os/ersilia/files/12895771/result.csv)

BUT

The first 9 rows do not have a translation. I ran it again and this time I did NOT have any rows translated. During both runs, two things happened consistently -

1. Batch prediction failed and it switched to individual prediction
2. I got a 504 error from every single row that failed to translate

* I always get these DEBUG logs :

```
11:24:14 | DEBUG    | Starting Docker Daemon service
11:24:14 | DEBUG    | Creating temporary folder /tmp/ersilia-nb24uaof and mounting as volume in container
11:24:14 | DEBUG    | Image ersiliaos/eos4se9:latest is available locally
11:24:14 | DEBUG    | Using port 46089
```

* I didn't really have experience with Docker, but from what I could tell, the inputs were given to a **docker container** running the `eos4se9` model, which in turn returned predictions. Googling `error 504` informed me it is a timeout error. So essentially, I was not getting translated outputs because the requests to the container kept timing out.

* I searched around and found two solutions -
  1. Instead of communicating with the container over the network, run the predictions directly from within the container.
  2. Increase the nginx request timeout.

* I chose **option 1** as it will allow me to gain more experience with Docker.

* So - as I fetch and serve the models on Ubuntu, I see a corresponding container being created on Docker Desktop. From there I can get the `container ID` to call it in Ubuntu. From Docker Desktop, my current running container has the id `eos4se9_7a24`.

* Since processing the entire EML dataset would take an impractical amount of time (mainly due to hardware limitations), I have taken the first 20 rows of the dataset as input for both the third-party STOUT model and the Ersilia Hub model.

* I already have a modified dataset [er_task3.csv](https://github.com/ersilia-os/ersilia/files/12907831/er_task3.csv), so the next step is to copy this file into the working dir of my container.

* I was able to complete this step using the docker `cp` command -

```
$ docker cp er_task3.csv eos4se9_7a24:/root
```

Thank you @leilayesufu @PromiseFru for helping me figure out the container dir!

* Now the modified dataset er_task3.csv is copied into the working dir of the container. To access this container through Ubuntu, I use the command -

```
$ docker exec -it eos4se9_7a24 sh
```

* Check to see if the dataset is present using `# ls`
  ![image](https://user-images.githubusercontent.com/94055810/275242982-3aad8167-456f-4d01-b69e-a434d08b3e22.png)

It is! Great!

* We input the dataset here, run the model, and write the result to a file -

```
# ersilia -v api run -i er_task3.csv -o er_result.csv
```

* Now, the generated result is still in the container; to access it, we need to copy it to our local system -

```
$ docker cp eos4se9_7a24:/root \\wsl.localhost\Ubuntu\home\joyesh\miniconda3\envs\ersilia
```

Here, the container files are copied over to the mentioned destination and we can easily find our result file there.

er_result.csv

Successfully predicted all SMILES names to IUPAC!

Comparison between the STOUT implementation and the Ersilia implementation

After getting both results, I wanted to combine the two result files into a single csv or excel file. For that, I wrote some python code to :

* turn both .csv files into respective lists

I used the csv module again

```python
import csv

with open("er_result.csv", newline='') as ers:
    ers_base_list = list(csv.reader(ers))  # list of lists, one per Ersilia translation row

ers_list = []
for name in ers_base_list[1:]:  # skip the header row
    ers_list.append(name[2])    # keep only the iupacs_names column

with open("111predicted_iupac111.csv", newline='') as stout:
    stout_base_list = list(csv.reader(stout))  # list of lists of STOUT translation rows

stout_list = []
for name in stout_base_list:    # no headers in this file
    stout_list.append(name[0])  # list of STOUT translations
```

* combine those lists into **one** list of the format `[<STOUT iupac name>, <Ersilia iupac name>]`

```python
result_list = []
for i in range(0, 20):  # because 20 SMILES names were translated
    result_list.append(stout_list[i])  # adding the STOUT IUPAC name
    result_list.append(ers_list[i])    # adding the Ersilia IUPAC name
```

* turn that list into a single .csv file

```python
with open("comparison_result.csv", "w", newline='') as comp:
    writer1 = csv.writer(comp)
    for i in result_list:
        writer1.writerow([i])  # one name per row
```

* The resultant comparison file
  [comparison_result.csv](https://github.com/ersilia-os/ersilia/files/12907649/comparison_result.csv)

My Interpretations

* The third-party STOUT model was, for me, more modular in terms of determining an output format. The output file generated [as seen here](https://github.com/ersilia-os/ersilia/files/12896825/111predicted_iupac111.csv) does NOT have excess columns, only the translated IUPAC names. This made it easier to work with as it required less cleaning/prepping.

* The Ersilia model, however, gives a verbose multi-column output without an opportunity to change that format, since the output file is generated directly by the model. In the STOUT implementation, by contrast, we imported the STOUT module and used the `translate_forward` function in code we wrote from scratch.

* On the flip side, this makes the Ersilia model implementation more time efficient, as no `python` needs to be written to generate an output file. This carries greater weight here, where deployment time and efficiency matter most.

* As for the result contents themselves, there is mostly no difference save for a few cases. The models perform similarly or identically for smaller, less complex inputs like `CCCC` or `CC(=O)O`

* But minor differences start to show for larger, more complex inputs, as seen here :

(3S,8R,9S,10R,13S,14S)-10,13-dimethyl-17-pyridin-3-yl-2,3,4,7,8,9,11,12,14,15-decahydro-1H-cyclopenta[a]phenanthren-3-ol

(1S,2S,5S,10R,11R,14S)-5,11-dimethyl-5-pyridin-3-yltetracyclo[9.4.0.02,6.010,14]pentadeca-7,16-dien-14-ol
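A side note on presentation: writing the two names side by side as columns might make the comparison easier to scan. A minimal sketch, reusing the `stout_list` and `ers_list` built above:

```python
import csv

# assumes stout_list and ers_list from the comparison code above
with open("comparison_side_by_side.csv", "w", newline='') as comp:
    writer = csv.writer(comp)
    writer.writerow(["stout_iupac", "ersilia_iupac"])  # header row
    for s, e in zip(stout_list, ers_list):
        writer.writerow([s, e])  # one compound per row, two columns
```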

This week was the greatest challenge yet as I made myself familiar with new technology and got stuck A LOT!! However, it was joyous to see myself progress. Excited for the next tasks!

Marking Week - 2 complete!

joiboi08 commented 1 year ago

Week - 3

Marks the start of some real field work!

First Model Proposition

PIGNet2 - A Versatile Deep Learning-based Protein-Ligand Interaction Prediction Model for Binding Affinity Scoring and Virtual Screening

My interpretation of the model :

Relevance to Ersilia

DhanshreeA commented 1 year ago


Hi @joiboi08 many congratulations on making it this far. Good job on learning more about working with Docker, and thank you @leilayesufu and @PromiseFru for all the help here.

As for the network connection issue we faced earlier, it seems to have resolved on its own after a couple of days, and I can work with Ersilia normally again without a VPN. (As guessed, it was probably a regional outage.)

joiboi08 commented 1 year ago

I'm having fun learning new things! @DhanshreeA And thank you for the update! It is a relief that it is not something permanent ☺️

joiboi08 commented 1 year ago

Running The Model

To run this model, I followed the instructions mentioned in their repository.

```
pip3 install torch torchvision torchaudio
```

This command was generated by the PyTorch website according to the options you select.

```
gh repo clone ACE-KAIST/PIGNet2
cd PIGNet2                          # enter the repo before installing requirements
conda create -n pignet2 python=3.9
conda activate pignet2
pip install -r requirements.txt
cd dataset
bash download.sh
bash untar.sh
```
bash untar.sh

I have been facing some hardware problems with its implementation, in that it eats up all the available space in my C drive and sometimes uses up all the RAM, causing other applications to fail. I am looking into an alternative implementation that may work better than running it locally.
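One mitigation I am looking at for the RAM and disk pressure is capping WSL 2's resources from the Windows side via a `.wslconfig` file (a sketch; the limits are illustrative, not recommendations):

```
# %UserProfile%\.wslconfig  (restart WSL with `wsl --shutdown` to apply)
[wsl2]
memory=6GB      # cap the RAM available to the WSL 2 VM
processors=4    # cap the CPU cores
swap=8GB        # cap the swap size
```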

joiboi08 commented 1 year ago

Second Model Proposition

ChemProp - A Message Passing Neural Network for Molecular Property Prediction and its Application in A Deep Learning Approach to Antibiotic Discovery

My interpretation of the model :

Relevance to Ersilia

joiboi08 commented 1 year ago

Third Model Proposition

AMPlify: attentive deep learning model for discovery of novel antimicrobial peptides effective against WHO priority pathogens

My interpretation of the model :

Relevance to Ersilia

joiboi08 commented 1 year ago

Just updating that I have submitted the final application through Outreachy, along with a timeline, on 27 October 2023!

GemmaTuron commented 1 year ago

Hello,

Thanks for your work during the Outreachy contribution period, we hope you enjoyed it! We will now close this issue while we work on the selection of interns. Thanks again!