Closed paulinebanye closed 1 year ago
@GemmaTuron @miquelduranfrigola
I was finally able to create the environment, but I had issues running the model: python app.py returned the error _pickle.UnpicklingError: invalid load key, '<'.
After spending hours debugging, I cloned the development branch together with its submodules and ran the app again:
git clone --recursive -b development https://github.com/ncats/ncats-adme.git
python app.py
This began the download of the repos. @GemmaTuron @miquelduranfrigola please review.
I was able to successfully run predictions with the model using the provided csv file and the eml canonical file.
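An `invalid load key, '<'` error means that the first byte pickle read was `<`, which usually suggests that a markup file (for example an HTML page) sits where the model checkpoint should be. Below is a minimal sketch for checking this. It assumes nothing about the ncats-adme folder layout, and `describe_model_file` and its return labels are hypothetical names introduced only for illustration:

```python
def describe_model_file(path):
    """Guess what kind of file `path` is from its first bytes."""
    with open(path, "rb") as handle:
        head = handle.read(64)
    if head.lstrip().startswith(b"<"):
        # Looks like HTML/XML, not a serialized model.
        return "markup"
    if head[:1] == b"\x80":
        # Pickle protocol 2 and later start with the PROTO opcode 0x80.
        return "pickle"
    if head[:2] == b"PK":
        # Zip archives start with the bytes "PK".
        return "zip"
    return "unknown"
```

If a checkpoint is reported as "markup", the download probably did not complete as intended and the file needs to be fetched again.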
Hi Pauline, this is great news, thanks.
The authors provide a small Flask application to serve the models. We now need to take the models one by one and try to use them outside their application. This means making a simple version of app.py that does not use Flask, but simply loads the data, calls the model and prints the prediction on the screen. It might seem like a lot, but all the functions we need are already in the file; we just need to simplify it so that it can be used from the service.py file we use in Ersilia. In our case, we will do 1 model = 1 repository instead of the 5 models in the repository.
So, I'd say:
- Download the actual model checkpoints from the repo if you haven't already
- Activate the conda environment with all packages
- Try to run the model using the minimal code required (I think, in the RLM case, this will be the base, chemprop and rlm predictors)
In addition, I have a question regarding the output of the models: I see that in app.py they calculate some sort of similarity (line 280: # for all models except cyp450, calculate the nearest neigbors and add additional column to response_df). Do you know if this is happening or not? I cannot see it in the output.
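The Flask-free flow described above (load the data, call the model, print the prediction) could look roughly like the sketch below. It is not the authors' code: `read_smiles`, `predict`, the `smiles` column name and `placeholder_model` are all hypothetical, and the placeholder simply returns the length of the SMILES string so that the script runs without the real RLM predictor.

```python
import csv
import io


def read_smiles(handle, column="smiles"):
    """Read one SMILES string per row from a CSV file handle."""
    reader = csv.DictReader(handle)
    return [row[column] for row in reader if row.get(column)]


def predict(smiles_list, model):
    """Call the model on each molecule and pair input with output."""
    return [(smi, model(smi)) for smi in smiles_list]


def placeholder_model(smiles):
    # Hypothetical stand-in for the real predictor (an assumption).
    return float(len(smiles))


if __name__ == "__main__":
    # In-memory CSV used here in place of a real input file.
    sample = io.StringIO("smiles\nCCO\nc1ccccc1\n")
    for smi, score in predict(read_smiles(sample), placeholder_model):
        print(smi, score)
```

Swapping `placeholder_model` for a function that wraps the downloaded checkpoint would keep the rest of the script unchanged.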
Thank you @GemmaTuron. I have isolated the code to run only the RLM model and I have downloaded the model. I have never worked with Flask before, but I'm working on the functions now.
Regarding the nearest neighbors, I am unclear whether this is actually running; I don't see any evidence of it in the output files.
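One quick way to look for that evidence is to scan the header of an output file for neighbor-related column names. This is only a sketch: the keywords are guesses, since the thread does not say what the extra column in response_df is called, and `neighbor_columns` is a hypothetical helper.

```python
import csv
import io


def neighbor_columns(handle, keywords=("neighbor", "neighbour", "similar")):
    """Return header columns whose name contains any keyword."""
    header = next(csv.reader(handle), [])
    return [
        name for name in header
        if any(key in name.lower() for key in keywords)
    ]


if __name__ == "__main__":
    # Made-up header, used only to show the call.
    example = io.StringIO("smiles,Prediction\nCCO,1\n")
    print(neighbor_columns(example))
```

An empty list would suggest that the nearest-neighbor step did not add a column, or that the column uses a name outside the guessed keywords.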
Thanks @pauline-banye
Maybe we can try another model to check whether the nearest neighbors are being predicted. You do not need to worry about Flask: the idea is to remove it altogether, as we will use BentoML and Ersilia's environment. I'd suggest creating a new folder with the minimum code (probably from the /server folder: base, chemprop, features, utilities, rlm) and the rlm model, and trying to run a prediction.
Hi @GemmaTuron thank you, I'm taking a look at it now.
We are continuing the discussion on #512
/approve
@pauline-banye the Ersilia model repository has been successfully created and is available at:
ersilia-os/eos5505
Now that your new model repository has been created, you are ready to start contributing to it!
Here are some brief starter steps for contributing to your new model repository:
Note: Many of the bullet points below will have extra links if this is your first time contributing to a GitHub repository
- Update the README.md file to accurately describe your model
If you have any questions, please feel free to open an issue and get support from the community!
Hello @pauline-banye. I created a main.py file, using app.py as an example and removing what I considered unnecessary for running the code from the console. My idea is that you can read the input file and write the output of the predict function to a csv file. main_example.txt
I would like you to test whether this works in your installed Conda environment. I could not test it in mine: when I tried to install the dependencies, I got conflicts between them. I don't know if you ran into these conflict errors, but for me the installation fails when I try to install Keras and NumPy (I use Ubuntu 20.04 on Windows).
I hope this file will serve as a guide and help with the implementation. I understand that you will only use the "rlm" model, so I modified the predict_df function: before, it used a for loop to go through a list of models from a file passed in the application; now this is not necessary, since you just pass the one model, which I understand is in the model/rlm folder. I also removed several unnecessary Flask imports. When I tried to install each dependency in my conda environment, I realized that some dependencies were only needed to run the application with Flask. I recommend keeping this in mind when choosing the dependencies to install in the Dockerfile.
This model is now working! I'll close the issue
Model Name
RLM Stability
Model Description
Hepatic metabolic stability is a key pharmacokinetic parameter in drug discovery. Poor hepatic metabolic stability can prevent a drug from attaining sufficient in vivo exposure, producing short half-lives, poor oral bioavailability and low plasma concentrations.
Slug
rlm-stability
Tags
metabolic,stability,adme,drugdiscovery
Publication
An Automated High-Throughput Metabolic Stability Assay Using an Integrated High-Resolution Accurate Mass Method and Automated Data Analysis Software
Retrospective assessment of rat liver microsomal stability at NCATS: data and QSAR models
Analyzing Learned Molecular Representations for Property Prediction
Code
https://github.com/ncats/ncats-adme
License
No response