Closed Richiio closed 11 months ago
Metadata.json update without model included
Successfully done
- Re-installed Ersilia
- Successfully tested the main code using the test data given in the source code.
Where I am at currently
The description check fails during the workflow. The hint given is that the description should be longer than 200 characters and different from the title. I verified that it is longer than 200 characters and is not the title, so more information on this would be appreciated. Because this check fails, the workflow can't proceed to checking the output and log files for the main model.
Attempts but not confirmed
Attempted, but can't yet confirm because the metadata.json description workflow is failing:
- Added the trained models from the source code into the checkpoints folder
- Edited the main.py file to run predictions on input data with the trained model and return the output as a CSV file
- Edited the Dockerfile to install the dependencies needed to run my main.py file; this included asking it to install torch and pickle
Where I need clarification
The source code has both classification and regression tasks. When testing the models, the code I wrote runs predictions for every saved .pk file it finds in the checkpoint directory, but I have a feeling we need to specify whether it is a classification or a regression task.
@Richiio, I have checked the metadata file, and there is indeed an issue with the description. It is a list instead of a string, so checking the length of the description returns the length of the list (the number of items) rather than the length of the string.
Since the description is generated automatically when the model repo is created, we will update this on the Ersilia side. For now, change the description from a list to a string.
From
"Description": [
"Prediction of antimicrobial potential using a dataset of 29537 compounds screened against the antibiotic resistant pathogen Burkholderia cenocepacia. The model uses the Chemprop Direct Message Passing Neural Network (D-MPNN) abd has an AUC score of 0.823 for the test set. It has been used to virtually screen the FDA approved drugs as well as a collection of natural product list (>200k compounds)",
"with hit rates of 26% and 12% respectively."
]
To
"Description":
"Prediction of antimicrobial potential using a dataset of 29537 compounds screened against the antibiotic resistant pathogen Burkholderia cenocepacia. The model uses the Chemprop Direct Message Passing Neural Network (D-MPNN) and has an AUC score of 0.823 for the test set. It has been used to virtually screen the FDA approved drugs as well as a collection of natural product list (>200k compounds), with hit rates of 26% and 12% respectively."
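To illustrate why the check fails, here is a minimal Python sketch of the behaviour described above. It is not Ersilia's actual validation code: `MIN_DESCRIPTION_LENGTH`, `description_length`, and `normalize_description` are hypothetical names, and the placeholder text stands in for the real description. The point is only that `len()` on a list counts items, while `len()` on a string counts characters.

```python
# Hypothetical threshold taken from the workflow hint ("longer than 200 characters").
MIN_DESCRIPTION_LENGTH = 200


def description_length(metadata):
    """Return len() of the Description field, as a naive length check would."""
    return len(metadata["Description"])


def normalize_description(metadata):
    """Return a copy of the metadata whose Description is a single string.

    Joining with a space is an assumption; adjust the separator (or edit the
    text by hand) if the pieces need different punctuation between them.
    """
    description = metadata["Description"]
    if isinstance(description, list):
        description = " ".join(part.strip() for part in description)
    fixed = dict(metadata)
    fixed["Description"] = description
    return fixed


# Placeholder metadata shaped like the broken file: a list of two strings.
broken = {
    "Description": [
        "A" * 250,  # stands in for the long first sentence block
        "with hit rates of 26% and 12% respectively.",
    ]
}

# The list has two items, so a length check sees 2, not the character count.
print(description_length(broken) > MIN_DESCRIPTION_LENGTH)

fixed = normalize_description(broken)
# After joining, the check sees the number of characters in the string.
print(description_length(fixed) > MIN_DESCRIPTION_LENGTH)
```

With the list form, the first check prints `False` even though the text itself is long enough; after converting to a string, the second prints `True`.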
Ahhh yes, that's true. Thanks for the feedback @HellenNamulinda
Initial Metadata.json update