Closed by github-actions[bot] 1 year ago
@GemmaTuron, the model is fetched successfully in both Colab and the CLI, but the fetch time was very long in the CLI. colab_eos7a45_output.log eos7a45_fetch_log.log
However, some input SMILES produce outputs while others cause errors (status 500). The failing SMILES force Ersilia to retry the prediction, and this repeats for a long time, as can be seen in these log files. This happens in both CLI and Colab predictions. colab_eos7a45_prediction.log eos7a45_predict.log
I see, that is helpful, thanks @samuelmaina ! The model is in Docker as well, can you try it out? @emmakodes can you also try this?
Thanks!
@GemmaTuron, the model is quite large and requires a lot of data. If the internet connection fails, the download starts anew. I am looking for a stable internet source and will test it then. I am also researching ways to reduce the size of the amd64 build.
Hi @samuelmaina !
ok, let me know if you succeed otherwise we'll find a workaround
@GemmaTuron, there are some SMILES for which the model throws errors. @HellenNamulinda, can you look at these logs?
But the model runs fully without crashing? what is the output for the smiles that do not work?
please check with a 10 smiles file that contains one wrong
@GemmaTuron, the model did not finish predicting. I waited a very long time for the prediction to finish, but it would show this log
23:38:29 | ERROR | Status Code: 500
23:38:29 | WARNING | Batch prediction didn't seem to work. Doing predictions one by one...
every 15 minutes (as can be seen in the prediction logs). It kept retrying the one-by-one processing with the same logs after every "Doing predictions one by one" iteration, so I thought it was something like an infinite loop in Ersilia.
@samuelmaina and @GemmaTuron, yes, it is true that the model is slow at making predictions, and I did say that the model itself is very big. That is probably why it doesn't accept the batch size of 100 specified by Ersilia. Also, an inference that fails on the first try is repeated. For the EML dataset (442 records), the model usually takes approximately 1 hour to make predictions.
So this is all the more reason to document some of these model statistics, just as @miquelduranfrigola suggested.
@samuelmaina,
If you were using the EML dataset, remember it has 442 records, which means five batches. So "Batch prediction didn't seem to work." will happen five times. But it will do the predictions one by one and complete them all; it will just take long, approximately 1 hour.
Try a file with fewer molecules (like 10), as suggested by @GemmaTuron. As I said here, 5 records took approximately 3 minutes.
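The batch-then-fallback behaviour described above can be sketched roughly as follows. This is an illustration only, not Ersilia's actual code: predict_batch is a hypothetical stand-in for the call to the model server, and the failure condition is invented for the example.

```python
import math

BATCH_SIZE = 100  # the batch size mentioned in this thread

def predict_batch(smiles_batch):
    """Hypothetical model call: returns one value per SMILES, or raises
    on failure (analogous to the server answering with status 500)."""
    if any("X" in s for s in smiles_batch):  # invented failure condition
        raise RuntimeError("Status Code: 500")
    return [float(len(s)) for s in smiles_batch]  # dummy prediction

def predict_all(smiles):
    results = []
    for i in range(math.ceil(len(smiles) / BATCH_SIZE)):
        batch = smiles[i * BATCH_SIZE:(i + 1) * BATCH_SIZE]
        try:
            results.extend(predict_batch(batch))
        except RuntimeError:
            # "Batch prediction didn't seem to work. Doing predictions one by one..."
            for s in batch:
                try:
                    results.append(predict_batch([s])[0])
                except RuntimeError:
                    results.append(None)  # the failing molecule gets a null output
    return results
```

With 442 records and a batch size of 100, the outer loop runs five times, which matches the five "Batch prediction didn't seem to work" messages described above.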
Thanks very much @HellenNamulinda for the detailed explanation. I didn't know Ersilia does predictions 100 per batch. I also thought a status of 500 meant something went wrong with the model.
Hi @samuelmaina
Remember you can also look for the specific sentence in the Ersilia code and read more about it. In this case, if you use VSCode to search for "Batch prediction didn't..." you'll see the functions that manage this.
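As a rough equivalent of that editor search, a small helper like this (hypothetical, not part of Ersilia) can locate the message in a local clone of the repository:

```python
from pathlib import Path

def find_message(repo_root, needle):
    """Return (file, line number, line) for every Python source line
    under repo_root that contains the given text."""
    hits = []
    for path in sorted(Path(repo_root).rglob("*.py")):
        for lineno, line in enumerate(path.read_text(errors="ignore").splitlines(), 1):
            if needle in line:
                hits.append((str(path), lineno, line.strip()))
    return hits

# e.g. find_message("ersilia", "Batch prediction didn't")
```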
I have been trying to run this model all day. I see these messages at intervals while making predictions:
Status code: 200
Status Code: 500
Working on output: /tmp/ersilia-opn9dm2r/todo_output-chunk-3.json
Batch prediction didn't seem to work. Doing predictions one by one...
I think what that means is that the model tries to make a batch prediction; when the batch fails, you see "Batch prediction didn't seem to work. Doing predictions one by one..." and the model falls back to predicting each SMILES one by one. When the prediction for a SMILES succeeds you get the "Status code: 200" message, but if it fails you get "Status Code: 500". Just as @HellenNamulinda said, since the model is doing batch prediction, after each batch it writes that batch's output, hence "Working on output: /tmp/ersilia-opn9dm2r/todo_output-chunk-3.json", and at the end it merges all the batch outputs into a single file, hence "Merging 5 files into /tmp/ersilia-lqbgsjau/todo_output.json" at the end.

# ersilia -v api run -i "cccc"
10:38:47 | DEBUG | Getting session from /root/eos/session.json
10:38:47 | DEBUG | Getting session from /root/eos/session.json
10:38:47 | WARNING | Lake manager 'isaura' is not installed! We strongly recommend installing it to store calculations persistently
10:38:47 | ERROR | Isaura is not installed! Calculations will be done without storing and reading from the lake, unfortunately.
10:38:49 | DEBUG | Is fetched: True
10:38:49 | DEBUG | Schema available in /root/eos/dest/eos7a45/api_schema.json
10:38:49 | DEBUG | Setting AutoService for eos7a45
10:38:49 | INFO | Service class provided
10:38:56 | DEBUG | No file splitting necessary!
10:38:57 | DEBUG | Reading card from eos7a45
10:38:57 | DEBUG | Reading shape from eos7a45
10:38:57 | DEBUG | Input Shape: Single
10:38:57 | DEBUG | Input type is: compound
10:38:57 | DEBUG | Input shape is: Single
10:38:57 | DEBUG | Importing module: .types.compound
10:38:57 | DEBUG | Checking RDKIT and other requirements necessary for compound inputs
10:39:31 | DEBUG | InputShapeSingle shape: Single
10:39:32 | DEBUG | API eos7a45:run initialized at URL http://127.0.0.1:3000
10:39:32 | DEBUG | Schema available in /root/eos/dest/eos7a45/api_schema.json
10:39:32 | DEBUG | No file splitting necessary!
10:39:32 | DEBUG | Reading card from eos7a45
10:39:32 | DEBUG | Reading shape from eos7a45
10:39:32 | DEBUG | Input Shape: Single
10:39:32 | DEBUG | Input type is: compound
10:39:32 | DEBUG | Input shape is: Single
10:39:32 | DEBUG | Importing module: .types.compound
10:39:32 | DEBUG | Checking RDKIT and other requirements necessary for compound inputs
10:39:32 | DEBUG | InputShapeSingle shape: Single
10:39:32 | DEBUG | API eos7a45:run initialized at URL http://127.0.0.1:3000
10:39:32 | DEBUG | Schema available in /root/eos/dest/eos7a45/api_schema.json
10:39:32 | DEBUG | Posting to run
10:39:32 | DEBUG | Batch size 100
10:39:35 | DEBUG | Schema available in /root/eos/dest/eos7a45/api_schema.json
10:47:39 | DEBUG | Status code: 200
10:47:40 | DEBUG | Schema available in /root/eos/dest/eos7a45/api_schema.json
10:47:41 | DEBUG | Done with unique posting
{
"input": {
"key": "KAKZBPTYRLMSJV-UHFFFAOYSA-N",
"input": "C=CC=C",
"text": "C=CC=C"
},
"output": {
"value": [
20.95406723022461
]
}
}
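The per-chunk outputs and the final merge step discussed above can be sketched like this. The file names mirror the log excerpt, but the merge logic itself is an assumption for illustration, not Ersilia's actual implementation:

```python
import json

def merge_chunks(chunk_paths, out_path):
    """Combine per-chunk JSON outputs (each a list of records like the
    input/output record shown above) into a single output file."""
    merged = []
    for p in chunk_paths:
        with open(p) as fh:
            merged.extend(json.load(fh))  # each chunk holds a list of records
    with open(out_path, "w") as fh:
        json.dump(merged, fh, indent=2)
    return len(merged)

# e.g. merge_chunks(["todo_output-chunk-0.json", ...], "todo_output.json")
```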
For Google Colab, my code is still running; once it's done I will post the result immediately.
I was able to pull the Docker image and make predictions using the eml_canonical data: eos7a45_doc_predict.csv. The logs are: eos7a45_docker.log
I will rerun the CLI and Colab.
This model is ready for testing. If you are assigned to this issue, please try it out using the CLI, Google Colab and DockerHub and let us know if it works!