Closed github-actions[bot] closed 1 year ago
Hi @GemmaTuron and @ZakiaYahya, the model works using Colab, the CLI, and Docker, though it is somewhat slow. eos4q1a_colab_output.csv eos4q1a_cli_output.csv eos4q1a_docker_output.csv
(ersilia) hellenah@hellenah-elitebook:~$ ersilia serve eos4q1a
🚀 Serving model eos4q1a: crem-structure-generation
URL: http://0.0.0.0:46339
PID: -1
SRV: pulled_docker
👉 To run model:
- run
💁 Information:
- info
(ersilia) hellenah@hellenah-elitebook:~$ ersilia -v run -i ~/test.csv -o eos4q1a_cli_output.csv > eos4q1a_predict.log 2>&1
(ersilia) hellenah@hellenah-elitebook:~$ ersilia run -i "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
{
"input": {
"key": "MLBNXJTXHVBPEC-UHFFFAOYSA-N",
"input": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
"text": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
},
"output": {
"outcome": [
"NS(=O)(=O)Cc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
"CCC(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
"CCC(=O)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
"O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1",
"CCON=Cc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
...
One thing I'd like to understand: is this the flexible list type? The output is saved to the file as a JSON string in a single column, instead of as individual columns:
key,input,outcome
MLBNXJTXHVBPEC-UHFFFAOYSA-N,FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,"{""outcome"": [""CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1"", ""O=P(Nc1cccc(Cl)c1)(Oc1ccc(OC(F)F)cc1)Oc1ccc2ccsc2c1"", ""Oc1cc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)ccc1OC(F)F"", ""CCCOc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CONC(=O)Nc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CNS(=O)(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"",
...
Hi @HellenNamulinda, thanks for checking. The output should be a .csv file. Since the length of the output is limited to 100 columns, it should not be difficult to convert it. @ZakiaYahya, can you have a look?
Hello @GemmaTuron, yes, the output is a .csv file and not a .json file. @HellenNamulinda, can you check it again? Thanks.
@ZakiaYahya, they are CSV files like the ones I attached. My concern was about the JSON string object stored in a single column:
"{""outcome"": [""CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1"", ""O=P(Nc1cccc(Cl)c1)(Oc1ccc(OC(F)F)cc1)Oc1ccc2ccsc2c1"", ""Oc1cc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)ccc1OC(F)F"", ""CCCOc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CONC(=O)Nc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CNS(=O)(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"",
...
It was confusing because I haven't worked on models with a flexible list as output (which I guess is related to Ersilia). Thanks!
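For reference, a JSON string column like the one above can be expanded into individual columns with a few lines of Python. This is only a minimal sketch and not Ersilia's own conversion code: the function name `expand_outcome` and the column names `outcome_00`, `outcome_01`, ... are hypothetical, and it assumes each `outcome` cell holds a complete JSON object with an `outcome` list.

```python
import csv
import io
import json

# Shortened sample in the same shape as the CSV excerpt above (two molecules only).
SAMPLE = (
    'key,input,outcome\n'
    'MLBNXJTXHVBPEC-UHFFFAOYSA-N,FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,'
    '"{""outcome"": [""CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", '
    '""O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1""]}"\n'
)


def expand_outcome(csv_text, width=100):
    """Expand the JSON-string 'outcome' column into `width` separate columns.

    Hypothetical helper: column names and layout are assumptions for illustration.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["key", "input"] + [f"outcome_{i:02d}" for i in range(width)])
    for row in reader:
        # The cell is a JSON object whose 'outcome' entry is the list of SMILES.
        smiles = json.loads(row["outcome"])["outcome"]
        # Pad with empty cells (or truncate) so every row has exactly `width` values.
        padded = (smiles + [""] * width)[:width]
        writer.writerow([row["key"], row["input"]] + padded)
    return out.getvalue()


if __name__ == "__main__":
    print(expand_outcome(SAMPLE, width=3))
```

Padding with empty cells keeps every row the same width even when the model returns fewer molecules than the maximum.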
Yes @HellenNamulinda,
Because the model outputs a flexible list, this is the format Ersilia returns when handling flexible lists. It has nothing to do with the model code itself. For your information, I'm attaching an output file that this model returns when it is run with run.sh; you can see from it that the model returns a single string in a single row/entry. But when the same model is run with --repo-path, the output format changes, simply because Ersilia's output.py formats it like that. For your review, I'm attaching both outputs here:
output_repo_path.csv
output_run.sh.csv
If you want to look into this in more detail, see the discussion on this here. Thanks
Thank you so much @ZakiaYahya, :clap: This is well understood.
Hi @ZakiaYahya, thanks for the detailed explanation. I think that, now that the model is fixed to 100 outputs, we can simply change the output type in the metadata file to "List" and it will then work.
Hello @GemmaTuron, okay. Should I change it in the metadata and open the PR again?
I'll change it directly here, no worries. Let me discuss with @miquelduranfrigola if we want to change all the flexible lists to lists
@GemmaTuron would you still like me to test this now? Or should I wait for a new PR?
@pittmanriley I've updated the metadata now to a List output, so you can check that indeed it returns the outcome as we want
Hi @GemmaTuron, I've tested this model on all three platforms and it seems to work well. It didn't find all 100 similar molecules for each of the inputs, but I think this is to be expected sometimes. Here are the outputs:
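To see how many molecules were actually returned per input, the non-empty outcome cells in each row can be counted. This is a rough sketch under assumptions: it supposes the List-format CSV has `key` and `input` as the first two columns followed by the outcome columns, with missing molecules left as empty cells. The function name `count_generated` and the sample data are hypothetical.

```python
import csv
import io

# Hypothetical sample: three outcome columns, with one molecule missing in the row.
SAMPLE = (
    "key,input,outcome_00,outcome_01,outcome_02\n"
    "MLBNXJTXHVBPEC-UHFFFAOYSA-N,FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,"
    "CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,"
    "O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1,\n"
)


def count_generated(csv_text, n_fixed=2):
    """Return {key: number of non-empty outcome cells} for each data row.

    `n_fixed` is the assumed number of leading non-outcome columns (key, input).
    """
    reader = csv.reader(io.StringIO(csv_text))
    next(reader, None)  # skip the header row, if any
    counts = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        counts[row[0]] = sum(1 for cell in row[n_fixed:] if cell.strip())
    return counts


if __name__ == "__main__":
    for key, n in count_generated(SAMPLE).items():
        print(f"{key}: {n} molecules")
```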
CLI: eos4q1a_cli.csv Docker: eos4q1a_docker.log Colab: eos4q1a_colab.csv
yes, that is fine! thanks!
This model is ready for testing. If you are assigned to this issue, please try it out both on the CLI and Google Colab and let us know if it works!