ersilia-os / eos4q1a

GNU General Public License v3.0

New model ready for testing! #10

Closed github-actions[bot] closed 1 year ago

github-actions[bot] commented 1 year ago

This model is ready for testing. If you are assigned to this issue, please try it out both on the CLI and Google Colab and let us know if it works!

HellenNamulinda commented 1 year ago

Hi @GemmaTuron and @ZakiaYahya, the model works on Colab, the CLI, and Docker, though it is somewhat slow. Outputs: eos4q1a_colab_output.csv eos4q1a_cli_output.csv eos4q1a_docker_output.csv

(ersilia) hellenah@hellenah-elitebook:~$ ersilia serve eos4q1a
🚀 Serving model eos4q1a: crem-structure-generation

   URL: http://0.0.0.0:46339
   PID: -1
   SRV: pulled_docker

👉 To run model:
   - run

💁 Information:
   - info
(ersilia) hellenah@hellenah-elitebook:~$ ersilia -v run -i ~/test.csv -o eos4q1a_cli_output.csv > eos4q1a_predict.log 2>&1
(ersilia) hellenah@hellenah-elitebook:~$ ersilia run -i "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
{
    "input": {
        "key": "MLBNXJTXHVBPEC-UHFFFAOYSA-N",
        "input": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
        "text": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
    },
    "output": {
        "outcome": [
            "NS(=O)(=O)Cc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
            "CCC(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
            "CCC(=O)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
            "O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1",
            "CCON=Cc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1", 
...

But help me understand: is this the flexible list type? The output is saved to the file as a JSON string in a single column, instead of as individual columns:

key,input,outcome
MLBNXJTXHVBPEC-UHFFFAOYSA-N,FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,"{""outcome"": [""CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1"", ""O=P(Nc1cccc(Cl)c1)(Oc1ccc(OC(F)F)cc1)Oc1ccc2ccsc2c1"", ""Oc1cc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)ccc1OC(F)F"", ""CCCOc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CONC(=O)Nc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CNS(=O)(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"",
...

GemmaTuron commented 1 year ago

Hi @HellenNamulinda, thanks for checking. The output should be a .csv file. Since the length of the output is limited to 100 columns, it should not be difficult to convert it. @ZakiaYahya, can you have a look?
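
A minimal Python sketch of such a conversion, assuming the input file has the three columns shown earlier in this thread (key, input, outcome) and that the outcome cell holds a JSON object with an "outcome" list. The function names, the outcome_NN column names, and the padding with empty cells are illustrative assumptions, not Ersilia's actual behavior.

```python
import csv
import io
import json

MAX_OUTPUTS = 100  # fixed output length mentioned in this thread


def expand_outcome(rows, max_outputs=MAX_OUTPUTS):
    """Yield flat rows: key, input, then one cell per generated SMILES."""
    for row in rows:
        # The outcome cell is a JSON string such as {"outcome": ["...", ...]}.
        smiles = json.loads(row["outcome"])["outcome"]
        smiles = [s for s in smiles if s][:max_outputs]  # drop null/empty entries
        # Pad short lists with empty cells so every row has the same width.
        padded = smiles + [""] * (max_outputs - len(smiles))
        yield [row["key"], row["input"]] + padded


def convert(src, dst, max_outputs=MAX_OUTPUTS):
    """Read the single-column JSON layout from src, write flat columns to dst."""
    reader = csv.DictReader(src)
    writer = csv.writer(dst)
    # Hypothetical column names; the real header may differ.
    header = ["key", "input"] + [f"outcome_{i:02d}" for i in range(max_outputs)]
    writer.writerow(header)
    writer.writerows(expand_outcome(reader, max_outputs))


# Small in-memory demonstration with made-up rows (not real model output).
sample = 'key,input,outcome\nK1,CCO,"{""outcome"": [""CCN"", ""CCC""]}"\n'
out = io.StringIO()
convert(io.StringIO(sample), out, max_outputs=3)
print(out.getvalue())
```

For a real file, the same convert function could be called with two file handles opened with newline="" instead of the in-memory buffers.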

ZakiaYahya commented 1 year ago

Hello @GemmaTuron, yes, the output is a .csv file and not a .json file. @HellenNamulinda, can you check it again? Thanks.

HellenNamulinda commented 1 year ago

@ZakiaYahya, they're CSV files, like the ones I attached. My concern was about the JSON string object stored in a single column:

"{""outcome"": [""CCc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""O=C(COc1ccc2ccsc2c1)NS(=O)(=O)c1ccc(OC(F)F)cc1"", ""O=P(Nc1cccc(Cl)c1)(Oc1ccc(OC(F)F)cc1)Oc1ccc2ccsc2c1"", ""Oc1cc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)ccc1OC(F)F"", ""CCCOc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CONC(=O)Nc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"", ""CNS(=O)(=O)c1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"",
...

It was confusing because I haven't worked on models with a flexible list as output (which is related to Ersilia, I guess). Thanks!

ZakiaYahya commented 1 year ago

Yes @HellenNamulinda, since the model outputs a flexible list, this is the format Ersilia returns when handling flexible lists. It has nothing to do with the model code itself. For your information, I'm attaching an output file that this model returns when run with run.sh; you can see that the model returns a single string in a single row/entry. But when the same model is run with --repo-path, the format of the output changes, just because Ersilia's output.py formats it like that. For your review, I'm attaching both outputs here: output_repo_path.csv output_run.sh.csv

If you want to look into this in more detail, see the discussion on this here. Thanks.

HellenNamulinda commented 1 year ago

Thank you so much @ZakiaYahya, :clap: This is well understood.

GemmaTuron commented 1 year ago

Hi @ZakiaYahya, thanks for the detailed explanation. I think that, now that the model is fixed to 100 outputs, we can simply change the output type in the metadata file to "List" and it will then work.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron, okay. Should I change it in the metadata and open the PR again?

GemmaTuron commented 1 year ago

I'll change it directly here, no worries. Let me discuss with @miquelduranfrigola whether we want to change all the flexible lists to lists.

pittmanriley commented 1 year ago

@GemmaTuron would you still like me to test this now? Or should I wait for a new PR?

GemmaTuron commented 1 year ago

@pittmanriley I've updated the metadata to a List output now, so you can check that it indeed returns the outcome as we want.

pittmanriley commented 1 year ago

Hi @GemmaTuron, I've tested this model on all three platforms and it seems to work well. It didn't find all 100 similar molecules for each of the inputs, but I think this is to be expected sometimes. Here are the outputs:

CLI: eos4q1a_cli.csv Docker: eos4q1a_docker.log Colab: eos4q1a_colab.csv
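
A quick way to see how many molecules were returned per input is to count the non-empty output cells in each row. This is only a sketch: it assumes the List-type CSV has two leading metadata columns (key, input) followed by one column per generated molecule, with empty cells where fewer than 100 were found. That layout and the count_outputs helper are assumptions, not something confirmed in this thread.

```python
import csv
import io


def count_outputs(src, n_meta_cols=2):
    """Map each key to the number of non-empty output cells in its row."""
    reader = csv.reader(src)
    next(reader)  # skip the header row
    counts = {}
    for row in reader:
        if not row:
            continue  # ignore blank lines
        counts[row[0]] = sum(1 for cell in row[n_meta_cols:] if cell.strip())
    return counts


# Made-up example rows, only to show the shape of the result.
demo = "key,input,out_0,out_1,out_2\nK1,CCO,CCN,CCC,\nK2,CCC,,,\n"
print(count_outputs(io.StringIO(demo)))
```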

GemmaTuron commented 1 year ago

yes, that is fine! thanks!