ersilia-os / eos1bba

GNU General Public License v3.0

Model ready for testing #4

Closed GemmaTuron closed 1 year ago

GemmaTuron commented 1 year ago

Please test that this model is working both on the CLI and on Google Colab:

Pass a list of at least 10 SMILES to it in CSV format and paste the output here (you can get them from the eml_canonical.csv file in /notebooks)

masroor07 commented 1 year ago

After serving the model, I tried to run predictions, starting with a single SMILES: ersilia api predict -i "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1". But every time I run the prediction, I get a "connection failed" error. I have tried a VPN and even switching networks; nothing seems to help.

An Update

{
    "input": {
        "key": "NQQBNZBOOHHVQP-UHFFFAOYSA-N",
        "input": "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]",
        "text": "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]"
    },
    "output": {
        "outcome": [
            null
        ]
    }
}
{
    "input": {
        "key": "HEFNNWSXXWATRW-UHFFFAOYSA-N",
        "input": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O",
        "text": "CC(C)CC1=CC=C(C=C1)C(C)C(=O)O"
    },
    "output": {
        "outcome": [
            null
        ]
    }
}

Why is the outcome null?

carcablop commented 1 year ago

Hello @masroor07. You can add the '-v' flag to see the detailed message flow when you run a command, but before making any predictions you must run: ersilia -v serve model_name. With this command you can see which API the model exposes (it can be run, calculate, or predict).

Have you served the model before? Is the API really 'predict' for this model? You can check it with the command above.

Then you run: ersilia -v api name_api -i "eml_canonical.csv" -o "output.csv", or ersilia -v api name_api -i "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1" -o out.csv

To save the log to a my.log file, run it like this: ersilia -v api name_api -i "eml_canonical.csv" -o "output.csv" > my.log 2>&1. You can then share that file.

There you can see in detail what may be happening with the model when it makes predictions. The example eml_canonical.csv is very large; you can create another file with 10 molecules from it and test with that file.

With this you can see a little more detail of what may be happening with the model. I hope it helps.
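Carving the first 10 molecules out of eml_canonical.csv can be scripted with the standard library alone. A minimal sketch (the helper name and output file name are just examples, not part of Ersilia):

```python
import csv

def head_csv(src_path, dst_path, n=10):
    """Copy the header plus the first n data rows of a CSV to a new file."""
    with open(src_path, newline="") as src, open(dst_path, "w", newline="") as dst:
        reader = csv.reader(src)
        writer = csv.writer(dst)
        writer.writerow(next(reader))  # keep the header row
        for i, row in enumerate(reader):
            if i >= n:
                break
            writer.writerow(row)
```

Usage would be something like head_csv("eml_canonical.csv", "10_smiles.csv"), and the resulting file can then be passed to the -i flag.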

masroor07 commented 1 year ago


I see, alright! The API for this model is 'run'. Thank you for pointing that out; I will get on with it straight away.

masroor07 commented 1 year ago


I was able to find the API available for this model, but the connection issue remains. I am trying to run a prediction on a single SMILES, but haven't been able to establish a stable network connection. Will keep you updated!

Thank you

ZakiaYahya commented 1 year ago

Hi @GemmaTuron, I have fetched and served model eos1bba successfully both on the CLI and on Colab. But executing the "run" API gives me an error on both the CLI and Colab when passing the whole list of SMILES. The error is "TypeError: 'NoneType' object is not subscriptable". The log file is attached here: eos1bba_log.log

This is the same error that I got when working with model eos526j; in that case the output was a complex hierarchy that needed JSON format instead of a CSV file to store it. But this output returns null instead of a complex hierarchy. However, when I pass the first single SMILES string, it gives me an output on both the CLI and Colab, i.e.:

{
    "input": {
        "key": "MCGSCOLBFJQGHM-SCZZXKLOSA-N",
        "input": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1",
        "text": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1"
    },
    "output": {
        "outcome": [
            null
        ]
    }
}

This is the same null output that @masroor07 gets. @masroor07, were you able to get correct output for the whole list of strings? @carcablop, any thoughts on this?
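As an aside, "TypeError: 'NoneType' object is not subscriptable" is exactly what Python raises when code indexes into a value that turned out to be None, which fits a model returning null outcomes. A tiny self-contained illustration with a defensive guard (the record shape mimics the outputs pasted in this thread; nothing here is Ersilia code):

```python
record = {"output": {"outcome": None}}  # what a failed prediction can look like

# Indexing the outcome directly reproduces the error seen in the log
try:
    value = record["output"]["outcome"][0]
except TypeError as exc:
    print(exc)  # 'NoneType' object is not subscriptable

# A guard keeps the row as a missing value instead of crashing
outcome = record["output"]["outcome"]
value = outcome[0] if outcome else None
```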

masroor07 commented 1 year ago

I was able to fetch the model using Google Colab and run predictions as well. I used 9 SMILES stored in a CSV and ran predictions on them. I used the following CSV file as the input for my model:

10_SMILES.csv

And I stored the output in another CSV file, but this output returns null rather than the right complex hierarchy output. And to store the data into a CSV, I first had to store the output in JSON.
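The JSON-then-CSV step can be scripted. A minimal sketch, assuming the API output is a JSON list of records shaped like the ones pasted in this thread (file paths and the helper name are illustrative):

```python
import csv
import json

def json_records_to_csv(json_path, csv_path):
    """Flatten {input, output} records into one CSV row per molecule."""
    with open(json_path) as f:
        records = json.load(f)
    with open(csv_path, "w", newline="") as f:
        writer = csv.writer(f)
        writer.writerow(["key", "smiles", "outcome"])
        for rec in records:
            outcome = rec["output"]["outcome"]
            first = outcome[0] if outcome else None  # guard against null outcomes
            writer.writerow([rec["input"]["key"], rec["input"]["input"], first])
```

A null outcome simply becomes an empty cell in the CSV rather than crashing the conversion.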

masroor07 commented 1 year ago


No, but I was able to get past the error "TypeError: 'NoneType' object is not subscriptable". I stored the data into JSON first and later converted the JSON into a CSV. But the output remains the same for every input:

{
    "input": {
        "key": "MCGSCOLBFJQGHM-SCZZXKLOSA-N",
        "input": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1",
        "text": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1"
    },
    "output": {
        "outcome": [
            null
        ]
    }
}

masroor07 commented 1 year ago

Hi @ZakiaYahya, I am still not able to get the right output for the predictions. Have you seen some progress?

ZakiaYahya commented 1 year ago


No, not yet. I've tried it many times but it's still giving null output. I think @karthikjetty is working on it; he said he is debugging it, so maybe that's why it's returning null output, because I have tried passing various SMILES strings and the output was the same for all of them. @karthikjetty, are you done debugging model eos1bba? It is still not giving proper output.

masroor07 commented 1 year ago


Alright! Thanks for the update. I tried fetching the model using the CLI from scratch again, but so far I haven't been able to fetch it.

ZakiaYahya commented 1 year ago


Oh right. In my case fetch and serve work perfectly; I just did it. But the 'run' API is still giving me null output.

masroor07 commented 1 year ago

Model Testing results:

Output:

{
    "input": {
        "key": "BSYNRYMUTXBXSQ-UHFFFAOYSA-N",
        "input": "CC(=O)Oc1ccccc1C(=O)O",
        "text": "CC(=O)Oc1ccccc1C(=O)O"
    },
    "output": {
        "outcome": [
            null
        ]
    }
}

Current Behavior: As can be seen above, the model returns a null value for a SMILES input.

Expected Behavior: This model uses regression to determine the outputs, so the output should be a decimal value.

GemmaTuron commented 1 year ago

Hi @karthikjetty, while the model seems to fetch correctly, the output of the prediction is not returned; it is probably due to a small conversion error from model output to final format. Please can you have a look and fix it? Thanks

ZakiaYahya commented 1 year ago

Hello @GemmaTuron and @karthikjetty, the model is giving numeric output now without any problem. Here's the output for a single SMILES:

{
    "input": {
        "key": "MCGSCOLBFJQGHM-SCZZXKLOSA-N",
        "input": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1",
        "text": "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1"
    },
    "output": {
        "outcome": [
            -0.168225
        ]
    }
}

GemmaTuron commented 1 year ago

Hi @ZakiaYahya !

Thanks for testing this! @karthikjetty, can you explain the outcome a bit more? I thought the model was giving several Tox21 results, but I only see one value?

masroor07 commented 1 year ago

Hi, @GemmaTuron @karthikjetty! I was finally able to fetch and serve the model correctly, and the model seems to be running fine this time around.

I tried the model on the CLI with a couple of input SMILES:

  1. Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1

    OUTPUT:
    {
        "input": {
            "key": "MCGSCOLBFJQGHM-SCZZXKLOSA-N",
            "input": "Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1",
            "text": "Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1"
        },
        "output": {
            "outcome": [
                0.615656
            ]
        }
    }

  2. CCCSc1ccc2nc(NC(=O)OC)[nH]c2c1

    OUTPUT:
    {
        "input": {
            "key": "HXHWSAZORRCQMX-UHFFFAOYSA-N",
            "input": "CCCSc1ccc2nc(NC(=O)OC)[nH]c2c1",
            "text": "CCCSc1ccc2nc(NC(=O)OC)[nH]c2c1"
        },
        "output": {
            "outcome": [
                -0.398067
            ]
        }
    }

GemmaTuron commented 1 year ago

Thanks @masroor07 !

We need to check the model because it is only outputting one result instead of a list of results (for the whole Tox21 array) - I'll work with @karthikjetty on this.

karthikjetty commented 1 year ago

Hi! Yes, the output needs to be a list of float values. Currently, in main.py, I think the problem has to do with the following code snippet:

    output = []
    for smiles in smiles_list:
        graph1, graph2 = collate_fn([transform_fn({'smiles': smiles})])
        preds = model(graph1.tensor(), graph2.tensor()).numpy()[0]
        for name, prob in zip(task_names, preds):
            output.append("%f" % (prob))
    return output

Essentially, for each SMILES there are 12 predictions the model can make. Right now the output of the my_model function (the main code described above) is a single flat list of values, with each SMILES contributing 12 rows of values (a single column).
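If that flat single-column list really is the 12 task scores concatenated per SMILES, it can at least be regrouped after the fact. A sketch assuming only that ordering (the function name is illustrative, not part of the model code):

```python
def group_by_tasks(flat_scores, n_tasks=12):
    """Split a flat list of scores into one row of n_tasks values per molecule."""
    if len(flat_scores) % n_tasks != 0:
        raise ValueError("score count is not a multiple of n_tasks")
    return [flat_scores[i:i + n_tasks] for i in range(0, len(flat_scores), n_tasks)]
```

For example, 24 scores from two molecules come back as two rows of 12, matching the list-of-lists shape described below.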

I have tried modifying the output to be a list of lists, where a single row in the output is a list holding all 12 values for a SMILES. The code for this is below.

    output = []
    for smiles in smiles_list:
        tasks_list = []
        graph1, graph2 = collate_fn([transform_fn({'smiles': smiles})])
        preds = model(graph1.tensor(), graph2.tensor()).numpy()[0]
        for name, prob in zip(task_names, preds):
            tasks_list.append("%f" % (prob))
        output.append(tasks_list)
    return output

Both of these still result in a single output value when running in Ersilia, which confuses me since the two functions write their output in different formats.

I'm not sure exactly what format I should leave the output in so that the result is a list of values. I have also tried modifying the output specified in the metadata.json file, but this doesn't seem to change anything.

I also noticed, when running eos1bba in verbose mode, the following line:

19:14:58 | DEBUG    | Info {'name': 'eos1bba', 'version': '20230315191451_50B8FB', 'created_at': '2023-03-15T19:14:52.067327Z', 'env': {'conda_env': 'name: bentoml-default-conda-env\nchannels:\n- defaults\ndependencies: []\n', 'python_version': '3.7.16', 'docker_base_image': 'bentoml/model-server:0.11.0-py37', 'pip_packages': ['bentoml==0.11.0']}, 'artifacts': [{'name': 'model', 'artifact_type': 'Artifact', 'metadata': {}}], 'apis': [{'name': 'run', 'input_type': 'JsonInput', 'docs': "BentoService inference API 'run', input: 'JsonInput', output: 'DefaultOutput'", 'output_config': {'cors': '*'}, 'output_type': 'DefaultOutput', 'mb_max_latency': 10000, 'mb_max_batch_size': 2000, 'batch': True}]}

Here it says the output and output_type are both "DefaultOutput", regardless of how I modify the metadata.json file. I'm not sure how to change this, but it could be related to the output problem.

masroor07 commented 1 year ago


Noted! Glad I could be of help. Thank you for the opportunity.