ersilia-os / eos8h6g

Avalon fingerprints
MIT License

New model ready for testing! #1

Closed github-actions[bot] closed 1 year ago

github-actions[bot] commented 1 year ago

This model is ready for testing. If you are assigned to this issue, please try it out using the CLI, Google Colab and DockerHub and let us know if it works!

samuelmaina commented 1 year ago

Hi @GemmaTuron and @simrantan
The model fetches successfully, and the fetch test SMILES produce output values. However, the model seems to work with input files but fails to predict for a single SMILES. It does not even show the key and text values of the input for a single input. The single-input error is consistent across the CLI, Colab and Docker. Full log with null values: eos8h6g_1_smiles_pred.log

{
    "input": {
        "key": null,
        "input": null,
        "text": null
    },
    "output": {
        "fp": [
            null,
            null,

~samuelmayna : ~/ersilia_github_models/eos8h6g (main) $ docker run -v /home/samuelmayna:/data  ersiliaos/eos8h6g
+ [ -z eos8h6g ]
+ ersilia serve -p 3000 eos8h6g
🚀 Serving model eos8h6g: avalon

   URL: http://127.0.0.1:3000
   PID: 35
   SRV: conda

👉 To run model:
   - run

💁 Information:
   - info
Serving model eos8h6g...
+ echo Serving model eos8h6g...
+ nginx -g daemon off;
Error: No such option: -i
root@ad4b63270011:/data# ersilia run  -i eos8h6g/eml_canonical.csv -o eos8h6g_docker_emL_pred.csv
eos8h6g_docker_emL_pred.csv
root@ad4b63270011:/data#

emmakodes commented 1 year ago

Hello @simrantan @GemmaTuron, this model doesn't have a model folder. I guess that is intended, since it calculates the bit-vector representation of a molecule, right?

CLI

The model fetches and makes predictions successfully on the CLI: eos8h6g_cli_fetch_log.txt eos8h6g_output.csv

CLI, single SMILES prediction

I get an output for a single SMILES: eos8h6g_single_smile_pred.txt

Colab

It fetches and makes predictions successfully on Colab eos8h6g_output.csv

but I get the following warning when trying to make predictions on Colab. It might be because of how Pandas is used on the Colab notebook to save the output predictions.

/usr/local/lib/python3.7/site-packages/ersilia/core/model.py:242: PerformanceWarning: DataFrame is highly fragmented.  This is usually the result of calling `frame.insert` many times, which has poor performance.  Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  df[data.features] = data.values

colab notebook
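The warning above points at a column-by-column assignment (`df[data.features] = data.values`) and suggests `pd.concat(axis=1)` instead. Below is a minimal sketch of that pattern, not the actual ersilia code: the function name `attach_features` and the column names `key`, `input` and `fp_*` are hypothetical and only illustrate how many fingerprint columns could be joined in a single step.

```python
import numpy as np
import pandas as pd


def attach_features(df, feature_names, values):
    """Join all feature columns to df at once (hypothetical helper).

    Building one DataFrame for the features and concatenating it avoids
    inserting columns one by one, which is what triggers the
    'DataFrame is highly fragmented' PerformanceWarning.
    """
    features = pd.DataFrame(values, columns=feature_names, index=df.index)
    return pd.concat([df, features], axis=1)


# Illustrative data only: two molecules, three fingerprint bits.
base = pd.DataFrame({"key": ["a", "b"], "input": ["CCO", "CCN"]})
bits = np.array([[0, 1, 0], [1, 0, 1]])
result = attach_features(base, ["fp_0", "fp_1", "fp_2"], bits)
print(result.shape)
```

The warning only concerns performance, so the predictions written to the output file should not be affected by it.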

Docker

Not done yet

GemmaTuron commented 1 year ago

@emmakodes could you try the exact same molecule that @samuelmaina tried for single input?

Samuel, I see this: { "input": { "key": null, "input": null, "text": null },

So it appears it is not able to take the input? Is this the command you used? ersilia run -i eos8h6g/eml_canonical.csv -o eos8h6g_docker_emL_pred.csv Could it be that there are too many spaces between run and -i?

emmakodes commented 1 year ago

Sure @GemmaTuron. @samuelmaina, can you provide the molecule that gave you null so I can test the same one on my end?

samuelmaina commented 1 year ago

@GemmaTuron and @emmakodes, I selected a random SMILES since I saw that all the eml dataset SMILES had output values. I will try again and see if I get the same error.

samuelmaina commented 1 year ago

I updated ersilia yesterday, and when I came to test today I got this new error when testing with the Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1,Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1 SMILES. I investigated a little: the temp file that is generated has only one row, with values 0,, "", "", "", which results in the input being parsed as None. The input object that is passed from one function to another has its key and input values as null, but its text contains the input text. @miquelduranfrigola, I think a change is affecting the construction of the inp object for single SMILES.

(ersilia) samuelmayna@SAM:~/ersilia_github_models$ ersilia run -i "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1,Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1"
Traceback (most recent call last):
  File "/home/samuelmayna/miniconda3/envs/ersilia/bin/ersilia", line 33, in <module>
    sys.exit(load_entry_point('ersilia', 'console_scripts', 'ersilia')())
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/click/core.py", line 1130, in __call__
    return self.main(*args, **kwargs)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/click/core.py", line 1055, in main
    rv = self.invoke(ctx)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/click/core.py", line 1657, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/click/core.py", line 1404, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/click/core.py", line 760, in invoke
    return __callback(*args, **kwargs)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/bentoml/cli/click_utils.py", line 138, in wrapper
    return func(*args, **kwargs)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/bentoml/cli/click_utils.py", line 115, in wrapper
    return_value = func(*args, **kwargs)
  File "/home/samuelmayna/miniconda3/envs/ersilia/lib/python3.7/site-packages/bentoml/cli/click_utils.py", line 99, in wrapper
    return func(*args, **kwargs)
  File "/home/samuelmayna/ersilia/ersilia/cli/commands/run.py", line 36, in run
    for result in mdl.run(input=input, output=output, batch_size=batch_size):
  File "/home/samuelmayna/ersilia/ersilia/core/model.py", line 194, in _api_runner_iter
    for result in api.post(input=input, output=output, batch_size=batch_size):
  File "/home/samuelmayna/ersilia/ersilia/serve/api.py", line 321, in post
    input=unique_input, output=None, batch_size=batch_size
  File "/home/samuelmayna/ersilia/ersilia/serve/api.py", line 297, in post_unique_input
    for res in self.post_amenable_to_h5(input, output, batch_size):
  File "/home/samuelmayna/ersilia/ersilia/serve/api.py", line 250, in post_amenable_to_h5
    input=todo_input, output=todo_output, batch_size=batch_size
  File "/home/samuelmayna/ersilia/ersilia/serve/api.py", line 126, in post_only_calculations
    for input in self.input_adapter.adapt(input, batch_size=batch_size):
  File "/home/samuelmayna/ersilia/ersilia/io/input.py", line 167, in adapt
    data = self.adapter.adapt(inp)
  File "/home/samuelmayna/ersilia/ersilia/io/input.py", line 144, in adapt
    data = self._adapt(inp)
  File "/home/samuelmayna/ersilia/ersilia/io/input.py", line 138, in _adapt
    return self._file_reader(inp)
  File "/home/samuelmayna/ersilia/ersilia/io/input.py", line 130, in _file_reader
    reader = TabularFileReader(path=inp, IO=self.IO)
  File "/home/samuelmayna/ersilia/ersilia/io/readers/file.py", line 545, in __init__
    self._standardize()
  File "/home/samuelmayna/ersilia/ersilia/io/readers/file.py", line 554, in _standardize
    sniff_line_limit=self.sniff_line_limit,
  File "/home/samuelmayna/ersilia/ersilia/io/readers/file.py", line 385, in __init__
    self.read_input_columns()
  File "/home/samuelmayna/ersilia/ersilia/io/readers/file.py", line 296, in read_input_columns
    assert input is not None
AssertionError

emmakodes commented 1 year ago

Hello @samuelmaina try this: ersilia -v run -i "['Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1','Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1']"

samuelmaina commented 1 year ago

Thanks @emmakodes, this runs successfully and produces output values. I think this issue needs to be investigated further.

GemmaTuron commented 1 year ago

@samuelmaina

You are passing TWO SMILES as one: "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1,Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1"

So there is nothing to fix: when the input is passed as a LIST it works. You can test with a single SMILES: "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1" or a LIST "['Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1','Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1']"
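To make the difference between the two input forms concrete, here is a rough sketch of how a string passed to -i could be told apart as a single SMILES or a list. It is not ersilia's actual parser: `parse_smiles_input` is a hypothetical helper, and it assumes that a comma never appears inside a valid SMILES string.

```python
import ast


def parse_smiles_input(raw):
    """Return a list of SMILES from a CLI-style input string (hypothetical helper)."""
    text = raw.strip()
    if text.startswith("["):
        # Either a Python-style list literal or a SMILES that starts with a
        # bracket atom such as "[Na+].[Cl-]"; try the list interpretation first.
        try:
            parsed = ast.literal_eval(text)
        except (ValueError, SyntaxError):
            parsed = None
        if isinstance(parsed, list) and all(isinstance(s, str) for s in parsed):
            return [s.strip() for s in parsed]
    if "," in text:
        # Assumption: commas are not part of SMILES, so this is most likely
        # several molecules joined into one string.
        raise ValueError("Several SMILES in one string; pass them as a list instead")
    return [text]


single = "Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1"
as_list = "['Nc1nc(NC2CC2)c3ncn([C@@H]4C[C@H](CO)C=C4)c3n1','Nc1nc(NC2CC2)c2ncn([C@H]3C=C[C@@H](CO)C3)c2n1']"
print(len(parse_smiles_input(single)), len(parse_smiles_input(as_list)))
```

Under these assumptions, the comma-joined string from the failing command would be rejected with a clear message instead of reaching the file reader.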

samuelmaina commented 1 year ago

Really sorry about that. I will test with a single SMILES.

samuelmaina commented 1 year ago

@GemmaTuron The model is working. Sorry about that, I will be more careful next time. I ran it on Colab, which is why I have submitted a picture. (image)

emmakodes commented 1 year ago

@simrantan @GemmaTuron

DockerHub

I was able to pull the model image and make predictions via DockerHub: eos8h6g_docker_pred.csv