ersilia-os / eos2re5

A platform for systematic ADME evaluation of drug molecules
GNU General Public License v3.0

New model ready for testing! #4

Open github-actions[bot] opened 1 year ago

github-actions[bot] commented 1 year ago

This model is ready for testing. If you are assigned to this issue, please try it out using the CLI, Google Colab and DockerHub and let us know if it works!

HellenNamulinda commented 1 year ago

Hello @ZakiaYahya and @GemmaTuron, this model is failing on Colab with the following error:

CommandNotFoundError: Your shell has not been properly configured to use 'conda activate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell

See 'conda init --help' for more information and options.

IMPORTANT: You may need to close and restart your shell after running 'conda init'.

Traceback (most recent call last):
  File "/root/eos/repository/eos2re5/20230625163913_EAD174/eos2re5/artifacts/framework/code/main.py", line 14, in <module>
    from sklearn.externals import joblib
ModuleNotFoundError: No module named 'sklearn'

CommandNotFoundError: Your shell has not been properly configured to use 'conda deactivate'.
To initialize your shell, run

    $ conda init <SHELL_NAME>

Currently supported shells are:
  - bash
  - fish
  - tcsh
  - xonsh
  - zsh
  - powershell
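
The traceback above shows main.py failing to import sklearn, which fits the conda activate error: the script probably ran under an interpreter outside the model's environment. A minimal diagnostic sketch, to be run with the same interpreter that executes main.py. The helper name `module_available` is made up for illustration and is not part of ersilia.

```python
import importlib
import sys


def module_available(name):
    """Return True if `name` can be imported by the current interpreter."""
    try:
        importlib.import_module(name)
    except ImportError:
        return False
    return True


# Show which interpreter is running, then check the module main.py needs.
print("interpreter: {}".format(sys.executable))
print("sklearn available: {}".format(module_available("sklearn")))
```

If the interpreter path does not point inside the model's conda environment, the activation step failed and the missing module is a symptom rather than the cause.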

On the CLI, it asks for root access. This is because of the sudo commands in the Dockerfile:

RUN sudo apt update
RUN sudo apt install python2.7
RUN sudo apt install gfortran-7

And even after granting it, the fetch takes forever (eos2re5_cli_fetch.log):

(ersilia) hellenah@hellenah-elitebook:~$ ersilia -v fetch eos2re5 > eos2re5_cli_fetch.log 2>&1
[sudo] password for hellenah: 

@ZakiaYahya, did you test it locally after the changes, and did it work? Kindly advise. I would have tried the docker image, but it is not yet available.

ZakiaYahya commented 1 year ago

Hello @HellenNamulinda, thanks for testing. Yes, it's working on my system, both locally and inside ersilia with --repo-path. With sudo it does ask for access when I run it locally, but when I tested it within ersilia it didn't ask for access, even at fetch time. Let me check. Thanks.

GemmaTuron commented 1 year ago

Hi @ZakiaYahya and @miquelduranfrigola, are you working on this model? What is the status?

ZakiaYahya commented 1 year ago

Hello @GemmaTuron, yes, I have a detailed meeting with @miquelduranfrigola later today, so I'll update you once we have discussed it. Thanks.

GemmaTuron commented 1 year ago

This model is working, thanks for the excellent job @miquelduranfrigola, @ZakiaYahya and @DhanshreeA.

Just to be sure, @HellenNamulinda can you run one test?

HellenNamulinda commented 1 year ago

Hi @GemmaTuron and @ZakiaYahya, I tested the model using Colab and Docker. It fetches well. 👍 Model eos2re5 fetched successfully (eos2re5_cli_fetch.log)

However, the first value in the output (column name: smiles) is null, probably because it is a string while the output type is Float.

Test file: test.csv
Colab output: eos2re5_colab_output.csv
Docker output: 2re5_docker_output.csv

+ [ -z eos2re5 ]
+ ersilia serve -p 3000 eos2re5
🚀 Serving model eos2re5: admetlab

   URL: http://127.0.0.1:3000
   PID: 37
   SRV: conda

👉 To run model:
   - run

💁 Information:
   - info
Serving model eos2re5...
+ echo Serving model eos2re5...
root@4c9873bdf8ac:~# ersilia run -i "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
{
    "input": {
        "key": "MLBNXJTXHVBPEC-UHFFFAOYSA-N",
        "input": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
        "text": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
    },
    "output": {
        "outcome": [
            null,
            1.0,
            0.9527145777812469,
            1.0,
            0.7436851435428148,
            1.0,
            0.626,
            1.0,
            0.735543694320005,
            0.0,
            0.41,
            1.0,
            0.9161980942384855,
            1.0,
...
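
To pinpoint which entries come back as null in a result like the one above, a minimal sketch follows. The JSON is a truncated excerpt of that output (only the first four values are kept), and the function name `null_positions` is hypothetical.

```python
import json

# Truncated excerpt of the output above: only the first four values are kept.
result_text = """
{
    "output": {
        "outcome": [null, 1.0, 0.9527145777812469, 1.0]
    }
}
"""


def null_positions(result):
    """Return the indices of null (None) entries in result["output"]["outcome"]."""
    return [i for i, v in enumerate(result["output"]["outcome"]) if v is None]


result = json.loads(result_text)
print(null_positions(result))
```

Only index 0 is reported here, which matches the observation that the first value (the smiles column) is the one that is null.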

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @HellenNamulinda, following up on Hellen's comment: I checked by fetching the model through the CLI and DockerHub, and yes, the output shows an extra smiles column, which is empty. Going through main.py, I found that the smiles column is stored in the output dataframe at L90 of main.py:

outcome_df = pd.DataFrame({'smiles': smiles_list})

But the question is why the smiles are not stored and shown in that column. I'm working on it by cloning the repo again and testing it with run.sh. I'll try to tackle two issues here: (1) if the smiles are stored in the output dataframe in main.py, why do they show as null in the output file? (2) I'll try to discard that column, since the smiles already appear in the output file as the input, so I think there is no need for an extra column in the output.

Thanks.

HellenNamulinda commented 1 year ago

Hi @ZakiaYahya, this piece of code in service.py converts all values in each row to float:

for r in reader:
    R += [{"outcome": [Float(x) for x in r]}]

But since smiles are strings, the Float conversion won't work (no value is returned). The smiles column (probably the first column) can be excluded, as you said, because ersilia already has that column. Just include the values from the second column onwards:

for r in reader:
    R += [{"outcome": [Float(x) for x in r[1:]]}]
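
To illustrate the point above, here is a minimal sketch. It assumes `Float` is a helper that returns `None` when a value cannot be parsed as a number (the actual definition in service.py is not shown in this thread). The CSV content is made up for illustration, reusing the SMILES and the first two output values from the run above. The column names `value_1` and `value_2` and the function `read_outcomes` are hypothetical.

```python
import csv
import io


def Float(x):
    # Assumed behaviour: non-numeric strings become None (null in JSON).
    try:
        return float(x)
    except (TypeError, ValueError):
        return None


# Hypothetical output file with the SMILES string in the first column.
csv_text = (
    "smiles,value_1,value_2\n"
    "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1,1.0,0.9527145777812469\n"
)


def read_outcomes(text, skip_first_column):
    """Parse CSV rows into outcome lists, optionally dropping the first column."""
    reader = csv.reader(io.StringIO(text))
    next(reader)  # skip the header row
    start = 1 if skip_first_column else 0
    return [{"outcome": [Float(x) for x in r[start:]]} for r in reader]


print(read_outcomes(csv_text, skip_first_column=False))
print(read_outcomes(csv_text, skip_first_column=True))
```

With the first column kept, the SMILES string cannot be converted and shows up as a leading null. With `r[1:]` only the numeric values remain.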

ZakiaYahya commented 1 year ago

Hi @HellenNamulinda, thanks for pointing it out. But there is nothing wrong with service.py; it is main.py that stores the smiles in the dataframe as a separate column, which of course we don't need in Ersilia, since Ersilia automatically appends the input column (the smiles). So I'll just remove that from main.py and push the changes again. I'll let you know once it's done so you can test it again. Thanks.

ZakiaYahya commented 1 year ago

Hello @GemmaTuron @HellenNamulinda, I've made the following change to resolve the null first entry in the output: main.py no longer stores the smiles in the dataframe and simply declares an empty dataframe instead.

From: outcome_df = pd.DataFrame({'smiles': smiles_list})

To: outcome_df = pd.DataFrame()

I've pushed the changes and opened a PR. Once the py-3.11 GitHub runner issue is resolved and the changes are merged, @HellenNamulinda, can you kindly test it again and confirm the results?

Thanks.

GemmaTuron commented 1 year ago

Hi @HellenNamulinda and @pittmanriley

Please check that the latest changes are working, thanks!

pittmanriley commented 1 year ago

Hi @GemmaTuron @ZakiaYahya, I tested the model on the CLI, Colab, and Docker. It works on Colab and Docker, but I'm getting null outputs when I run it in the CLI. Also, it took too long to run 20 inputs on Colab, so I tried it on a single input instead and it worked. Here are the outputs:

CLI: eos2re5.csv
Colab: eos2re5_colab.csv
Docker: eos2re5_docker.log

GemmaTuron commented 1 year ago

Is the CLI in Codespaces, @pittmanriley?

pittmanriley commented 1 year ago

@GemmaTuron yes I just tried it on Codespaces and it seems to work, although one of the outputs is null while the others work. Here's the output: eos2re5.csv

GemmaTuron commented 1 year ago

I think the CLI issue is related to the platform, not the model itself. Does it happen in other models as well?

HellenNamulinda commented 1 year ago

Hi @GemmaTuron and @ZakiaYahya, the model works well in Colab:

eos2re5_colab_output (1).csv

For the CLI, it's extremely slow on my machine, but it works well when the commands are run on Colab:

!ersilia run -i "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
{
    "input": {
        "key": "MLBNXJTXHVBPEC-UHFFFAOYSA-N",
        "input": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
        "text": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
    },
    "output": {
        "outcome": [
            1.0,
            0.9527145777812469,
            1.0,
            0.7436851435428148,
            1.0,
            0.626,
            1.0,
            0.735543694320005,
            0.0,
            0.41,
            1.0,
            0.9161980942384855,
...

I pulled the docker image, and it still had a null output for the first value (eos2re5_docker_output.csv). But I saw that the docker image wasn't updated because the build failed.

root@c46091121ccd:~# ersilia -v run -i "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
{
    "input": {
        "key": "MLBNXJTXHVBPEC-UHFFFAOYSA-N",
        "input": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1",
        "text": "FC(F)Oc1ccc(-c2nnc3cncc(Oc4ccc5ccsc5c4)n23)cc1"
    },
    "output": {
        "outcome": [
            null,
            1.0,
            0.9527145777812469,
            1.0,
            0.7436851435428148,
...

ZakiaYahya commented 1 year ago

Yes @HellenNamulinda, DockerHub still has the previous image. The latest build fails at "Upload on DockerHub".

pittmanriley commented 1 year ago

@GemmaTuron I also had some issues fetching eos8451 in Codespaces, but other than that Codespaces hasn't given me any issues regarding null outputs.

GemmaTuron commented 1 year ago

The Codespaces nulls seem to be platform related, not model dependent. I am trying to see where the DockerHub build is failing, but I cannot load the error log. @ZakiaYahya, have you been able to look into it?

GemmaTuron commented 1 year ago

OK, the log is finally loading. This is what it shows at the bottom:

#6 4867.7 14:02:09 | ERROR    | ❗Could not download file https://ersilia-models.s3.eu-central-1.amazonaws.com/eos2re5/model/checkpoints/T/T_Model.pkl_739.npy from S3 bucket.
#6 4867.7  We will try Git LFS.
#6 4867.7 14:02:09 | ERROR    | ❌ Checksum discrepancy in file model/checkpoints/T/T_Model.pkl_739.npy: expected 88fe6e055898dad9654d02c9d5357c2b164051da2be5df825065a964c626900c, actual 21fefec75bf4c284399ab55a704e9398ad32747236e871823bb022b8848f7f19
#6 5338.8 14:10:00 | ERROR    | ❗Could not download file https://ersilia-models.s3.eu-central-1.amazonaws.com/eos2re5/model/checkpoints/VD/VD_Model.pkl_1271.npy from S3 bucket.
#6 5338.8  We will try Git LFS.
#6 5338.8 14:10:00 | ERROR    | ❌ Checksum discrepancy in file model/checkpoints/VD/VD_Model.pkl_1271.npy: expected f56819c9aa660eb423312ace24f65a8077e8786119d2b48f74f968b9f7cf215c, actual c6b00395c48369b197b1dde5c9dae137ffbcc8790c86b9f218c0fb8a8403d08a
Error: The operation was canceled.

I am unsure if this could be a size limit. Perhaps we should try building this model locally, @miquelduranfrigola?
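
The expected and actual values in the log are 64 hexadecimal characters, which matches the length of a SHA-256 digest. A minimal sketch for checking a downloaded checkpoint locally, assuming the checksum is SHA-256 over the raw file contents (the log does not say which algorithm is used). The function names are hypothetical.

```python
import hashlib


def sha256_of_file(path, chunk_size=1024 * 1024):
    """Return the hex SHA-256 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def checksum_matches(path, expected_hex):
    """Compare a file's digest with the expected value, ignoring case and whitespace."""
    return sha256_of_file(path) == expected_hex.strip().lower()
```

Running `checksum_matches` on a local copy of model/checkpoints/T/T_Model.pkl_739.npy with the expected value from the log (88fe6e055898dad9654d02c9d5357c2b164051da2be5df825065a964c626900c) would show whether the file in the repository is itself different from what the build expects, or whether only the download during the build is affected.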

GemmaTuron commented 1 year ago

Hi,

I am trying to finish off this model, but I am unable to run it. I've cloned the repo to my local Ubuntu machine and tried to fetch it with --repo_path, but I end up with the following issue:

Model API eos2re5:run did not produce an output
Traceback (most recent call last):
  File "/home/gturon/eos/repository/eos2re5/20230824130055_5661E0/eos2re5/artifacts/framework/code/main.py", line 20, in <module>
    from chemopy.src.pychem.pychem import PyChem2d
  File "/home/gturon/eos/repository/eos2re5/20230824130055_5661E0/eos2re5/artifacts/framework/chemopy/src/pychem/pychem.py", line 34, in <module>
    import cpsa
  File "/home/gturon/eos/repository/eos2re5/20230824130055_5661E0/eos2re5/artifacts/framework/chemopy/src/pychem/cpsa.py", line 17, in <module>
    from openbabel import pybel
ImportError: cannot import name pybel

If I open the conda env eos2re5-py27 and try:
  - import openbabel: works
  - from openbabel import pybel: does not work
  - import pybel: works

I've tried changing all the imports to import pybel, but then when I fetch from the repo path I get the opposite: from openbabel import pybel works in the env but import pybel does not.
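
Open Babel 2.x exposes the bindings as a top-level pybel module, while Open Babel 3.x moved them to openbabel.pybel, so the two behaviours described above would be consistent with different Open Babel versions ending up in the two environments. That is an assumption, since the installed versions are not stated here. A minimal sketch of an import that tolerates both layouts follows. The helper name `load_pybel` is hypothetical.

```python
import importlib


def load_pybel():
    """Return the pybel module from whichever layout is installed, or None."""
    for name in ("openbabel.pybel", "pybel"):
        try:
            return importlib.import_module(name)
        except ImportError:
            continue
    return None


pybel = load_pybel()
print("pybel loaded from: {}".format(pybel.__name__ if pybel else "not found"))
```

Using such a helper in cpsa.py and the other chemopy modules would avoid editing the imports back and forth, though it does not explain why the two environments differ.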

This model is extremely problematic and large, and uses a very outdated version of Python. I'd suggest removing it from the Hub to avoid spending more time on this - @miquelduranfrigola ?

Here is one of the logs from my attempts. Openbabel seems to install fine: eos2re5.log