ersilia-os / eos74bo

GNU General Public License v3.0

List of Dependencies #13

Closed paulinebanye closed 1 year ago

paulinebanye commented 1 year ago

Good morning @GemmaTuron. The issue with the checks still persists. I have outlined below the dependencies in the environment.yml file, along with the dependencies and versions in the activated environment.

name: eos74bo
channels:
  - conda-forge
  - defaults
dependencies:
  - python=3.8
  - pip=20.0
  - pip:
    - numpy
    - pandas
    - rdkit
    - torch
    - FPSim2
    - tqdm
    - typing-extensions
    - typed-argument-parser
    - tensorboardX
    - scikit-learn
    - hyperopt
    - requests

During our catch-up meeting yesterday, @miquelduranfrigola mentioned that this error is due to the Docker image not having those dependencies, and he asked that I switch the Python version to 3.8 and include the dependencies. I updated the Dockerfile, but the checks still fail.

RUN apt-get update && \
    apt-get install -y software-properties-common && \
    add-apt-repository -y ppa:deadsnakes/ppa && \
    apt-get update && \
    apt install -y python3.8

RUN pip install rdkit
RUN pip install pandas
RUN pip install numpy
RUN pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

RUN pip install torch

RUN pip install FPSim2
RUN pip install tqdm
RUN pip install typing-extensions
RUN pip install typed-argument-parser
RUN pip install tensorboardX
RUN pip install scikit-learn
RUN pip install hyperopt
RUN pip install requests

WORKDIR /repo
COPY . /repo
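
One way to double-check that the Dockerfile covers every pip dependency listed in environment.yml is a small comparison script. The following is only a rough sketch: the helper names are made up for illustration, the inputs are a shortened excerpt rather than the full files, and it does not handle pip's name normalization (for example hyphens versus underscores).

```python
import re


def pip_packages_in_dockerfile(dockerfile_text):
    """Collect lowercase package names from 'pip install <name>' occurrences."""
    # The name stops at the first character that cannot be part of a package
    # name, so 'torch==1.6.0+cpu' is recorded as 'torch'.
    names = re.findall(r"pip install\s+([A-Za-z0-9][A-Za-z0-9_.\-]*)", dockerfile_text)
    return {name.lower() for name in names}


def missing_from_dockerfile(env_packages, dockerfile_text):
    """Return the environment packages that the Dockerfile never installs."""
    installed = pip_packages_in_dockerfile(dockerfile_text)
    return sorted(name for name in env_packages if name.lower() not in installed)


# Shortened, illustrative inputs (not the complete files from this issue).
env_packages = ["rdkit", "pandas", "torch", "FPSim2"]
dockerfile_text = """
RUN pip install rdkit
RUN pip install pandas
RUN pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html
"""

print(missing_from_dockerfile(env_packages, dockerfile_text))
```

With this excerpt the script reports FPSim2 as missing, because the sample Dockerfile text has no install line for it.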

GemmaTuron commented 1 year ago

Hi @pauline-banye

Thanks, this is a very detailed answer and super helpful. Just to be clear: the check fails in GitHub Actions, but if you run it on your computer with the --repo-path flag, does it work?

paulinebanye commented 1 year ago

Hi @GemmaTuron, no, it does not. I get errors when I run it with the repo path flag: `ersilia -v fetch eos74bo -r /mnt/c/Users/DELL-PC/Desktop/eos74bo/ > eos74bo.log 2>&1`.

Steps to recreate the error

paulinebanye commented 1 year ago

Update

@GemmaTuron @DhanshreeA The build progressed much further with the Docker dependencies. The error currently returned is related to the relative import paths within the repo. This issue also occurred when I was first working with the RLM model, and it was resolved by including sys.path.append or sys.path.insert within the codebase. I added those as soon as I began working with the solubility model.
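
For anyone reading later, the sys.path approach looks roughly like the sketch below. This is a minimal illustration, not the actual eos74bo code: `add_to_sys_path` and the module name `demo_predictor_eos` are hypothetical, and a temporary folder stands in for the model code directory.

```python
import importlib
import sys
import tempfile
from pathlib import Path


def add_to_sys_path(directory):
    """Put `directory` at the front of sys.path (once) so its modules can be imported."""
    resolved = str(Path(directory).resolve())
    if resolved not in sys.path:
        sys.path.insert(0, resolved)
    return resolved


# Throwaway folder standing in for the model code directory.
# `demo_predictor_eos` is a hypothetical module name, not a file from eos74bo.
demo_dir = tempfile.mkdtemp()
Path(demo_dir, "demo_predictor_eos.py").write_text("ANSWER = 42\n")

add_to_sys_path(demo_dir)
importlib.invalidate_caches()  # make sure the freshly written file is picked up
demo_module = importlib.import_module("demo_predictor_eos")
print(demo_module.ANSWER)
```

In a real script the directory is usually derived from the file itself, for example `os.path.dirname(os.path.abspath(__file__))`, so that the import works regardless of the current working directory.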

RUN pip install rdkit
RUN pip install pandas
RUN pip install numpy
RUN pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html

RUN pip install torch

RUN pip install tqdm
RUN pip install typing-extensions
RUN pip install typed-argument-parser
RUN pip install tensorboardX
RUN pip install scikit-learn
RUN pip install hyperopt
RUN pip install requests

WORKDIR /repo
COPY . /repo



I am at a loss to understand why it is not working when I test the model within the Ersilia CLI. I am still attempting to debug and figure out what is causing the errors.
[eos74bo_6.log](https://github.com/ersilia-os/eos74bo/files/10712735/eos74bo_6.log)
[eos74bo_8.log](https://github.com/ersilia-os/eos74bo/files/10712736/eos74bo_8.log)
carcablop commented 1 year ago

Hello @pauline-banye. I see in the logs that some modules are not being imported. From your development environment, make sure the imports in the code are correct: check that each path is right and that the .py files really are in the folder they are imported from. For example, the code seems to be trying to access the "predictors" folder to import "Solubility predictor", but it cannot find it. Check whether you have an __init__.py file and whether those imports are configured there.

Also try running the main.py file directly: if you already have an environment configured with the model dependencies installed, activate it, go to the folder containing main.py, and run it. Once that works, you can run it with ersilia and repo_path: activate the ersilia environment and run: ersilia -v fetch "model_id" --repo_path /home/../model_id
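
A quick way to spot folders that may be missing an __init__.py is sketched below. It is only an illustration: `find_dirs_missing_init` is a made-up helper and the folder layout is a temporary stand-in, not the real repository. Note that Python 3.3+ can import from folders without __init__.py (namespace packages), so a reported folder is a hint to check, not necessarily the cause of the error.

```python
import tempfile
from pathlib import Path


def find_dirs_missing_init(root):
    """Return subdirectories of `root` that hold .py files but no __init__.py."""
    root = Path(root)
    missing = []
    for directory in sorted(p for p in root.rglob("*") if p.is_dir()):
        has_python_files = any(directory.glob("*.py"))
        has_init = (directory / "__init__.py").exists()
        if has_python_files and not has_init:
            missing.append(directory.relative_to(root).as_posix())
    return missing


# Temporary layout for demonstration only (hypothetical file names).
demo_root = Path(tempfile.mkdtemp())
(demo_root / "predictors").mkdir()
(demo_root / "predictors" / "solubility.py").write_text("# placeholder\n")
(demo_root / "utils").mkdir()
(demo_root / "utils" / "__init__.py").write_text("")
(demo_root / "utils" / "helpers.py").write_text("# placeholder\n")

print(find_dirs_missing_init(demo_root))
```

Here only the demo "predictors" folder is reported, since the demo "utils" folder already has an __init__.py.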

As a good practice, I recommend pinning the version of each dependency in the Dockerfile.
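
If it helps, the pinned lines can be generated from the environment where the model already works. This is a rough sketch assuming Python 3.8 or newer (for `importlib.metadata`); the helper names are made up, and the package list is just an example subset.

```python
from importlib import metadata  # standard library from Python 3.8


def pinned_requirement(package_name):
    """Return 'name==version' for an installed package, or None if it is not installed."""
    try:
        return f"{package_name}=={metadata.version(package_name)}"
    except metadata.PackageNotFoundError:
        return None


def dockerfile_pip_lines(package_names):
    """Build one 'RUN pip install name==version' line per installed package."""
    lines = []
    for name in package_names:
        requirement = pinned_requirement(name)
        if requirement is None:
            # Leave a comment instead of guessing a version.
            lines.append(f"# {name} is not installed in this environment")
        else:
            lines.append(f"RUN pip install {requirement}")
    return lines


# Example subset of the dependencies discussed in this issue.
for line in dockerfile_pip_lines(["rdkit", "pandas", "numpy", "tqdm"]):
    print(line)
```

Run it inside the activated model environment so the versions match what was actually tested.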

paulinebanye commented 1 year ago

Thank you so much @carcablop. I really appreciate your help checking the model out.

paulinebanye commented 1 year ago

Update

Hi @GemmaTuron, the errors I kept receiving were due to the relative paths. The majority of them were within the chemprop submodule, and I have been able to resolve them. I tested the repo and the model was fetched successfully.

ersilia -v fetch eos74bo -r /mnt/c/Users/DELL-PC/Desktop/nu_eos74bo/eos74bo > eos74bo.log 2>&1

eos74bo.log

DhanshreeA commented 1 year ago

That sounds great @pauline-banye! Could you also describe the issues you faced with chemprop? We can all learn from it, and it can be useful information about model incorporation for future contributors.

GemmaTuron commented 1 year ago

Hi @pauline-banye! Great progress, thanks for the detailed feedback. Could you:

paulinebanye commented 1 year ago

Sure @GemmaTuron @DhanshreeA, @carcablop !

Resolving chemprop issues

The issues were mainly related to the way the relative imports in the chemprop submodule were specified in the repository. I kept getting different errors, which prompted me to start debugging each file.

paulinebanye commented 1 year ago

Function which required FPSim2

paulinebanye commented 1 year ago

Update

Hi @GemmaTuron

I tested the model via the repo_path method.

Issue

Although the model functioned as expected, I am running into an issue with the repo metadata. The check returns a "Wrong Ersilia model tag" error, even though I did not edit any of the items in the tag variable. I have been reviewing the metadata but haven't been able to figure out what is causing the error.

@DhanshreeA @carcablop would you mind taking a look at it?

--2023-02-12 18:10:45--  https://raw.githubusercontent.com/ersilia-os/ersilia/master/.github/scripts/update_metadata_to_airtable.py
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.111.133, 185.199.108.133, 185.199.109.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.111.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 536 [text/plain]
Saving to: ‘update_metadata_to_airtable.py’

     0K                                                       100% 40.6M=0s

2023-02-12 18:10:46 (40.6 MB/s) - ‘update_metadata_to_airtable.py’ saved [536/536]

18:10:47 | DEBUG    | Reading from https://raw.githubusercontent.com/pauline-banye/eos74bo/main/metadata.json
18:10:47 | ERROR    | Ersilia exception class:
TagBaseInformationError

Detailed error:
Wrong Ersilia model tag

Hints:
Tags must be in list format and they must be accepted our team. This means that only tags that are already available in Ersilia are allowed. If you want to include a new tag, please open a pull request (PR) on the 'tag.txt' file from the Ersilia repository.

Traceback (most recent call last):
  File "/home/runner/work/eos74bo/eos74bo/update_metadata_to_airtable.py", line 14, in <module>
    data = rm.read_information(org=user_name, branch=branch)
  File "/usr/share/miniconda/lib/python3.10/site-packages/ersilia/hub/content/card.py", line 388, in read_information
    bi.from_dict(data)
  File "/usr/share/miniconda/lib/python3.10/site-packages/ersilia/hub/content/card.py", line 342, in from_dict
    self.tag = data["Tag"]
  File "/usr/share/miniconda/lib/python3.10/site-packages/ersilia/hub/content/card.py", line 238, in tag
    raise TagBaseInformationError
ersilia.utils.exceptions_utils.card_exceptions.TagBaseInformationError: Ersilia exception class:
TagBaseInformationError

Detailed error:
Wrong Ersilia model tag

Hints:
Tags must be in list format and they must be accepted our team. This means that only tags that are already available in Ersilia are allowed. If you want to include a new tag, please open a pull request (PR) on the 'tag.txt' file from the Ersilia repository.

Error: Process completed with exit code 1.
GemmaTuron commented 1 year ago

Hi @pauline-banye

Please check what you have in the tags against the documentation in GitBook or the metadata files in the Ersilia Hub. As you know, Python strings must be LITERALLY the same, including CAPS.
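
To illustrate the exact-match point, here is a minimal sketch of a case-sensitive tag check. It is not Ersilia's actual validation code: `invalid_tags` is a made-up helper, and `ALLOWED_TAGS` is a placeholder set for illustration only (the accepted tags are the ones listed in the tag file in the Ersilia repository).

```python
import json

# Placeholder set for illustration only; NOT the real list of accepted tags.
ALLOWED_TAGS = {"ADME", "Solubility"}


def invalid_tags(metadata, allowed_tags):
    """Return the tags that do not match an allowed tag exactly (case-sensitive)."""
    tags = metadata.get("Tag")
    if not isinstance(tags, list):
        raise TypeError("'Tag' must be a list of strings")
    return [tag for tag in tags if tag not in allowed_tags]


# Example metadata with one tag that differs only in capitalization.
metadata = json.loads('{"Identifier": "eos74bo", "Tag": ["ADME", "solubility"]}')
print(invalid_tags(metadata, ALLOWED_TAGS))
```

With the placeholder set above, "solubility" is flagged because it differs from "Solubility" in its first letter.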

paulinebanye commented 1 year ago

I updated the metadata.json. Unfortunately the checks are still failing.

{    
    "Identifier": "eos74bo",
    "Slug": "aqueous-kinetic-solubility",
    "Status": "In progress",
    "Title": "Aqueous Kinetic Solubility",
    "Description": "Prediction of Aqueous solubility is one of the most important properties in drug discovery, as it has profound impact on various drug properties, including biological activity, pharmacokinetics (PK), toxicity, and in vivo efficacy.",
    "Mode": "Pretrained",
    "Task": ["Classification"],
    "Input": ["Compound"],
    "Input Shape": "Single",
    "Output": ["Probability"],
    "Output Type": ["Float"],
    "Output Shape": "Single",
    "Interpretation": "Probability of a compound being soluble at 10 μg/mL. (>0.5: Soluble), and probability of a compound being highly soluble (>52 μg/mL; >0.5: Soluble)",
    "Tag": [
        "ADME",
        "Solubility"
    ],
    "Publication": "https://slas-discovery.org/article/S2472-5552(22)06765-X/fulltext",
    "Source Code": "https://github.com/ncats/ncats-adme",
    "License": "None"
}
GemmaTuron commented 1 year ago

Hello @pauline-banye !

Where are you updating the metadata file? GitHub Actions checks the files in your fork of the repository; you can see the link in the Action log: 18:10:47 | DEBUG | Reading from https://raw.githubusercontent.com/pauline-banye/eos74bo/main/metadata.json

Make sure you update that specific metadata.json, since it is still showing the "solubility" and "ADME" tags - this is the last action that was run: https://github.com/ersilia-os/eos74bo/actions/runs/4157600807/jobs/7192215071

GemmaTuron commented 1 year ago

Hi @Femme-js

The metadata.json is still incomplete; please fill in the interpretation and check the allowed license formats. I've updated the test_model_pr workflow to the latest version so it will be triggered.

paulinebanye commented 1 year ago

Hi @GemmaTuron, thank you so much for your help. I eventually resolved it by making the PR directly from the main branch.

I was initially tracking the actions on my forked repository using the dev and main branches, until I remembered you had mentioned that it works only on the main branch. So I merged the updated code into my main branch, which triggered the actions on the Ersilia repo.