Closed jeremycheminf closed 1 month ago
Hi @jeremycheminf
Thanks for this report. Can you show an example of the .csv file you are getting from the example command? Also, is the path to the file passed to -i correct?
The file is like this
CC[C@H](C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)CNC(=O)CNC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H]1CCCN1C(=O)[C@H](N)Cc1ccccc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(C)C)C(O)=O
CC(C)c1cc(nc(N)n1)-c1ccc(F)c2ccccc12
C[C@H](CCC(O)=O)[C@H]1CC[C@H]2[C@H]3[C@H](CC(=O)[C@]12C)[C@@]1(C)CCC(=O)C[C@H]1CC3=O
[H][C@@]1(C[C@@](C)(OC)[C@@H](O)[C@H](C)O1)O[C@H]1[C@H](C)[C@@H](O[C@]2([H])O[C@H](C)C[C@@H]([C@H]2O)N(C)C)[C@](C)(O)C[C@@H](C)N(CCC)C[C@H](C)[C@@H](O)[C@](C)(O)[C@@H](CC)OC(=O)[C@@H]1C
Cc1cn([C@H]2C[C@H](F)[C@@H](CO)O2)c(=O)[nH]c1=O
and yes the file is in the path. If I try with
'ersilia api run -i "C1=C(SC(=N1)SC2=NN=C(S2)N)[N+](=O)[O-]"'
I also get the same error
Hi @jeremycheminf
This looks like the right file, and testing with a single molecule is always good practice, thanks. I just noticed you are not serving the model before trying to run predictions. That might be the cause of the issue.
Please make sure that, after fetching, you bring the model alive with ersilia serve <modelname>
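Other things to look at:
Add the --from_dockerhub flag to the fetch command, although fetching from S3 or GitHub should work fine.
Redirect the output of the command to a log file (> out.log 2>&1) and attach it here.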
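For anyone following along, here is a minimal sketch of the fetch, serve, run order described above, with the output captured in out.log. It only uses commands and flags that appear in this thread. The helper names build_commands and run_with_log are made up for illustration and are not part of Ersilia.

```python
import subprocess
from pathlib import Path


def build_commands(model, input_csv, output_csv, from_dockerhub=False):
    """Return the fetch, serve and run commands as argument lists, in order."""
    fetch = ["ersilia", "fetch", model]
    if from_dockerhub:
        fetch.append("--from_dockerhub")
    serve = ["ersilia", "serve", model]
    # -v turns on verbose logging, as used elsewhere in this thread
    run = ["ersilia", "-v", "run", "-i", input_csv, "-o", output_csv]
    return [fetch, serve, run]


def run_with_log(commands, log_path="out.log"):
    """Run each command in order, writing stdout and stderr to one log file."""
    with Path(log_path).open("w") as log:
        for cmd in commands:
            # check=True stops at the first failing step
            subprocess.run(cmd, stdout=log, stderr=subprocess.STDOUT, check=True)
```

Calling run_with_log(build_commands("retrosynthetic-accessibility", "my_molecules.csv", "my_predictions.csv")) would then produce a single out.log to attach to the issue, assuming ersilia is on the PATH of the active environment.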
Thank you. I added the serve line and re-fetched from Docker. I attached the log file out.log.
Hi @jeremycheminf
It seems something went amiss when you set up Ersilia locally. Can I refer you to a very similar issue solved in #820 that some new interns are working on? It might provide the answer!
Hi @jeremycheminf
Can you share the packages listed in the ersilia env with conda list? I want to see if any dependencies might be causing the clash. I do not have a WSL system to test on right now to help with debugging.
Thanks!
Hi, this is the list. I have yet to try https://github.com/ersilia-os/ersilia/issues/820, which has the same error and involves changing some of the code, so that should work when I try it.
# packages in environment at /home/jeremy/mambaforge/envs/ersilia:
#
# Name                      Version      Build               Channel
_libgcc_mutex               0.1          conda_forge         conda-forge
_openmp_mutex               4.5          2_gnu               conda-forge
attrs                       21.4.0       pypi_0              pypi
boto3                       1.28.52      pypi_0              pypi
botocore                    1.31.52      pypi_0              pypi
bzip2                       1.0.8        h7f98852_4          conda-forge
ca-certificates             2023.7.22    hbcca054_0          conda-forge
certifi                     2023.7.22    pypi_0              pypi
charset-normalizer          3.2.0        pypi_0              pypi
chembl-webresource-client   0.10.8       pypi_0              pypi
click                       8.1.7        pypi_0              pypi
docker                      6.1.3        pypi_0              pypi
dockerfile-parse            2.0.1        pypi_0              pypi
easydict                    1.10         pypi_0              pypi
emoji                       2.8.0        pypi_0              pypi
ersilia                     0.1.27       pypi_0              pypi
h5py                        3.7.0        pypi_0              pypi
idna                        3.4          pypi_0              pypi
inputimeout                 1.0.4        pypi_0              pypi
isaura                      0.1          pypi_0              pypi
itsdangerous                2.1.2        pypi_0              pypi
jmespath                    1.0.1        pypi_0              pypi
ld_impl_linux-64            2.40         h41732ed_0          conda-forge
libffi                      3.4.2        h7f98852_5          conda-forge
libgcc-ng                   13.2.0       h807b86a_2          conda-forge
libgomp                     13.2.0       h807b86a_2          conda-forge
libnsl                      2.0.0        h7f98852_0          conda-forge
libsqlite                   3.43.0       h2797004_0          conda-forge
libuuid                     2.38.1       h0b41bf4_0          conda-forge
libzlib                     1.2.13       hd590300_5          conda-forge
loguru                      0.6.0        pypi_0              pypi
ncurses                     6.4          hcb278e6_0          conda-forge
numpy                       1.26.0       pypi_0              pypi
openssl                     3.1.3        hd590300_0          conda-forge
packaging                   23.1         pypi_0              pypi
pillow                      10.0.1       pypi_0              pypi
pip                         23.2.1       pyhd8ed1ab_0        conda-forge
pyairtable                  1.5.0        pypi_0              pypi
python                      3.10.12      hd12c33a_0_cpython  conda-forge
python-dateutil             2.8.2        pypi_0              pypi
pyyaml                      6.0.1        pypi_0              pypi
rdkit-pypi                  2022.9.5     pypi_0              pypi
readline                    8.2          h8228510_1          conda-forge
requests                    2.31.0       pypi_0              pypi
requests-cache              0.7.5        pypi_0              pypi
s3transfer                  0.6.2        pypi_0              pypi
setuptools                  68.2.2       pyhd8ed1ab_0        conda-forge
six                         1.16.0       pypi_0              pypi
tk                          8.6.12       h27826a3_0          conda-forge
tqdm                        4.66.1       pypi_0              pypi
tzdata                      2023c        h71feb2d_0          conda-forge
url-normalize               1.4.3        pypi_0              pypi
urllib3                     1.26.16      pypi_0              pypi
validators                  0.21.2       pypi_0              pypi
websocket-client            1.6.3        pypi_0              pypi
wheel                       0.41.2       pyhd8ed1ab_0        conda-forge
xz                          5.2.6        h166bdaf_0          conda-forge
@jeremycheminf
Yes, related to that, changing the code will bypass the error but then the predictions will return null - so don't go that route. It is an issue with installation on WSL we have not been able to pinpoint but I'm working on this, will let you know!
Also, in case it is helpful: aside from Google Colab, you can use GitHub Codespaces as we discussed. Simply go to the right-hand side of the /ersilia repository page, click <> Code and select the Codespaces option. This will set up a Codespace where ersilia is installed (you can check with the command ersilia --help) and you can fetch, serve and run models.
Please note that Codespaces use the individual free tier of GitHub users (60h/month), so make sure to terminate the Codespace once you are done.
We haven't written extensive documentation for this yet, since we are still trying out its functionality.
Hi @jeremycheminf
I think we have identified the source of the error - it is due to compatibility with Isaura, which is our backend for caching predictions (mostly only needed if you use the models intensively) - can you delete the conda environment and try the installation again without installing Isaura? We should remove that from the docs as it is not a requirement, only a nice-to-have. Kudos to @carcablop for identifying the source of error - we are now working on fixing it!
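If it helps before recreating the environment, here is a small sketch to check whether Isaura is present in the active env. It assumes the import name of the package is isaura, matching the pip package name shown in the conda list output; the helper name is_installed is just for illustration.

```python
import importlib.util


def is_installed(module_name):
    """Return True if the module can be found in the current environment."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "isaura" is assumed to be the import name of the Isaura package
    print("isaura installed:", is_installed("isaura"))
```

Run it with the ersilia environment activated. After a fresh installation without Isaura it should report False.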
Hi
This worked, also when using
ersilia -v run -i "CCCC"
However, the command ersilia -v run -i my_molecules.csv -o my_predictions.csv gave this error:
19:41:18 | DEBUG | Getting session from /home/jeremy/eos/session.json
19:41:18 | DEBUG | Getting session from /home/jeremy/eos/session.json
19:41:18 | WARNING | Lake manager 'isaura' is not installed! We strongly recommend installing it to store calculations persistently
19:41:18 | ERROR | Isaura is not installed! Calculations will be done without storing and reading from the lake, unfortunately.
19:41:19 | DEBUG | Is fetched: True
19:41:19 | DEBUG | Schema available in /home/jeremy/eos/dest/eos2r5a/api_schema.json
19:41:19 | DEBUG | Setting AutoService for eos2r5a
19:41:19 | INFO | Service class provided
19:41:19 | DEBUG | Using port 49345
19:41:19 | DEBUG | Starting Docker Daemon service
19:41:19 | DEBUG | Creating temporary folder /tmp/ersilia-hgf1up2i and mounting as volume in container
19:41:19 | DEBUG | Image ersiliaos/eos2r5a:latest is available locally
19:41:19 | DEBUG | Using port 37459
19:41:19 | DEBUG | Starting Docker Daemon service
19:41:19 | DEBUG | Creating temporary folder /tmp/ersilia-enu0uxmv and mounting as volume in container
19:41:19 | DEBUG | Reading card from eos2r5a
19:41:19 | DEBUG | Trying to get metadata from: /home/jeremy/eos/dest/eos2r5a
19:41:20 | DEBUG | Reading shape from eos2r5a
19:41:20 | DEBUG | Trying to get metadata from: /home/jeremy/eos/dest/eos2r5a
19:41:21 | DEBUG | Input Shape: Single
19:41:21 | DEBUG | Input type is: compound
19:41:21 | DEBUG | Input shape is: Single
19:41:21 | DEBUG | Importing module: .types.compound
19:41:21 | DEBUG | Checking RDKIT and other requirements necessary for compound inputs
19:41:21 | DEBUG | InputShapeSingle shape: Single
19:41:21 | DEBUG | Expected number: 1
19:41:21 | DEBUG | Entity is list: False
19:41:21 | DEBUG | Resolving columns
19:41:21 | DEBUG | Number of columns seems to be 1: assuming input is the only column: {'input': [0], 'key': None}
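19:41:21 | DEBUG | Candidate header is ['CC[C@H](C)[C@H](NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](CCC(O)=O)NC(=O)[C@H](Cc1ccccc1)NC(=O)[C@H](CC(O)=O)NC(=O)CNC(=O)[C@H](CC(N)=O)NC(=O)CNC(=O)CNC(=O)CNC(=O)CNC(=O)[C@@H]1CCCN1C(=O)[C@H](CCCNC(N)=N)NC(=O)[C@@H]1CCCN1C(=O)[C@H](N)Cc1ccccc1)C(=O)N1CCC[C@H]1C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](CCC(O)=O)C(=O)N[C@@H](Cc1ccc(O)cc1)C(=O)N[C@@H](CC(C)C)C(O)=O']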
19:41:21 | DEBUG | Matching for input is [0]
19:41:21 | DEBUG | Has header False
19:41:21 | DEBUG | Schema {'input': [0], 'key': None}
Traceback (most recent call last):
  File "/home/jeremy/mambaforge/envs/ersilia/bin/ersilia", line 8, in <module>
So the header problem comes back again, but I'll try to fix it by putting an actual header in the file.
Adding SMILES at the top of the csv file worked, and I got all the predictions. So it looks like on WSL the csv must have a header.
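In case it is useful to others hitting the same error, here is a minimal sketch of the workaround: it prepends a SMILES header line to a single-column file if the first line is not already that header. The function name ensure_header is hypothetical, and the sketch assumes a plain text file with one molecule per line.

```python
from pathlib import Path


def ensure_header(csv_path, header="SMILES"):
    """Prepend a header line to a single-column file if it is not there yet.

    Returns True if the file was modified, False if the header was present.
    """
    path = Path(csv_path)
    lines = path.read_text().splitlines()
    if lines and lines[0].strip().lower() == header.lower():
        return False  # header already present, leave the file untouched
    path.write_text("\n".join([header] + lines) + "\n")
    return True
```

For example, ensure_header("my_molecules.csv") can be run once before ersilia -v run -i my_molecules.csv -o my_predictions.csv.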
Hi @GemmaTuron - is the issue resolved?
It is if you add a header to the .csv file on WSL. We can close this issue, but we should make sure this is specified in the documentation if it is not already.
@GemmaTuron maybe I need to check this, but shouldn't every csv (whether on WSL, Linux, or Mac) have a header? I'm not sure if it's a WSL-specific issue, or even Ersilia-specific. I'll check this and get back. Whichever is the case, we should update the documentation and close this issue.
Hi @DhanshreeA
They should all have a header, but in case they don't, shouldn't Ersilia still be able to process them? I don't know, I'd go for making all .csv files have a header.
Hi @DhanshreeA and @miquelduranfrigola
Can we clarify whether Ersilia requires a .csv with a header or not?
In principle, Ersilia does not require a header. It automatically inspects the input file. Please let me know if we've lost this functionality for some reason.
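To illustrate what inspecting the input file can mean for a single-column file, here is a rough sketch. It is not Ersilia's actual implementation: the KNOWN_HEADERS set and the inspect_input function are assumptions made up for this example, and a real check would probably also test whether the first cell parses as a molecule.

```python
import csv
import io

# Hypothetical list of names treated as an input-column header
KNOWN_HEADERS = {"smiles", "input", "smi", "molecule"}


def inspect_input(text):
    """Guess whether single-column input text starts with a header row.

    Returns (has_header, inputs), where inputs excludes the header row.
    """
    rows = [row for row in csv.reader(io.StringIO(text)) if row]
    if not rows:
        return False, []
    first_cell = rows[0][0].strip().lower()
    has_header = first_cell in KNOWN_HEADERS
    data = rows[1:] if has_header else rows
    return has_header, [row[0] for row in data]
```

With this kind of check, a file with or without a header yields the same list of inputs, which is the behaviour described above.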
I'll check this and report.
The models work both with and without a header, and they produce the right output on my Ubuntu system.
I'll close this issue, and if it arises again for any user we will revisit it.
Describe the bug.
When following the documentation and installing into a fresh env, the prediction fails with TypeError: object of type 'NoneType' has no len(). Fetching the model itself works.
Describe the steps to reproduce the behavior
conda create -n ersilia python=3.10
conda activate ersilia
python -m pip install isaura==0.1
git clone https://github.com/ersilia-os/ersilia.git
cd ersilia
pip install -e .
ersilia fetch retrosynthetic-accessibility
ersilia example retrosynthetic-accessibility -n 5 -f my_molecules.csv
ersilia run -i my_molecules.csv -o my_predictions.csv
Outcome:
Traceback (most recent call last):
  File "/home/jeremy/mambaforge/envs/ersilia/bin/ersilia", line 8, in <module>
    sys.exit(cli())
  File "/home/jeremy/mambaforge/envs/ersilia/lib/python3.10/site-packages/click/core.py", line 1157, in __call__
    return self.main(*args, **kwargs)
  File "/home/jeremy/mambaforge/envs/ersilia/lib/python3.10/site-packages/click/core.py", line 1078, in main
    rv = self.invoke(ctx)
  File "/home/jeremy/mambaforge/envs/ersilia/lib/python3.10/site-packages/click/core.py", line 1688, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/home/jeremy/mambaforge/envs/ersilia/lib/python3.10/site-packages/click/core.py", line 1434, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "/home/jeremy/mambaforge/envs/ersilia/lib/python3.10/site-packages/click/core.py", line 783, in invoke
    return __callback(*args, **kwargs)
  File "/home/jeremy/ersilia/ersilia/cli/commands/__init__.py", line 22, in wrapper
    return func(*args, **kwargs)
  File "/home/jeremy/ersilia/ersilia/cli/commands/run.py", line 34, in run
    result = mdl.run(input=input, output=output, batch_size=batch_size)
  File "/home/jeremy/ersilia/ersilia/core/model.py", line 144, in _method
    return self.api(api_name, input, output, batch_size)
  File "/home/jeremy/ersilia/ersilia/core/model.py", line 335, in api
    if self._do_cache_splits(input=input, output=output):
  File "/home/jeremy/ersilia/ersilia/core/model.py", line 320, in _do_cache_splits
    self.tfr = TabularFileReader(
  File "/home/jeremy/ersilia/ersilia/io/readers/file.py", line 570, in __init__
    self._standardize()
  File "/home/jeremy/ersilia/ersilia/io/readers/file.py", line 574, in _standardize
    tfss = TabularFileShapeStandardizer(
  File "/home/jeremy/ersilia/ersilia/io/readers/file.py", line 409, in __init__
    self.read_input_columns()
  File "/home/jeremy/ersilia/ersilia/io/readers/file.py", line 321, in read_input_columns
    if len(h) == 1:
TypeError: object of type 'NoneType' has no len()
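The last frame fails because len(h) is called while h is None. A minimal sketch of that failure mode and of a defensive check is below. Both function names are hypothetical and this is not the code in file.py; it only shows why the TypeError is raised and how a None value could be handled explicitly.

```python
def unsafe_is_single_column(h):
    # Mirrors the failing check from the traceback: raises TypeError when h is None
    return len(h) == 1


def safe_is_single_column(h):
    # Treats a missing header (None) as "not a single named column"
    return h is not None and len(h) == 1
```

Note that a guard like this only avoids the crash. It does not explain why no header was resolved in the first place.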
Expected behavior.
A prediction with a score for each molecule. I managed to run the code on Google Colab using the notebook provided in the documentation, but locally I cannot get the tool to work.
Screenshots.
No response
Operating environment
WSL2 - Ubuntu 22.04.2 LTS
Additional context
No response