ersilia-os / ersilia

The Ersilia Model Hub, a repository of AI/ML models for infectious and neglected disease research.
https://ersilia.io
GNU General Public License v3.0

🐛 Bug: Model `eos2ta5` ; Colab - Numpy module not found #349

Closed Zainab-ik closed 1 year ago

Zainab-ik commented 2 years ago

Describe the bug.

The model eos2ta5 repeatedly fails to fetch when running `ersilia -v fetch eos2ta5 > eos2ta5E.log 2>&1`. It produces the following error:

EmptyOutputError

Error message:

Model API eos2ta5:predict did not produce an output

Describe the steps to reproduce the behavior

Expected behavior.

The model is fetched successfully.

Screenshots.

eos2ta5E.log

Attached is the log file containing the output while running.

Operating environment

Ubuntu 20.04.1 LTS (WSL) on Windows 11

Additional context

All packages are installed and connected to the internet.

Zainab-ik commented 2 years ago

@GemmaTuron Kindly check, thank you.

GemmaTuron commented 2 years ago

Hi @Zainab-ik !

It seems we have an internet connection issue, probably due to a VPN or proxy network. See my comments in #352. Increasing the buffer size might also help: https://stdworkflow.com/877/error-rpc-failed-curl-56-gnutls-recv-error-54-error-in-the-pull-function

GemmaTuron commented 2 years ago

@Zainab-ik as we discussed in Slack, please try this model in Google Colab. If it works there, we can close this issue.

Zainab-ik commented 2 years ago

@GemmaTuron Currently on it in colab. Thank you

Zainab-ik commented 2 years ago

@GemmaTuron After running in Colab, I got an error:

ModuleNotFoundError: No module named 'numpy'

Since the error is about a missing module, I tried importing numpy manually, but that resulted in the same error.

image

Here is the link to the Colab: eos2ta5Colab. I ran it on two different operating systems, Ubuntu 20.04 and macOS.

Since this model fails on both the CLI and Colab, I suggest having someone else try it to see whether the issue persists. I believe the issue is in the model itself.

Before every session, I delete the runtime and disconnect before starting again.

Zainab-ik commented 2 years ago

Describe the steps to reproduce the behavior

  • open WSL
  • activate ersilia environment
  • run ersilia -v fetch eos2ta5 > eos2ta5E.log 2>&1
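The same steps can be driven from Python, which is handy when the fetch has to be repeated across environments. This is only a minimal sketch: `run_and_log` is a hypothetical helper name, and it assumes the `ersilia` CLI is on `PATH` in the active environment. It mirrors the shell redirection `> eos2ta5E.log 2>&1` by sending both output streams to one file.

```python
import subprocess


def run_and_log(command, log_path):
    """Run a command and write stdout and stderr to a single log file.

    Roughly equivalent to `command > log_path 2>&1` in a shell.
    Returns the exit code of the process.
    """
    with open(log_path, "w") as log_file:
        completed = subprocess.run(
            command,
            stdout=log_file,
            stderr=subprocess.STDOUT,  # merge stderr into the same file
            check=False,               # report the exit code instead of raising
        )
    return completed.returncode


# Example call (assumes `ersilia` is installed in the active environment):
# exit_code = run_and_log(["ersilia", "-v", "fetch", "eos2ta5"], "eos2ta5E.log")
```

A non-zero exit code together with the log file gives maintainers the same information as the attached eos2ta5E.log.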

Operating environment

I also ran it from the macOS command line and encountered the same error.

Zainab-ik commented 2 years ago

Going through the log file, I noticed that some modules were uninstalled and then installed again. However, numpy was uninstalled and never reinstalled.

image

This is likely a cause of the error.
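The observation above can be checked mechanically rather than by reading the log by eye. The sketch below scans a pip log for the standard "Successfully uninstalled" and "Successfully installed" lines and reports packages that were removed but not installed again afterwards. The function names are hypothetical, and the sample log is made up for illustration (only the numpy version comes from this thread).

```python
import re

UNINSTALLED = re.compile(r"Successfully uninstalled (\S+)")
INSTALLED = re.compile(r"Successfully installed (.+)")


def package_name(name_and_version):
    """Strip the trailing '-<version>' from a pip 'name-version' token."""
    return name_and_version.rsplit("-", 1)[0].lower()


def uninstalled_but_not_reinstalled(log_text):
    """Return packages pip removed and did not install again later in the log."""
    pending = set()
    for line in log_text.splitlines():
        match = UNINSTALLED.search(line)
        if match:
            pending.add(package_name(match.group(1)))
            continue
        match = INSTALLED.search(line)
        if match:
            for token in match.group(1).split():
                pending.discard(package_name(token))
    return sorted(pending)


# Illustrative log excerpt, not taken from the attached log file.
sample = """
  Attempting uninstall: numpy
    Successfully uninstalled numpy-1.21.6
  Attempting uninstall: h5py
    Successfully uninstalled h5py-3.1.0
Successfully installed h5py-2.10.0
"""
print(uninstalled_but_not_reinstalled(sample))  # prints ['numpy']
```

An empty result would suggest the missing module has another cause, such as the interpreter looking in a different site-packages directory.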

Zainab-ik commented 2 years ago

@pauline-banye For this model, I tried the solution you suggested in #380 for importing a module that is not found, but it didn't work.

Zainab-ik commented 2 years ago

@GemmaTuron

GemmaTuron commented 2 years ago

Thanks @Zainab-ik !

This error is not consistent (I can get over it on my CLI, and sometimes on Colab) but it does pop an error sometimes around the numpy package, even after deleting the runtime and restarting. Flagging this for further checking - mark in purple in excel

GemmaTuron commented 2 years ago

Can you change the title to: eos2ta5 - Colab - Numpy module not found

Zainab-ik commented 2 years ago

Hi @GemmaTuron title changed.

Zainab-ik commented 2 years ago

> Thanks @Zainab-ik !
>
> This error is not consistent (I can get over it on my CLI, and sometimes on Colab) but it does pop an error sometimes around the numpy package, even after deleting the runtime and restarting. Flagging this for further checking - mark in purple in excel

@GemmaTuron Thanks. I'd like to update you: I can now get past the error in the CLI after going through the process explained in #372, and I've been able to fetch and predict. The numpy error still occurs in Colab, although there has been a small change to the environment in which I run the model.

GemmaTuron commented 2 years ago

Thanks @Zainab-ik !

We'll have a look at the colab error and get back to you. Leave it in purple on the excel file please.

carcablop commented 1 year ago

Hello @GemmaTuron.

I have been working on this bug in Google Colab. The initial error is a dependency conflict raised when fetching the model. More specifically, the conflict is with h5py: the installed version is incompatible with the one required by Ersilia.

Image

Since this is a pip dependency problem, I tried upgrading pip, and then installing, in a code cell, the h5py version that Ersilia needs. This did not work; the error continued. I also got further errors such as:

15:47:19 | DEBUG | Error occurred while running: bash /tmp/ersilia-q4dvi032/script.sh > /tmp/ersilia-k92k1yy7/installs.log 2>&1

Error: Failed to call git rev-parse --git-dir --show-toplevel: "fatal: not a git repository (or any of the parent directories): .git\n"
Git LFS initialized.

I therefore decided to install everything from scratch, reinstalling conda and its dependencies while keeping the Python version at 3.7. This worked for me, and I no longer get h5py dependency errors. I share the template here: https://colab.research.google.com/drive/1XL0B1KKr3l0jW1x8glm0GFuNKVaqubRt#scrollTo=zpfFj8tkPYO7

Now I have a numpy error:

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behavior is the source of the following dependency conflicts. tensorflow 2.3.1 requires numpy<1.19.0,>=1.16.0, but you have numpy 1.21.6 which is incompatible.

So I changed the version of numpy to 1.18.5.
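The constraint in the pip message can be checked by hand. The following is a minimal sketch that compares plain dotted release numbers as integer tuples; `parse_version` and `in_range` are illustrative helpers and only handle simple versions like the ones quoted here, not pre-releases or other version forms.

```python
def parse_version(text):
    """Turn a plain dotted release such as '1.18.5' into a tuple of ints."""
    return tuple(int(part) for part in text.split("."))


def in_range(version, lower_inclusive, upper_exclusive):
    """True when lower_inclusive <= version < upper_exclusive."""
    return (
        parse_version(lower_inclusive)
        <= parse_version(version)
        < parse_version(upper_exclusive)
    )


# Range quoted in the pip message: numpy<1.19.0,>=1.16.0
for candidate in ("1.21.6", "1.18.5"):
    print(candidate, in_range(candidate, "1.16.0", "1.19.0"))
# prints:
# 1.21.6 False
# 1.18.5 True
```

For anything beyond simple release numbers, the `packaging` library's `SpecifierSet` is probably the more robust choice, if it is available in the environment.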

Since Google Colab comes with its own pre-installed Python (3.8), I checked its pre-installed packages and found numpy at version 1.21.6, so I changed that to 1.18.5 as well. I listed the packages to verify that the version had indeed changed. However, the numpy error still comes up when I fetch the model, even after this change. The error that appears at the end:

Image
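When two Python installations coexist, as with the conda Python 3.7 and the pre-installed Python 3.8 described above, a package listing can report one version while the running interpreter resolves another. A small sketch for checking what the current interpreter actually sees is shown below. `installed_version` is a hypothetical helper, and `importlib.metadata` requires Python 3.8 or later, so on a 3.7 interpreter this would presumably need the `importlib_metadata` backport instead.

```python
import sys
from importlib import metadata


def installed_version(package):
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Show which interpreter is running and which numpy it can see, if any.
print("interpreter:", sys.executable)
print("numpy:", installed_version("numpy"))
```

Running this once in a notebook cell and once inside the conda environment should make it clear whether the two are looking at the same numpy.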

carcablop commented 1 year ago

Hello @GemmaTuron and @miquelduranfrigola. I'm going to mention some of the things I've tried to deal with this problem.

  1. I ran the model in Google Colab, but the fetch did not succeed: dependency errors appear for h5py and numpy, whose versions are not compatible with the TensorFlow version configured in the model's Dockerfile.
  2. I checked that in the CLI the model fetches successfully and makes predictions, so I focused on the problem in Google Colab:
     2.1. I created a new template in Google Colab and reinstalled everything needed to fetch the model. On fetching, I no longer got the h5py dependency error, but I now got an error that the NumPy version is incompatible with TensorFlow.
     2.2. I installed the NumPy version compatible with the TensorFlow version in a code cell before fetching. This didn't work; the problem persisted.
     2.3. I modified the model's Dockerfile, updating TensorFlow to the most recent version (2.9.2). This implied many more changes to the original model's scripts, such as updating Keras, which in turn changed how the libraries are imported. Since this solution was more complex, I went back to the original problem.
     2.4. I reverted to the previous TensorFlow version and added to the Dockerfile the h5py version compatible with it. When I ran the model again, it fetched successfully, although the dependency errors for h5py and numpy kept appearing.
     2.5. In the CLI the fetch also succeeded, but this time, checking the log, I could see that the h5py and numpy dependency errors appeared there too.

image

image

image

In conclusion, the model does have a dependency problem with h5py and numpy for the TensorFlow version initially configured in the Dockerfile (2.3.1). In the CLI it runs successfully, but in Google Colab it sometimes runs and sometimes fails. These dependency problems may be affecting the model's execution in Google Colab.

2.6. Predictions were made, but I could see in the output file that for some inputs the model is not able to make a prediction. I share the output files from the prediction run: eos2ta5.csv log_eost2a5.log

I don't know whether this is due to the TensorFlow version and its incompatibility with the other dependencies (h5py and NumPy). @miquelduranfrigola I would like to make sure that the model is making its predictions correctly: which molecules with already known results could I test it on? If the model is not predicting well, I would like the opportunity to review the errors and try a solution that only modifies the Dockerfile, before updating TensorFlow to the most recent version, since that implies more changes in the code. In Google Colab, I also tried deleting the runtime, and I tried restarting it from the runtime option in the Colab menu. This didn't work either.

carcablop commented 1 year ago

Update 1. To fix the error, the Ersilia Google Colab template was modified here. The path in the PYTHONPATH environment variable was not specified correctly, so it was changed as follows: `%env PYTHONPATH= "$PYTHONPATH:/usr/local/lib/python3.7/site-packages"`. With this change, the issue is solved and the model can be fetched successfully both in the CLI and in Google Colab.
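A quick way to see whether a fix of this kind has taken effect is to check that the directory is on the interpreter's import search path and that numpy can be located. The sketch below uses only the standard library; `on_search_path` and `can_import` are hypothetical helper names, and the directory is the one quoted in the update above. Note that `PYTHONPATH` is read when an interpreter starts, so this check is most meaningful when run in the same kind of process that performs the fetch.

```python
import importlib.util
import sys


def on_search_path(directory, search_path=None):
    """True if `directory` is one of the entries Python searches for imports."""
    entries = sys.path if search_path is None else search_path
    wanted = directory.rstrip("/")
    return any(entry.rstrip("/") == wanted for entry in entries)


def can_import(module_name):
    """True if Python can locate a top-level module without importing it."""
    return importlib.util.find_spec(module_name) is not None


# Directory taken from the PYTHONPATH setting quoted in this thread.
site_packages = "/usr/local/lib/python3.7/site-packages"
print("on sys.path:", on_search_path(site_packages))
print("numpy importable:", can_import("numpy"))
```

If the directory is reported as missing from the search path while numpy is installed there, that would be consistent with the ModuleNotFoundError described earlier in the thread.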

Zainab-ik commented 1 year ago

Glad to know this issue has been resolved. Been keeping tabs. Thanks @carcablop