ersilia-os / eos74bo

GNU General Public License v3.0

Model Testing #17

Closed GemmaTuron closed 1 year ago

GemmaTuron commented 1 year ago

Hi,

Please test that this model works on both Colab and the CLI, thanks!

DhanshreeA commented 1 year ago

The model works both on CLI and Colab. Attaching the outputs here: eos74bo_colab_output.csv eos74bo_output_5.csv

But I also see that there's an issue open for the model regarding the outputs. Do we need to test again after the issue is resolved?

karthikjetty commented 1 year ago

Hi! Below is the link to the model fetched and tested in Colab:

https://colab.research.google.com/drive/1TKXHQZC7RdwktwbXH5TSGx9uq9zDBL0E#scrollTo=ipckLYxPS3GY

The model does not work when I run it in my CLI. When I try fetching the model in CLI, I get the following error.

Ersilia exception class: ModelPackageInstallError

Detailed error: Error occured while installing package by running "bash /var/folders/05/jz2x9xjs0yzc5wtg7mnjs6080000gn/T/ersilia-w2ctdpp1/script.sh > /var/folders/05/jz2x9xjs0yzc5wtg7mnjs6080000gn/T/ersilia-n64lko_7/installs.log 2>&1" command

Here is my log file. eos74bo.log

The solution proposed in the error message is to go into the environment manually and run the command. So, I ran "bash /var/folders/05/jz2x9xjs0yzc5wtg7mnjs6080000gn/T/ersilia-w2ctdpp1/script.sh > /var/folders/05/jz2x9xjs0yzc5wtg7mnjs6080000gn/T/ersilia-n64lko_7/installs.log 2>&1" in the eos74bo conda environment. After doing this and fetching the model again, I still got the same error. I will look through the .log file more closely to see whether any specific packages are causing the error.

GemmaTuron commented 1 year ago

> But I also see that there's an issue open for the model regarding the outputs. Do we need to test again after the issue is resolved?

No, I think we solved this already, thanks!

GemmaTuron commented 1 year ago

Hello Karthik,

You have had trouble with models that worked fine for others, so I think the issue is in your installation. The first thing is to make sure you are using Bash, not Zsh.

karthikjetty commented 1 year ago

I did some research online, and I think all files that work in Bash should also work in Zsh. Moreover, I am able to fetch the majority of other models. There was only one other model that I wasn't able to get running on my system.

I looked through the errors and saw that torch might be the package causing the error because the log says “Looking in links: https://download.pytorch.org/whl/torch_stable.html ERROR: Could not find a version that satisfies the requirement torch==1.6.0+cpu (from versions: 0.4.1, 1.0.0, 1.0.1, 1.0.1.post2, 1.1.0, 1.1.0.post2, 1.2.0, 1.3.0, 1.3.0.post2, 1.3.1, 1.4.0, 1.5.0, 1.5.1, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, 1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1) ERROR: No matching distribution found for torch==1.6.0+cpu”

When I tried downloading PyTorch, I still got the same error as before. I will go ahead and download the other packages that are required and see if that fixes the issue. Otherwise, I’m not sure where the error might be from.
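To make the pip message above easier to read, here is a minimal sketch that pulls the "from versions" list out of the error text and checks whether the pinned version is offered with or without the "+cpu" local label. The helper names `parse_pip_versions` and `diagnose` are made up for illustration and are not part of pip or Ersilia.

```python
import re

# Error text copied from the log quoted above.
LOG = (
    "ERROR: Could not find a version that satisfies the requirement "
    "torch==1.6.0+cpu (from versions: 0.4.1, 1.0.0, 1.0.1, 1.0.1.post2, "
    "1.1.0, 1.1.0.post2, 1.2.0, 1.3.0, 1.3.0.post2, 1.3.1, 1.4.0, 1.5.0, "
    "1.5.1, 1.6.0, 1.7.0, 1.7.1, 1.8.0, 1.8.1, 1.9.0, 1.9.1, 1.10.0, "
    "1.10.1, 1.10.2, 1.11.0, 1.12.0, 1.12.1, 1.13.0, 1.13.1)"
)


def parse_pip_versions(message):
    """Return the versions pip lists after 'from versions:' (hypothetical helper)."""
    match = re.search(r"\(from versions: ([^)]*)\)", message)
    if match is None:
        return []
    return [v.strip() for v in match.group(1).split(",") if v.strip()]


def diagnose(pinned, available):
    """Say whether the exact pin, or only its public part, is on offer."""
    if pinned in available:
        return "exact match available"
    public = pinned.split("+", 1)[0]  # drop the local label such as '+cpu'
    if public in available:
        return "only the public version is available"
    return "version not available"


available = parse_pip_versions(LOG)
result = diagnose("1.6.0+cpu", available)
print(result)
```

On this log the check reports that 1.6.0 is listed but 1.6.0+cpu is not, which points at the "+cpu" label rather than at the version number.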

GemmaTuron commented 1 year ago

Hi @karthikjetty ,

Did you troubleshoot the error you were having on model eos935? I recall that one was also giving problems to you in particular. Regarding Bash and Zsh: in most cases it won't make a difference, but because we are debugging and testing other people's code, and everything we do is developed in Bash, it is safer to stay in Bash; there are some differences that might be causing errors. Regarding the PyTorch error: installing other packages won't make a difference, as it is due to incompatibilities with your Python paths. Can you check which Python is running in the environment, and which Python is used when installing through conda? If they are not the same, you might need to specify pip3, for example. See more information here: https://github.com/pytorch/pytorch/issues/47354

Let me know when you have tried it.
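One way to run the Python path check suggested above is the sketch below, executed from inside the activated model environment. It only uses the standard library; the function name `interpreter_report` is an assumption for illustration.

```python
import shutil
import subprocess
import sys


def interpreter_report():
    """Collect which Python and pip the current environment resolves to."""
    pip_info = subprocess.run(
        [sys.executable, "-m", "pip", "--version"],  # pip bound to this interpreter
        capture_output=True,
        text=True,
        check=False,
    ).stdout.strip()
    return {
        "python": sys.executable,           # interpreter running this script
        "prefix": sys.prefix,               # environment root for that interpreter
        "pip_module": pip_info,             # includes the path pip is installed under
        "pip_on_path": shutil.which("pip"),     # what a bare `pip` would run
        "pip3_on_path": shutil.which("pip3"),   # what a bare `pip3` would run
    }


report = interpreter_report()
for key, value in report.items():
    print(f"{key}: {value}")
```

If the `pip_on_path` entry points outside the environment shown in `prefix`, a bare `pip install` is probably installing into a different Python than the one the model uses.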

karthikjetty commented 1 year ago

I think the cause might be that, since I am running this on a MacBook, there is no package called 1.6.0+cpu. I looked through the PyTorch website, and the CPU designation is only for Linux/Windows computers. I tried manually installing torch, but I realized I had installed 1.13.1. I will try installing 1.6.0 for macOS and see if that fixes the issue. I checked the Python paths, and the one I am running in the environment is the same one that downloads the file through conda.

Below is a screenshot showing that the 1.6.0+cpu option is present for Windows, whereas for the Mac option there is no such requirement.

Screenshot 2023-02-15 at 8 43 37 PM

I clicked on the link you provided and see that many people are experiencing the same issue, and that installing it using different methods seems to provide a solution. All the people experiencing the issue have Windows computers, while the people with Macs aren't downloading the CPU version. I will try some of the variations provided in the link, but I think it should work if I run the non-CPU version. I experienced the same issue while trying to test eos93h2, so if I fix this, I should be able to run both of the models.

GemmaTuron commented 1 year ago

From this: https://pytorch.org/get-started/locally/ it seems CPU is the default on Mac, so you can try specifying only the version, without the cpu or gpu suffix.

GemmaTuron commented 1 year ago

Hello @karthikjetty

Please do provide an update.

karthikjetty commented 1 year ago

After I got Ersilia working again, I tried to install the model, but I keep getting the same error as before.

Steps I've done to attempt to solve this issue:

1) I have opened a new User on my laptop and downloaded a fresh Ersilia onto my computer, but have still received the same error.
2) I have tried downloading torch onto my system, and then fetching this model, but that doesn't seem to work.
3) The hint suggestion was to go into the model environment and download the package. I did this, but I'm not sure if this does anything, because when I go to fetch the model again it deletes the old environment that is there, which includes

One thing I did notice is that when I try running "pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html" I get an error on my system and nothing gets downloaded.

I went back through some old repositories and tried fetching models that had "install torch==1.x.x+cpu". I came across eos5505 and eos8ykt and tried fetching these models, but received the same error that I currently get when fetching both this model and the other model I am supposed to be testing.

I wonder whether changing the Dockerfile to say "RUN pip install torch==1.6.0 -f https://download.pytorch.org/whl/torch_stable.html" instead of "RUN pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html" would fix the error that I am facing (although it would mess with all Windows users).
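As a rough sketch of that idea, the requirement string could be chosen per platform instead of being hard-coded. This assumes that macOS wants the plain version while Linux and Windows keep the "+cpu" label, which has not been confirmed here; `torch_requirement` is a hypothetical helper, not something Ersilia provides.

```python
import platform


def torch_requirement(version="1.6.0", system=""):
    """Build the pip requirement for torch on the given (or current) OS."""
    system = system or platform.system()  # 'Darwin' is what macOS reports
    if system == "Darwin":
        # Assumption: macOS wheels carry no '+cpu' local label.
        return f"torch=={version}"
    # Assumption: Linux and Windows keep the CPU-only label.
    return f"torch=={version}+cpu"


mac_pin = torch_requirement(system="Darwin")
linux_pin = torch_requirement(system="Linux")
print(mac_pin)
print(linux_pin)
```

A plain `torch==1.6.0` pin also accepts a candidate that carries a local label, because the label is ignored when the requirement itself has none, so dropping "+cpu" everywhere may be the simpler option if it also works on Linux and Windows.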

GemmaTuron commented 1 year ago

Hi @karthikjetty

Can you try the following: download the repository to your system (via git clone), change the installation instructions, and fetch it with the --repo-path flag, to make sure your proposed solution works on your system.

@pauline-banye if we do not add the +cpu, it does not work on Linux, right?
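For the "change the installation instructions" step, a small sketch like the one below could strip the "+cpu" label from the torch pin in the cloned Dockerfile. The regular expression and the `strip_cpu_label` name are assumptions for illustration; editing the line by hand does the same thing.

```python
import re

# Matches pins such as torch==1.6.0+cpu or torchvision==0.7.0+cpu.
TORCH_CPU_PIN = re.compile(r"(torch(?:vision|audio)?==[0-9][0-9A-Za-z.]*)\+cpu\b")


def strip_cpu_label(dockerfile_text):
    """Return the Dockerfile text with '+cpu' removed from torch pins."""
    return TORCH_CPU_PIN.sub(r"\1", dockerfile_text)


original = (
    "RUN pip install torch==1.6.0+cpu "
    "-f https://download.pytorch.org/whl/torch_stable.html"
)
patched = strip_cpu_label(original)
print(patched)
```

After saving the patched Dockerfile in the clone, the model can be fetched from that local copy with the repo path flag mentioned above.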

paulinebanye commented 1 year ago

Hi @GemmaTuron, during the initial testing stage I did not add the +cpu. It is not specified in the original code either, so it might work without it. I haven't tested the code without it recently.

GemmaTuron commented 1 year ago

Can you try it without, @pauline-banye, and let me know whether the cpu flag is required on your system?

@DhanshreeA and @carcablop, can you recall which other models needed the +cpu flag? That way we can test whether they have the same problem on Karthik's system. If they do, this will be a Mac-wide issue.
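If the model repositories are cloned side by side in one folder, a sketch like this could list which of them pin a "+cpu" torch build. The folder layout, the placeholder model names and the `models_with_cpu_pin` helper are all assumptions for illustration; the demo writes two made-up Dockerfiles into a temporary folder.

```python
import re
import tempfile
from pathlib import Path

CPU_PIN = re.compile(r"torch[a-z]*==[^\s+]+\+cpu")


def models_with_cpu_pin(root):
    """Map each model folder under root to the '+cpu' pins in its Dockerfile."""
    hits = {}
    for dockerfile in sorted(Path(root).glob("*/Dockerfile")):
        pins = CPU_PIN.findall(dockerfile.read_text())
        if pins:
            hits[dockerfile.parent.name] = pins
    return hits


# Demo with made-up Dockerfiles (placeholder names, not real model contents).
with tempfile.TemporaryDirectory() as tmp:
    for name, line in [
        ("model_a", "RUN pip install torch==1.6.0+cpu -f https://download.pytorch.org/whl/torch_stable.html\n"),
        ("model_b", "RUN pip install rdkit\n"),
    ]:
        folder = Path(tmp) / name
        folder.mkdir()
        (folder / "Dockerfile").write_text(line)
    hits = models_with_cpu_pin(tmp)

print(hits)
```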

paulinebanye commented 1 year ago

Hi @GemmaTuron, it works without the +cpu flag. I was able to run the code successfully both within the local environment and with the Ersilia CLI. eos74bo_fetch.log

carcablop commented 1 year ago

Hi @GemmaTuron For reference, I remember that the eos5505 model has the +cpu flag. https://github.com/ersilia-os/eos5505/blob/main/Dockerfile#L7

GemmaTuron commented 1 year ago

@karthikjetty can you please test eos5505 and let us know if you have the same issue with SKLearn?

Thanks!

carcablop commented 1 year ago

In the eos22io model, the cpu is also specified in this way to install pytorch: https://github.com/ersilia-os/eos22io/blob/main/Dockerfile#L10 I wonder if this could also fail for @karthikjetty .

DhanshreeA commented 1 year ago

@GemmaTuron all the ImageMol models have the cpu flag:

@karthikjetty I'm guessing none of these work on your Mac?

karthikjetty commented 1 year ago

Yeah. I will try the repo cloning solution once I can run Ersilia on my system again. Meanwhile, maybe another Mac user can test the models and see if they get the same error.

Also, before my Ersilia system failed, I looked through the models and tested eos5505 (since it had the +cpu) and it failed.