MiraldiLab / maxATAC

Transcription Factor Binding Prediction from ATAC-seq and scATAC-seq with Deep Neural Networks
Apache License 2.0

Docker and install #65

Closed FaizRizvi closed 2 years ago

FaizRizvi commented 2 years ago

I am running into an install issue and would like some input.

On the refactor branch I can install maxatac after creating a virtualenv using the following commands:

python3.9 -m venv test_env

or

conda create -n test_env python=3.9

then install the requirements with: pip3 install -r py3.9_requirements.txt

This works: typing the command maxatac shows that the install was successful.

The problem arises when I need to create a Docker image from a Dockerfile (https://github.com/MiraldiLab/maxATAC/blob/refactor/Dockerfile).

Running: docker build -t maxatac_docker .

shows that the docker image is not created:

[Screenshot of the failed docker build output, 2021-10-18]

this is due to the following line in py3.9_requirements.txt: -e git+https://github.com/MiraldiLab/maxATAC.git@a3fb0c965e7d5323fa35d22b467434eeff7b658e#egg=maxatac

If I remove this line, the image builds. However, removing it prevents maxatac from being installed with conda or virtualenv. Does anyone have a workaround for this issue?

michael-kotliar commented 2 years ago

Hi Faiz, take a look at the following lines of your docker build logs:

 > [5/5] RUN pip3 install -r py3.9_requirements.txt:
#8 1.322 Obtaining maxatac from git+https://github.com/MiraldiLab/maxATAC.git@a3fb0c965e7d5323fa35d22b467434eeff7b658e#egg=maxatac (from -r py3.9_requirements.txt (line 22))
#8 1.322   Cloning https://github.com/MiraldiLab/maxATAC.git (to revision a3fb0c965e7d5323fa35d22b467434eeff7b658e) to ./src/maxatac
#8 1.323   Running command git clone -q https://github.com/MiraldiLab/maxATAC.git /maxATAC/src/maxatac
#8 1.535   fatal: could not read Username for 'https://github.com': No such device or address
#8 1.539 WARNING: Discarding git+https://github.com/MiraldiLab/maxATAC.git@a3fb0c965e7d5323fa35d22b467434eeff7b658e#egg=maxatac. Command errored out with exit status 128: git clone -q https://github.com/MiraldiLab/maxATAC.git /maxATAC/src/maxatac Check the logs for full command output.

When running inside docker, your GitHub username and password are not available, so git cannot clone the maxATAC repository.

I would remove

-e git+https://github.com/MiraldiLab/maxATAC.git@a3fb0c965e7d5323fa35d22b467434eeff7b658e#egg=maxatac

line from the py3.9_requirements.txt file and install maxATAC separately by running pip3 install . in the maxATAC folder that you ADDed.
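
Dropping the editable git line before the build could also be scripted. This is a minimal sketch, assuming the requirements file is plain text with one requirement per line. The function name strip_editable_vcs is hypothetical (it is not part of maxATAC or pip), and the numpy entry in the example is only a placeholder.

```python
def strip_editable_vcs(requirements_text: str) -> str:
    """Return the requirements text without editable VCS lines (-e git+...)."""
    kept = []
    for line in requirements_text.splitlines():
        stripped = line.strip()
        # pip accepts both "-e" and "--editable" for editable installs
        is_editable = stripped.startswith(("-e ", "--editable "))
        if is_editable and "git+" in stripped:
            continue  # this entry needs a git clone at build time; skip it
        kept.append(line)
    return "\n".join(kept) + "\n"


# Placeholder requirements: only the -e line is taken from this thread.
example = (
    "numpy\n"
    "-e git+https://github.com/MiraldiLab/maxATAC.git"
    "@a3fb0c965e7d5323fa35d22b467434eeff7b658e#egg=maxatac\n"
    "tensorflow==2.5.0\n"
)
cleaned = strip_editable_vcs(example)
print(cleaned)
```

The cleaned text can be written to the file that the Dockerfile installs from, while the original file stays untouched for conda or virtualenv installs.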

Some of the dockerfile examples can be found here

michael-kotliar commented 2 years ago

Also, try this one, but first remove everything that is not a Python package from py3.9_requirements.txt.

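FROM python:3.9
WORKDIR /tmp
# COPY . keeps the directory layout; with a wildcard source, each matched
# directory's contents are copied instead of the directory itself
COPY . maxATAC
# -c uses the file only as version constraints for maxatac's dependencies
RUN cd maxATAC && pip3 install . -c py3.9_requirements.txt
CMD ["maxatac"]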

Then check the build logs to see which non-Python components need to be installed separately (not with pip3).

FaizRizvi commented 2 years ago

Thank you and Andrew for the suggestions! I was able to get the workaround working. I have another question:

pip3 install . fails to install maxatac, so I am still using pip3 install -e .

I think I remember that a change is needed in setup.py. What would I need to change?

michael-kotliar commented 2 years ago

Usually, you would use the -e option only when you plan to update the Python code in place during development. In all other cases, run pip3 install name_of_the_package or pip3 install .

michael-kotliar commented 2 years ago

Feel free to reopen it if needed

FaizRizvi commented 2 years ago

I tried the following:

conda create -n maxatac_R_Thresh
conda activate maxatac_R_Thresh
pip3 install -r requirements
pip3 install .

and got the following error message:

(maxatac_R_Thresh) ➜ maxatac git:(refactor_threshold) maxatac
Traceback (most recent call last):
  File "/Users/war9qi/opt/anaconda3/envs/maxatac_R_Thresh/bin/maxatac", line 6, in <module>
    from maxatac.utilities.logger import setup_logger
ModuleNotFoundError: No module named 'maxatac'

I then reinstall with

pip3 install -e .

and it works.
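
When a regular install fails but an editable one works, it can help to check where Python would import the package from. This is a minimal diagnostic sketch using only the standard library. The helper name locate is hypothetical, and it is meant for top-level module names only.

```python
import importlib.util


def locate(module_name: str):
    """Return the file a top-level module would be imported from, or None."""
    spec = importlib.util.find_spec(module_name)
    if spec is None:
        return None  # importing this name would raise ModuleNotFoundError
    return spec.origin


# Run this inside the environment that shows the error.
print("maxatac ->", locate("maxatac"))
```

With an editable install the path usually points into the source checkout. With a regular install it should point into the environment's site-packages directory, and None means the package was not installed there at all.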

FaizRizvi commented 2 years ago

Additionally,

pip3 install -c requirements 

does not install the requirements for some reason
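
One likely explanation, going by pip's documented behaviour rather than anything stated in this thread: -c names a constraints file, which only limits the versions of packages that are installed for some other reason, while -r names a requirements file whose entries are themselves installed. The sketch below only builds the argument lists so that the difference is visible. pip_install_args is a hypothetical helper, and requirements.txt stands in for the file name used above.

```python
import sys


def pip_install_args(targets=(), requirements=None, constraints=None):
    """Build (but do not run) a `python -m pip install` argument list."""
    if not targets and requirements is None:
        # A constraints file on its own gives pip nothing to install.
        raise ValueError("need at least one target or a requirements file")
    args = [sys.executable, "-m", "pip", "install", *targets]
    if requirements is not None:
        args += ["-r", requirements]  # every entry in the file is installed
    if constraints is not None:
        args += ["-c", constraints]  # only limits versions of what gets installed
    return args


# Installs the packages listed in the file:
print(pip_install_args(requirements="requirements.txt")[3:])
# Installs the current directory, with versions limited by the file:
print(pip_install_args(targets=["."], constraints="requirements.txt")[3:])
```

Under this reading, the constraints file takes effect only when it is combined with something to install, such as pip3 install . -c followed by the file name.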

FaizRizvi commented 2 years ago

I was able to upload a test release of maxatac to test.pypi.org. I tried to install it on a different computer with the following commands:

pip install -r py3.9_requirements.txt

pip install -i https://test.pypi.org/simple/ maxatac==1.0.20211104162003

It builds maxatac but fails with a similar error:

(maxatac_test) [rizvifw@owens-login04 maxATAC]$ maxatac
Traceback (most recent call last):
  File "/users/PES0808/rizvifw/.conda/envs/maxatac_test/bin/maxatac", line 6, in <module>
    from maxatac.utilities.logger import setup_logger
ModuleNotFoundError: No module named 'maxatac.utilities'

michael-kotliar commented 2 years ago

As we discussed today, the solution is simple: just add an __init__.py file to every directory from which you want to import Python files :)
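
A quick way to find the directories that still need an __init__.py is to walk the package tree. This is a minimal sketch using only the standard library. The function name dirs_missing_init is hypothetical, and the demo tree is a throwaway layout modelled on the import in the traceback above, not the real repository.

```python
import os
import tempfile


def dirs_missing_init(package_root: str):
    """List directories under package_root that hold .py files but no __init__.py."""
    missing = []
    for dirpath, dirnames, filenames in os.walk(package_root):
        # Do not descend into caches or hidden directories
        dirnames[:] = [
            d for d in dirnames if d != "__pycache__" and not d.startswith(".")
        ]
        has_python = any(name.endswith(".py") for name in filenames)
        if has_python and "__init__.py" not in filenames:
            missing.append(os.path.relpath(dirpath, package_root))
    return sorted(missing)


# Demo on a temporary tree: maxatac/ has __init__.py, maxatac/utilities/ does not.
with tempfile.TemporaryDirectory() as tmp:
    pkg = os.path.join(tmp, "maxatac")
    os.makedirs(os.path.join(pkg, "utilities"))
    open(os.path.join(pkg, "__init__.py"), "w").close()
    open(os.path.join(pkg, "utilities", "logger.py"), "w").close()
    demo_missing = dirs_missing_init(pkg)
    print(demo_missing)
```

This would also explain why pip3 install -e . worked: an editable install imports straight from the source checkout, where a directory without __init__.py can still be imported as a namespace package. A regular install copies only the packages that the build discovers, and, assuming setup.py relies on setuptools' find_packages(), directories without __init__.py are skipped.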

FaizRizvi commented 2 years ago

Thank you! Worked like a charm!

FaizRizvi commented 2 years ago

Now I am having trouble with a PyPI-based install. Here are the commands and output:

(base) ➜ maxATAC git:(refactor) conda activate test8
(test8) ➜ maxATAC git:(refactor) pip install -i https://test.pypi.org/simple/ maxatac -c py3.9_requirements.txt

Looking in indexes: https://test.pypi.org/simple/
Collecting maxatac
  Downloading https://test-files.pythonhosted.org/packages/b7/1d/09f748d3779019fc346b4bdd63e4a8d9e242c7d106c19e37ff9e53f9813e/maxatac-1.0.202111091409411114.tar.gz (45 kB)
     |████████████████████████████████| 45 kB 2.6 MB/s
  Downloading https://test-files.pythonhosted.org/packages/af/8b/b4420ebff6143e422b31954c4f05eb703f904bf6861e491f8e4cd3ef6d65/maxatac-1.0.202111091409411113.tar.gz (45 kB)
     |████████████████████████████████| 45 kB 6.2 MB/s
ERROR: Cannot install maxatac==1.0.202111091409411113 and maxatac==1.0.202111091409411114 because these package versions have conflicting dependencies.

The conflict is caused by:
    maxatac 1.0.202111091409411114 depends on tensorflow
    maxatac 1.0.202111091409411113 depends on tensorflow>=2.5.0
    The user requested (constraint) tensorflow==2.5.0

To fix this you could try to:

  1. loosen the range of package versions you've specified
  2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/user_guide/#fixing-conflicting-dependencies
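
A common stumbling block with Test PyPI, offered here as a possibility rather than a diagnosis of the error above: -i replaces the default index, so dependencies such as tensorflow are also looked up on test.pypi.org, which is a separate index and does not mirror PyPI. pip's --extra-index-url option adds a second index to search. The sketch below only builds such a command. install_from_test_pypi_args is a hypothetical helper name.

```python
import sys

TEST_PYPI = "https://test.pypi.org/simple/"
PYPI = "https://pypi.org/simple/"


def install_from_test_pypi_args(package, constraints=None):
    """Build (but do not run) a pip command that searches Test PyPI and PyPI."""
    args = [
        sys.executable, "-m", "pip", "install",
        "--index-url", TEST_PYPI,    # where the test upload of the package lives
        "--extra-index-url", PYPI,   # where the regular dependencies live
        package,
    ]
    if constraints is not None:
        args += ["-c", constraints]
    return args


cmd = install_from_test_pypi_args("maxatac", constraints="py3.9_requirements.txt")
print(" ".join(cmd[1:]))
```

pip does not give one index priority over the other, so this pattern is best kept for testing an upload rather than for routine installs.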

FaizRizvi commented 2 years ago
michael-kotliar commented 2 years ago

Try removing tensorboard from setup.py.

tacazares commented 2 years ago

What steps are left for us to be able to do a pip install maxatac?

FaizRizvi commented 2 years ago

We need to publish maxatac V_1.0.0 on PyPI (not Test PyPI, where it is already working). Before that, we should probably merge refactor into develop, merge develop into main, and trim away the other branches.

tacazares commented 2 years ago

I think we need to make our software available on PyPI now. We are essentially ready to publish on bioRxiv and submit for review. Our collaborators need to be able to review our code and package easily, and reviewers need to be able to access the package. I vote that we review the code base, test prepare, normalize, predict, peaks, average, and benchmark to make sure they are completely error free, and then hand off to our collaborators for testing.