kr-colab / diploSHIC

feature-based deep learning for the identification of selective sweeps
MIT License

Please document dependency version requirements. #38

Open molpopgen opened 3 years ago

molpopgen commented 3 years ago

This package is currently rather difficult to get working due to tf/keras issues.

Installing current versions leads to import errors when training:

python ../../submodules/diploSHIC/diploSHIC.py train train test model
2021-06-08 14:43:18.605122: W tensorflow/stream_executor/platform/default/dso_loader.cc:64] Could not load dynamic library 'libcudart.so.11.0'; dlerror: libcudart.so.11.0: cannot open shared object file: No such file or directory
2021-06-08 14:43:18.605144: I tensorflow/stream_executor/cuda/cudart_stub.cc:29] Ignore above cudart dlerror if you do not have a GPU set up on your machine.
Traceback (most recent call last):
  File "../../submodules/diploSHIC/diploSHIC.py", line 126, in <module>
    from keras.utils.layer_utils import convert_all_kernels_in_model
ImportError: cannot import name 'convert_all_kernels_in_model' from 'keras.utils.layer_utils' (/home/molpopgen/src/polygenic_classification/prototype/single_core_training/training/lib/python3.8/site-packages/keras/utils/layer_utils.py)

This is for the following versions in a clean venv:

pip show keras
Name: Keras
Version: 2.4.3
Summary: Deep Learning for humans
Home-page: https://github.com/keras-team/keras
Author: Francois Chollet
Author-email: francois.chollet@gmail.com
License: MIT
Location: /home/molpopgen/src/polygenic_classification/prototype/single_core_training/training/lib/python3.8/site-packages
Requires: numpy, pyyaml, h5py, scipy
Required-by: 
(training)
pip show tensorflow
Name: tensorflow
Version: 2.5.0
Summary: TensorFlow is an open source machine learning framework for everyone.
Home-page: https://www.tensorflow.org/
Author: Google Inc.
Author-email: packages@tensorflow.org
License: Apache 2.0
Location: /home/molpopgen/src/polygenic_classification/prototype/single_core_training/training/lib/python3.8/site-packages
Requires: h5py, tensorflow-estimator, typing-extensions, flatbuffers, astunparse, protobuf, keras-nightly, google-pasta, six, tensorboard, wheel, keras-preprocessing, wrapt, gast, absl-py, grpcio, termcolor, numpy, opt-einsum
Required-by: 

Google searches suggest that the import error is fixed by installing earlier versions of tf. However, pip seems to reject that now:

pip --version
pip 20.1.1 from /home/molpopgen/src/polygenic_classification/prototype/single_core_training/training/lib/python3.8/site-packages/pip (python 3.8)
pip install tensorflow==1.15.2
ERROR: Could not find a version that satisfies the requirement tensorflow==1.15.2 (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.2.2, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1, 2.3.2, 2.4.0rc0, 2.4.0rc1, 2.4.0rc2, 2.4.0rc3, 2.4.0rc4, 2.4.0, 2.4.1, 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0)
ERROR: No matching distribution found for tensorflow==1.15.2

pip install tensorflow-cpu==1.15.2
ERROR: Could not find a version that satisfies the requirement tensorflow-cpu==1.15.2 (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.2.2, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1, 2.3.2, 2.4.0rc0, 2.4.0rc1, 2.4.0rc2, 2.4.0rc3, 2.4.0rc4, 2.4.0, 2.4.1, 2.5.0rc0, 2.5.0rc1, 2.5.0rc2, 2.5.0rc3, 2.5.0)
ERROR: No matching distribution found for tensorflow-cpu==1.15.2

The solution to this is to provide a requirements.txt that you have verified to work.
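As a starting point for such a file, the versions installed in an environment where training is known to work can be recorded with a short script. This is a minimal sketch, assuming Python 3.8+ (for importlib.metadata); the package list and the helper name report_versions are illustrative and not part of diploSHIC.

```python
from importlib import metadata

# Hypothetical list of distributions to record; adjust to the real dependencies.
PACKAGES = ["tensorflow", "keras", "numpy", "scipy", "scikit-allel", "scikit-learn"]


def report_versions(names):
    """Return requirements.txt-style lines for the given distributions.

    Installed packages are emitted as exact pins; missing ones are emitted
    as comments so the gap is visible rather than silently skipped.
    """
    lines = []
    for name in names:
        try:
            lines.append(f"{name}=={metadata.version(name)}")
        except metadata.PackageNotFoundError:
            lines.append(f"# {name}: not installed")
    return lines


if __name__ == "__main__":
    print("\n".join(report_versions(PACKAGES)))
```

Running this inside a working venv and saving the output would give exact pins that could later be relaxed into tested ranges.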

jdaron commented 2 years ago

I have been running into the same problem, and I highly recommend creating a virtual environment to run some of the subprograms of diploSHIC.

Here is the workflow I used:

  1. Create a new environment (named ven here) with Python <= 3.7, since the versions of keras and tensorflow compatible with diploSHIC do not work with higher Python versions:

     conda create -n ven python=3.6 anaconda
     conda activate ven

  2. Install the appropriate versions of tensorflow and keras:

     pip install tensorflow==1.15.2
     pip install keras==2.4.3

  3. You're all set to run the diploSHIC.py train and diploSHIC.py predict tools. Note that I only activate ven to run those two tools, and I run the rest of diploSHIC in my normal environment:

     Python 3.8.8
     numpy 1.21.4
     scipy 1.6.2
     pandas 1.2.4
     scikit-allel 1.3.2
     scikit-learn 0.24.1

molpopgen commented 2 years ago

It is possible for it all to work in a single venv, but it was non-trivial to work out. The following works on Python 3.8 and I believe also 3.9:

scikit-allel
# scipy
scipy==1.4.1
matplotlib
keras==2.4.3
tensorflow-cpu<=2.3
sklearn

I've used this on several machines with Pop OS 20.04 and 21.04, which are basically the same as the Ubuntu releases with the same version numbers.
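A note on the tensorflow-cpu<=2.3 line: under pip's version ordering (PEP 440), 2.3 is treated as 2.3.0, so this bound admits 2.3.0 but not the 2.3.1 or 2.3.2 patch releases. The sketch below illustrates that ordering with a simplified stand-in that handles only numeric final releases (not release candidates such as 2.3.0rc0); the helper names are hypothetical, and the real rules live in pip itself.

```python
from itertools import zip_longest


def release_tuple(version):
    """Parse a purely numeric release such as '2.3.1' into (2, 3, 1)."""
    return tuple(int(part) for part in version.split("."))


def at_most(version, bound):
    """True if version <= bound, padding the shorter release with zeros."""
    pairs = zip_longest(release_tuple(version), release_tuple(bound), fillvalue=0)
    v, b = zip(*pairs)
    return v <= b


# Which tensorflow-cpu releases from the pip listing above satisfy "<=2.3"?
for candidate in ["2.2.2", "2.3.0", "2.3.1", "2.3.2", "2.4.0"]:
    print(candidate, at_most(candidate, "2.3"))
```

Only 2.2.2 and 2.3.0 pass the check, so a bound such as <2.4 would be needed if the later 2.3.x patch releases are also meant to be allowed.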

andrewkern commented 2 years ago

hey @molpopgen and @jdaron -- i'm currently working on #39 which cleans up these issues and will provide a clean pip install. i'll close this issue for now and move conversation to that PR if it suits you all ok

andrewkern commented 2 years ago

reopening this issue as @molpopgen points out #39 doesn't provide a sufficient fix to the issues here

andrewkern commented 2 years ago

i'd encourage you guys to check out the new pip package and confirm that it works on your systems. it should install in a clean virtual environment simply with pip install diploshic.

so far i've been able to confirm that this install works on mac and linux systems

jdaron commented 2 years ago

Dear Andrew, I've been testing the new version of diploSHIC, which I installed using pip install diploshic. However, I am still getting an error when launching diploSHIC fvecSim diploid (see below). I get this error with numpy==1.19.5, while with numpy==1.21.4 (pip install -U numpy) the error goes away and the program runs smoothly. Do you think it could be a problem with my conda environment?

RuntimeError: module compiled against API version 0xe but this version of numpy is 0xd
Traceback (most recent call last):
  File "/home/daron/anaconda3/bin/makeFeatureVecsForSingleMsDiploid.py", line 5, in <module>
    from diploshic.msTools import *
  File "/home/daron/anaconda3/lib/python3.8/site-packages/diploshic/__init__.py", line 1, in <module>
    from diploshic.fvTools import *
  File "/home/daron/anaconda3/lib/python3.8/site-packages/diploshic/fvTools.py", line 7, in <module>
    import diploshic.shicstats as dps
ImportError: numpy.core.multiarray failed to import

molpopgen commented 2 years ago

> I generated this error with numpy==1.19.5 while with numpy==1.21.4 (pip install -U numpy) I don't get this error and the program is running smoothly.

Aha-- this is important. The error is due to something being compiled against the numpy C API. This API is fickle. @andrewkern -- you probably have to pin the package to the exact numpy versions used, else the wheels will be very fragile.
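To make the error message concrete: the two hex values in it are NumPy C API version numbers, and in the traceback above the extension was built against a newer API (0xe) than the NumPy imported at runtime provides (0xd), which is consistent with the upgrade to numpy==1.21.4 making the error go away. A minimal sketch of reading those numbers out of the message follows; the helper parse_api_mismatch is hypothetical and exists only for illustration.

```python
import re


def parse_api_mismatch(message):
    """Extract (compiled_against, runtime) API versions from the error text.

    Returns None if the message does not match the expected wording.
    """
    match = re.search(
        r"compiled against API version (0x[0-9a-fA-F]+) "
        r"but this version of numpy is (0x[0-9a-fA-F]+)",
        message,
    )
    if match is None:
        return None
    return int(match.group(1), 16), int(match.group(2), 16)


msg = "module compiled against API version 0xe but this version of numpy is 0xd"
compiled, runtime = parse_api_mismatch(msg)
if runtime < compiled:
    # The installed numpy is older than the one used when the wheel was built.
    print(f"installed numpy API {runtime} is older than build-time API {compiled}")
```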

andrewkern commented 2 years ago

hi @jdaron -- yes I think this is a conda issue. in all likelihood you have conflicting package versions in that old environment. can you try this again in a clean (brand new) conda install? i.e.,

conda create -n test_diploshic python=3.9 --yes
conda activate test_diploshic
pip install diploshic

jdaron commented 2 years ago

I've been running the whole diploshic workflow in a virtual env and everything runs smoothly. Thanks a lot for making those updates.

andrewkern commented 2 years ago

awesome-- great to hear.

@molpopgen is this working on your systems too now?

molpopgen commented 2 years ago

Haven't had time to test. Will try to in a few days.

andrewkern commented 2 years ago

hey @molpopgen -- should we close this issue?

molpopgen commented 2 years ago

No -- setup.py doesn't address this issue. There are no version numbers or ranges pinned for tensorflow/keras.

Also, the version numbers for this package make it difficult/impossible to realistically use. Adding a 3 to 0..3333 when bumping is still the same semver, so a user has a hard time pinning to anything other than exact versions.
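For the first point, one option is to declare the constraints in setup.py itself. The sketch below only builds the list that would be passed as install_requires and flags entries with no constraint; the bounds are copied from the requirements list posted earlier in this thread and are assumptions, not what the package currently ships. The helper unpinned is hypothetical.

```python
# Hypothetical pins, taken from the requirements list posted earlier in this
# thread; the bounds actually verified for a given release may differ.
INSTALL_REQUIRES = [
    "scikit-allel",
    "scipy==1.4.1",
    "matplotlib",
    "keras==2.4.3",
    "tensorflow-cpu<=2.3",
    # Listed as "sklearn" in the thread; scikit-learn is the distribution name.
    "scikit-learn",
]


def unpinned(requirements):
    """Return the requirement strings that carry no version constraint."""
    operators = ("==", "<=", ">=", "~=", "!=", "<", ">")
    return [r for r in requirements if not any(op in r for op in operators)]


# In setup.py the list would be passed as setup(..., install_requires=INSTALL_REQUIRES).
print(unpinned(INSTALL_REQUIRES))
```

Entries reported as unpinned are the ones most likely to break a fresh install when upstream releases change.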