IVPLatNU / DeepCovidXR

A deep learning artificial intelligence model to detect COVID-19 on chest X-rays
MIT License

Installation dependency issues #1

Closed ghost closed 3 years ago

ghost commented 4 years ago

Hi,

I tried to install / run the code but had quite a few issues, detailed below.

Summary: Could you please provide a clean working version suitable for a Python 3.x environment? (Attached is a modified requirements.txt which achieves this)

So here are the issues and the steps taken to resolve them:

$ pip install -r requirements.txt

ERROR: Could not find a version that satisfies the requirement tf_nightly_gpu_2.0_preview==2.0.0.dev20190814 (from -r requirements.txt (line 12)) (from versions: none)
ERROR: No matching distribution found for tf_nightly_gpu_2.0_preview==2.0.0.dev20190814 (from -r requirements.txt (line 12))

=> I installed tensorflow-gpu 2.0.0, which is the closest official release (date-wise) after this nightly dev build
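In requirements.txt terms, the substitution amounts to the following (line 12 is the line number pip reported above):

```
# before (line 12 of requirements.txt)
tf_nightly_gpu_2.0_preview==2.0.0.dev20190814
# after
tensorflow-gpu==2.0.0
```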

Again:

$ pip install -r requirements.txt

Next Error:

    ERROR: Command errored out with exit status 1:
     command: [...]/bin/python -c 'import sys, setuptools, tokenize; sys.argv[0] = '"'"'/tmp/pip-install-ghekx1r3/skimage/setup.py'"'"'; __file__='"'"'/tmp/pip-install-ghekx1r3/skimage/setup.py'"'"';f=getattr(tokenize, '"'"'open'"'"', open)(__file__);code=f.read().replace('"'"'\r\n'"'"', '"'"'\n'"'"');f.close();exec(compile(code, __file__, '"'"'exec'"'"'))' egg_info --egg-base /tmp/pip-pip-egg-info-zyydlwgc
         cwd: /tmp/pip-install-ghekx1r3/skimage/
    Complete output (3 lines):
    *** Please install the `scikit-image` package (instead of `skimage`) ***

=> scikit-image was already in requirements.txt, so I just removed skimage -> the skimage name on PyPI is only a placeholder that tells you to install scikit-image instead

Again:

$ pip install -r requirements.txt

Next Error:

ERROR: Could not find a version that satisfies the requirement tensorflow-gpu==2.0.0 (from versions: 2.2.0rc1, 2.2.0rc2, 2.2.0rc3, 2.2.0rc4, 2.2.0, 2.2.1, 2.3.0rc0, 2.3.0rc1, 2.3.0rc2, 2.3.0, 2.3.1)
ERROR: No matching distribution found for tensorflow-gpu==2.0.0

By looking at the actual files available on PyPI for tensorflow-gpu, I eventually realized this was because I was using Python 3.8: the wheels available for tensorflow-gpu 2.0.0 only go up to and including Python 3.7.
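A minimal sketch of a guard for this (a hypothetical helper, not part of the repo) that fails fast instead of producing the confusing pip error:

```python
import sys

def supports_tf_gpu_200(version_info=sys.version_info):
    # tensorflow-gpu 2.0.0 only ships wheels for CPython up to 3.7,
    # so anything newer hits "No matching distribution found" at install time.
    return version_info[:2] <= (3, 7)

if not supports_tf_gpu_200():
    print("Use Python 3.7 or earlier for tensorflow-gpu==2.0.0")
```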

So I installed Python 3.7 instead and tried again. Package deps install success! I then tried to run crop_img.py on a couple of prepared PNG images. Next error:

$ python3 crop_img.py -f ../DIAG_Repos/cxr-deep-covid-xr/data -U ./trained_unet_model.hdf5 -o /tmp
Using TensorFlow backend.
processing data in ../data
segmenting (0, 4) files of folder ../data
Running using 1 process
loading 132269_st008_se000_1.2.528.1.1008.706.1094735697.1.1.1.png
loading 13219_st000_se001_1.3.12.2.1107.5.4.4.1163.30000016072006282132800000229.png
### Dataset loaded
X shape =(4, 256, 256, 1)    raw_resized shape = (4, 256, 256)
    X:-0.8-2.3

    X.mean = 1.1796119636642288e-16, X.std = 1.0
X shape =  (4, 256, 256, 1)
Process Process-1:
Traceback (most recent call last):
  File "[...]/lib/python3.7/multiprocessing/process.py", line 297, in _bootstrap
    self.run()
  File "[...]/lib/python3.7/multiprocessing/process.py", line 99, in run
    self._target(*self._args, **self._kwargs)
  File "crop_img.py", line 509, in lungseg_one_process
    im_shape = im_shape, cut_thresh = cut_thresh, out_pad_size = out_pad_size, debug_folder = debug_folder, debugging = debug)
  File "crop_img.py", line 455, in lungseg_fromdata
    UNet = load_model(UNet)
  File "[...]/lib/python3.7/site-packages/keras/engine/saving.py", line 492, in load_wrapper
    return load_function(*args, **kwargs)
  File "[...]/lib/python3.7/site-packages/keras/engine/saving.py", line 584, in load_model
    model = _deserialize_model(h5dict, custom_objects, compile)
  File "[...]/lib/python3.7/site-packages/keras/engine/saving.py", line 273, in _deserialize_model
    model_config = json.loads(model_config.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'
1 processes takes 1.240255355834961 seconds

For Python 3.x this behavior is correct: str.decode() exists only in Python 2.x and was removed in Python 3. So the installed keras code (probably from the github install?) still assumes Python 2 semantics here; the call sits deep inside keras and is used when loading the stored dictionary of model weights.
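A defensive pattern for this, sketched as a hypothetical helper (not part of keras or this repo), would decode only when the value is actually bytes:

```python
def ensure_str(value):
    # Older h5py stacks hand back bytes for the stored model config,
    # newer ones hand back str; calling .decode() unconditionally on a
    # str raises the AttributeError shown in the traceback above.
    if isinstance(value, bytes):
        return value.decode("utf-8")
    return value
```

This mirrors the manual workaround of stripping the .decode() calls by hand, but without breaking the case where the value really is bytes.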

I thus tried a re-install of everything, this time with Python 2.7.18. First warning:

DEPRECATION: Python 2.7 will reach the end of its life on January 1st, 2020. Please upgrade your Python as Python 2.7 won't be maintained after that date. A future version of pip will drop support for Python 2.7. More details about Python 2 support in pip, can be found at https://pip.pypa.io/en/latest/development/release-process/#python-2-support

So I guess it might be a good idea to update to 3.x anyway?

Then I tried

$ pip install -r requirements.txt

which gave the following error:

ERROR: Could not find a version that satisfies the requirement pandas==0.25.0 (from -r requirements.txt (line 1)) (from versions: 0.1, 0.2b0, 0.2b1, 0.2, 0.3.0b0, 0.3.0b2, 0.3.0, 0.4.0, 0.4.1, 0.4.2, 0.4.3, 0.5.0, 0.6.0, 0.6.1, 0.7.0rc1, 0.7.0, 0.7.1, 0.7.2, 0.7.3, 0.8.0rc1, 0.8.0rc2, 0.8.0, 0.8.1, 0.9.0, 0.9.1, 0.10.0, 0.10.1, 0.11.0, 0.12.0, 0.13.0, 0.13.1, 0.14.0, 0.14.1, 0.15.0, 0.15.1, 0.15.2, 0.16.0, 0.16.1, 0.16.2, 0.17.0, 0.17.1, 0.18.0, 0.18.1, 0.19.0rc1, 0.19.0, 0.19.1, 0.19.2, 0.20.0rc1, 0.20.0, 0.20.1, 0.20.2, 0.20.3, 0.21.0rc1, 0.21.0, 0.21.1, 0.22.0, 0.23.0rc2, 0.23.0, 0.23.1, 0.23.2, 0.23.3, 0.23.4, 0.24.0rc1, 0.24.0, 0.24.1, 0.24.2)
ERROR: No matching distribution found for pandas==0.25.0 (from -r requirements.txt (line 1))

I went back to Python 3.7 and re-installed all packages as before. Then I simply removed the calls to str.decode() in keras' saving.py by hand to see if I could get the program running, which it did - at least until the next error:

  File "crop_img.py", line 433, in single_img_crop
    io.imsave(str(out_path), lung_img )
  File "[...]/lib/python3.7/site-packages/skimage/io/_io.py", line 144, in imsave
    return call_plugin('imsave', fname, arr, plugin=plugin, **plugin_args)
  File "[...]/lib/python3.7/site-packages/skimage/io/manage_plugins.py", line 210, in call_plugin
    return func(*args, **kwargs)
  File "[...]/lib/python3.7/site-packages/imageio/core/functions.py", line 303, in imwrite
    writer = get_writer(uri, format, "i", **kwargs)
  File "[...]/lib/python3.7/site-packages/imageio/core/functions.py", line 227, in get_writer
    "Could not find a format to write the specified file in %s mode" % modename
ValueError: Could not find a format to write the specified file in single-image mode

The suffix ".png" was being used, so instead of letting io.imsave() work out the format from the filename, I explicitly requested PNG format:

  File "crop_img.py", line 433, in single_img_crop
    io.imsave(str(out_path), lung_img)
=>
    io.imsave(str(out_path), lung_img, "PNG")

which gave a little more detailed information:

    raise ValueError("Plugin %s not found." % plugin)
ValueError: Plugin PNG not found.

It seems the PNG plugin is simply not available with the currently installed packages (the imageio/pillow versions pulled in alongside the 2.x keras).

So I modified the code to save bitmap (".bmp") instead of PNG, which got me a little further until the next error:

  File "crop_img.py", line 467, in lungseg_fromdata
    raw_img = raw_images[i],
IndexError: list index out of range

At which point I concluded this is just a programming error - I will raise it in a separate issue.
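The PNG-to-BMP workaround can be sketched as a small helper (hypothetical, not in the repo) that tries formats in order and falls back when a writer plugin is missing:

```python
from pathlib import Path

def save_with_fallback(out_path, img, save_fn, formats=("png", "bmp")):
    # Try each suffix in turn with the given writer (e.g. skimage.io.imsave);
    # a missing plugin surfaces as ValueError, so fall through to the next format.
    for fmt in formats:
        target = Path(out_path).with_suffix("." + fmt)
        try:
            save_fn(str(target), img)
            return target
        except ValueError:
            continue
    raise ValueError("no usable image writer found for " + str(out_path))
```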

One last try: I cleaned the environment and kept to the package list but allowed a full clean install of all packages suitable for and consistent with a Python 3.7 environment. This installed fine and it ran through with none of the dependency errors above. It still errored on IndexError: list index out of range, which I assume is a straight programming error rather than any dependency package error.

Attached are both a modified requirements.txt and a dump of pip_freeze.txt to show the full install environment. I attached them to this issue rather than submitting a pull request, as I wasn't sure whether (1) I am able to push a branch to your repo, and (2) you would actually want me to push a branch (non-master, of course).

Regards, Gaby

rmw362 commented 3 years ago

Hi Gaby @GabyRumc ,

I am happy to provide some limited support now, but we are awaiting final publication of our manuscript before we provide ongoing full support. Thank you for pointing out some dependency issues. You are correct that there are redundant dependencies in the requirements.txt file.

As you mentioned, tensorflow-gpu==2.0.0 can be used rather than the nightly preview version. Tensorflow-gpu 2.0.0 was used due to CUDA limitations on our local server. Accordingly, as you discovered, Python version 3.7 must be used for compatibility, and we have now indicated this clearly in the README file. We also removed the redundant and incorrect "skimage" dependency, as this is already covered by scikit-image in the requirements file.

Note the only major difference between the example requirements.txt file you provided and our updated file is the version for keras-vis. Your file contains vis>=0.0.4; however, in order for gradCAM visualization to work correctly you must use the latest keras-vis version by installing directly from github (git+https://github.com/raghakot/keras-vis).

No adjustments to str.decode in the source code are necessary and the preferred file format when using this application is PNG format. Regarding the "IndexError: list index out of range" error in crop_img.py, this has to do with normalization of the path name, rather than a programming error (this is addressed in more detail in the corresponding issue you raised separately).

Thanks very much and let us know if this does not resolve your issue.

ghost commented 3 years ago

Thanks!

No adjustments to str.decode in the source code are necessary

was in the keras package - but no need if the packages are now updated

Note the only major difference between the example requirements.txt file you provided and our updated file is the version for keras vis

A file diff says differently ;-) but ok...

version for keras vis. Your file contains vis>=0.0.4

Yes, that is the vis package - latest release 0.0.5, not keras-vis. In any case, PyPI only has v0.4.1 for keras-vis, so I will keep pulling the unreleased version from github as you specify, even if it isn't a formally released version.

I found a workable set of packages based around tensorflow-gpu 2.2.0, so this version should also be fine given what you said about tensorflow-gpu 2.0.0 - or not?

ghost commented 3 years ago

No adjustments to str.decode in the source code are necessary

Not in your source code, no. Using your updated requirements.txt with tensorflow(-gpu)=2.0.0 I get the following for both a Python 3.6 and a Python 3.7 environment:

Using TensorFlow backend.
Loading models...
python-BaseException
Traceback (most recent call last):
  File "[...]site-packages/tensorflow_core/python/keras/saving/hdf5_format.py", line 166, in load_model_from_hdf5
    model_config = json.loads(model_config.decode('utf-8'))
AttributeError: 'str' object has no attribute 'decode'

The decode() method on class str was removed in Python 3 -> suggests build / package incompatibilities are still present.

A bunch of other incompatibilities too; I won't list them all here.

Given there is no specific reason to stay with tensorflow 2.0.0, I'll stay with tensorflow(-gpu)==2.2.0 in requirements.txt, which also has the nice side-effect of running on my GPU, which tf 2.0.0 cannot (my GPU drivers aren't compatible that far back for tf 2.0.x, but tf 2.2.x runs fine with NVIDIA drivers v450).

ghost commented 3 years ago

Staying with tensorflow 2.2.0 gives the following other incompatibilities with your requirements.txt:

tensorflow-gpu 2.2.0 requires h5py<2.11.0,>=2.10.0, but you'll have h5py 3.1.0 which is incompatible.
tensorflow-gpu 2.2.0 requires scipy==1.4.1; python_version >= "3", but you'll have scipy 1.5.4 which is incompatible.
tensorflow 2.2.0 requires h5py<2.11.0,>=2.10.0, but you'll have h5py 3.1.0 which is incompatible.
tensorflow 2.2.0 requires scipy==1.4.1; python_version >= "3", but you'll have scipy 1.5.4 which is incompatible.

Please note

  1. Tensorflow 2.2.0 expects earlier versions of these packages than the ones installed via your requirements.txt. For tf 2.0.0 this must also be the case - and possibly even earlier versions of some packages are required for tf 2.0.0.
  2. Neither h5py nor scipy were in your updated requirements.txt, so I re-added them (they were in my version of requirements.txt) with the version specifiers as listed above, and everything now installs correctly.
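The re-added pins, with the version specifiers taken directly from the pip error output above, would look like this in requirements.txt:

```
h5py>=2.10.0,<2.11.0
scipy==1.4.1
```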
rmw362 commented 3 years ago

OK, I am going to try a fresh build of the environment from scratch and report back to you. As you know, dependencies can be tricky, especially when working on different workstations with different OS / NVIDIA drivers / CUDA versions etc. Some of these issues seem to be OS/workstation-independent, however (i.e. dependencies missing entirely from the requirements file). I will get back to you in the next 48 hours.

ghost commented 3 years ago

FYI my current requirements.txt which installs (pip install -r) correctly with tensorflow(-gpu)-2.2.0 : requirements.txt

ghost commented 3 years ago

OK, I am going to try a fresh build of the environment from scratch and report back to you. As you know, dependencies can be tricky, especially when working on different workstations with different OS / NVIDIA drivers / CUDA versions etc. Some of these issues seem to be OS/workstation-independent, however (i.e. dependencies missing entirely from the requirements file). I will get back to you in the next 48 hours.

I closed this for now as I have posted my requirements.txt, which works for me with tf 2.2.0. Feel free to re-open if you find other issues. Yep, packaging can be tricky... some of the issues I had were definitely tf/keras package compatibility issues - there seems to be a lot on the internet about such packaging problems.

rmw362 commented 3 years ago

Hi @GabyRumc ,

I reopened to update you on my testing of pip installs of the requirements file on different operating systems with different hardware specifications. Unfortunately, there are occasional issues depending on your setup/environment which are impossible to account for with a requirements.txt file alone.

To make the process as easy as possible, we have therefore included another option: pulling a docker image from a DockerHub repository that should work. There are two versions: a "large" version, which is an exact copy of our production environment for our platform, and a "small" version, which I put together using only the required dependencies and which has been tested somewhat, though not as extensively as the large version. You can now find details on how to pull these docker images from DockerHub in the readme.

Please let me know if you are still having environment issues after this.

ghost commented 3 years ago

there are occasional issues depending on your setup/environment which are impossible to account for with a requirements.txt file alone

Agreed - requirements.txt is only half the story.

we have therefore included another option of pulling a docker image from a DockerHub repository that should work

Thanks. docker pull rwehbe/deepcovidxr:large - picked it up, will let you know how I get on.

ghost commented 3 years ago

Will close this one as I'm using the docker image now - but the image needed some modifications (e.g. copying the Python source from this repo into it) to get it working - will open a separate issue/PR for this