Building-ML-Pipelines / building-machine-learning-pipelines

Code repository for the O'Reilly publication "Building Machine Learning Pipelines" by Hannes Hapke & Catherine Nelson
MIT License

Unable to install requirements #45

Closed DSchmidtDev closed 2 years ago

DSchmidtDev commented 3 years ago

I am currently facing issues installing the given requirements of the project.

I am on Mac OS X 11.1 and used python 3.6.12, 3.7.9 and 3.8.7.

pip tries to solve the dependency tree but isn't able to fulfill all requirements and takes multiple hours trying all combinations. I also tried pinning tensorflow==2.2.1 and leaving all other packages unpinned, but pip still cannot resolve the version dependencies.

There are several different error messages and I did not want to paste them all here. Maybe you can point me to one working Python version; I can then try again and paste the error messages.

Thank you!

catherinenelson1 commented 3 years ago

Hi there, I was able to successfully install the requirements yesterday in a fresh virtual environment based on Python 3.7.8.

DSchmidtDev commented 3 years ago

I just installed Python 3.7.8 with pyenv and pip 21.0.1, and tried again. After a long time installing build dependencies, I get the following error:

ERROR: Could not find a version that satisfies the requirement pandas==0.22.0
ERROR: No matching distribution found for pandas==0.22.0

Have you upgraded some versions locally? I thought pandas 0.22.0 is not supported in Python > 3.6

DSchmidtDev commented 3 years ago

Seems to be a Mac OS X 11.x Big Sur problem only.

I am trying to get it running with the latest Python version, but that's not directly possible because of the old pinned requirement versions.

ssbaghlaf2 commented 3 years ago

I'm running on Mojave 10.14.6 with virtualenv using Python 3.8 (same issue with 3.7 and 3.9). I tried updating to pandas 1.0.3 and still had no luck.

hanneshapke commented 3 years ago

Hi @ssbaghlaf2 , Can you please confirm that you experience the issues on a clean, new virtualenv?

Thank you, Hannes

ssbaghlaf2 commented 3 years ago

Yes. I remove the virtualenv and try again with a clean one each time.

ssbaghlaf2 commented 3 years ago

So I found two solutions:

  1. Changing pandas to a newer version (I installed the latest, 1.2.2). pip reports a few ERRORs saying that some versions are incompatible, but it installs everything. The logs make it seem like a numpy issue. My guess is that pandas 0.22.0 depends on an old version of numpy, and that causes the error for some reason. I tried installing an old version of numpy with pip install numpy==1.9.0 and got the same kind of error log.

  2. Installing the latest versions of the packages in requirements.txt. The installation then goes smoothly, but I have yet to see how that might conflict with the material in the repo.
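
For anyone who wants to try the second approach without editing the file by hand, here is a minimal sketch of stripping the version pins so that pip picks the latest releases. The function name `strip_pins` and its `keep` parameter are hypothetical helpers, not part of the repo, and the sample lines only reuse versions mentioned in this thread rather than the actual contents of requirements.txt. Note that it also drops environment markers and inline comments, so check the result before installing.

```python
import re

# Matches the leading package name, with optional extras such as "[gcp]".
_NAME = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*(?:\[[^\]]*\])?)")


def strip_pins(lines, keep=()):
    """Return requirement lines with version specifiers removed.

    Packages listed in `keep` stay pinned. Blank lines, comments, and
    pip options (lines starting with '-') pass through unchanged.
    """
    keep_names = {name.lower() for name in keep}
    result = []
    for line in lines:
        stripped = line.strip()
        if not stripped or stripped.startswith(("#", "-")):
            result.append(stripped)
            continue
        match = _NAME.match(stripped)
        if match is None:
            # Unrecognised line (for example a URL): leave it alone.
            result.append(stripped)
            continue
        name = match.group(1)
        base = name.split("[")[0].lower()
        result.append(stripped if base in keep_names else name)
    return result


if __name__ == "__main__":
    sample = ["pandas==0.22.0", "tensorflow==2.2.1", "# a comment"]
    for line in strip_pins(sample, keep=["tensorflow"]):
        print(line)
```

Writing the returned lines to a separate file (rather than overwriting requirements.txt) and installing from that in a fresh virtualenv keeps the original pins available for comparison.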

DSchmidtDev commented 3 years ago

@ssbaghlaf2 with Mac OS X 11.2.1, Python 3.8.7 and pip 21.0.1 I also got the packages installed in their latest versions. Let's see if we get everything running with them. Maybe we can create a PR afterwards ;)

gerold-csendes-epam commented 3 years ago

I experienced the same issues as @DSchmidtDev, but on Ubuntu 20.04.2 LTS running on WSL. I tried both Python 3.6.0 and 3.8.0 using pyenv but didn't succeed. I will try 3.7.8 as described by @drcat101.

gerold-csendes-epam commented 3 years ago

I did not succeed. Here is the error I got: ERROR: No matching distribution found for pandas==0.22.0

Are there any workarounds?

@ssbaghlaf2 could you run the repo code with the latest versions? The only thing I am afraid of is that pandas had a major version change, so this might cause some issues.

ssbaghlaf2 commented 3 years ago

@gerold-csendes-epam Everything is working fine so far with latest versions of all packages.

DSchmidtDev commented 3 years ago

@gerold-csendes-epam I guess that pandas version is too old for your Python/pip version. Try a newer one. I am also using the latest versions of all packages and have had no issues so far, but I have not had enough time yet to go deeper.

Anylee2142 commented 3 years ago

Hi, I was facing the same problem.

Setting pip to 20.2.0 and Python to 3.6.9 worked for me. (Ubuntu 18.04 with vanilla Python + virtualenv)

It's a bit late for you guys, but this may help those still working on it!
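
A quick way to confirm that an environment matches that combination before starting a long install is sketched below. The names `KNOWN_GOOD`, `parse_version` and `mismatches` are made up for this example, and the expected versions are simply the ones reported in the comment above, not an official support matrix.

```python
import re
import sys

# Combination reported as working in the comment above (an assumption,
# not an official requirement of the repo).
KNOWN_GOOD = {"python": "3.6.9", "pip": "20.2.0"}


def parse_version(text):
    """Turn '20.2.0' into (20, 2, 0); stop at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)


def mismatches(found, expected=KNOWN_GOOD):
    """Return {tool: (found, expected)} for every tool whose version differs."""
    return {
        tool: (found.get(tool), want)
        for tool, want in expected.items()
        if parse_version(found.get(tool, "")) != parse_version(want)
    }


if __name__ == "__main__":
    try:
        import pip
        pip_version = pip.__version__
    except ImportError:
        pip_version = ""  # pip is not importable in this interpreter
    found = {
        "python": ".".join(str(n) for n in sys.version_info[:3]),
        "pip": pip_version,
    }
    for tool, (have, want) in mismatches(found).items():
        print("{}: found {!r}, expected {}".format(tool, have, want))
```

Running it inside the freshly created virtualenv prints one line per tool that differs and nothing when both match.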

hanneshapke commented 2 years ago

Hi @DSchmidtDev & @Anylee2142,

Thank you for reporting this issue. Check out the latest updates to the example code: https://github.com/Building-ML-Pipelines/building-machine-learning-pipelines/releases/tag/examples_based_on_tfx_1.4

The dependency issue should be fixed with the latest update. Please reopen if you run into trouble. Thank you again for reporting the issue.