zenml-io / zenml

ZenML 🙏: The bridge between ML and Ops. https://zenml.io.
Apache License 2.0

[BUG]: Many existing production packages conflict with zenml dependencies, so zenml cannot be installed #520

Closed aimlnerd closed 2 years ago

aimlnerd commented 2 years ago

Contact Details [Optional]

No response

What happened?

This is not a bug. The requirements.txt of our production project's model microservice contains the packages below:

uvicorn==0.17.0
fastapi==0.73.0
fastapi-utils==0.2.1
starlette==0.17.1
anyio==3.5.0
scipy==1.7.3
python-dotenv==0.19.2
charset-normalizer==2.0.9
click==7.1.2
contextlib2==21.6.0
contextvars==2.4
cryptography==36.0.1
cymem==2.0.5
docx2txt==0.8
fleep==1.0.1
greenlet==1.1.2
h11==0.12.0
idna==2.8
img2pdf==0.4.3
immutables==0.16
importlib-metadata==4.8.3
iniconfig==1.1.1
itsdangerous==2.0.1
Jinja2==3.0.3
langdetect==1.0.9
loguru==0.5.3
lxml==4.7.1
MarkupSafe==2.0.1
murmurhash==1.0.6
numpy==1.22.1
opencv-python==4.5.3.56
packaging==21.3
pandas==1.4.0
pathy==0.6.1
pdf2image==1.16.0
Pillow==9.0.1
pdfminer==20191125
pdfminer.six==20211012
pikepdf==3.2.0
pluggy==1.0.0
poppler-utils==0.1.0
preshed==3.0.6
py==1.11.0
pyaml==21.10.1
pycparser==2.21
pycryptodome==3.12.0
pydantic==1.7.4
pyparsing==3.0.6
pytesseract==0.3.5
pytest==6.2.5
python-dateutil==2.8.2
python-Levenshtein-wheels==0.13.2
python-poppler==0.2.2
pytz==2021.3
serum==5.1.0
smart-open==5.2.1
sniffio==1.2.0
spacy==3.2.1
spacy-legacy==3.0.8
spacy-loggers==1.0.1
SQLAlchemy==1.4.28
srsly==2.4.2
toml==0.10.2
tqdm==4.62.3
azure-storage-blob==12.8.1
azure-core==1.21.1
python-multipart==0.0.5
asgiref==3.5.0
attrs==21.4.0
azure-core==1.21.1
blis==0.7.5
bpemb==0.3.3
catalogue==2.0.6
certifi==2021.10.8
cffi==1.15.0
chardet==4.0.0
fasteners==0.17.3
fasttext==0.9.2
gensim==4.1.2
isodate==0.6.1
langcodes==3.3.0
lz4==3.1.10
msrest==0.6.21
oauthlib==3.1.1
Poutyne==1.8
pybind11==2.9.0
pymagnitude-light==0.1.147
PyYAML==6.0
requests==2.27.1
requests-oauthlib==1.3.0
scipy==1.7.3
sentencepiece==0.1.96
six==1.16.0
spacy-loggers==1.0.1
thinc==8.0.13
typer==0.3.2
typing_extensions==4.0.1
urllib3==1.26.8
wasabi==0.9.0
xxhash==2.0.2
zipp==3.7.0
#Dependencies for comlib - START
html2text==2020.1.16
sarge==0.1.6
PyPDF2==1.26.0
tika==1.24
#Dependencies for comlib - END
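
The list above pins a few packages more than once (azure-core, scipy, and spacy-loggers each appear twice). The repeats agree here, but they are easy to miss when versions get edited later. A minimal sketch for spotting them, assuming every relevant line has the plain `name==version` form; the helper names are illustrative and not part of pip or zenml:

```python
import re
from collections import defaultdict


def normalize(name: str) -> str:
    """Normalize a project name as described in PEP 503 (case, '-', '_', '.')."""
    return re.sub(r"[-_.]+", "-", name).lower()


def find_repeated_pins(lines):
    """Return {normalized name: [versions]} for packages pinned more than once."""
    seen = defaultdict(list)
    for raw in lines:
        line = raw.split("#", 1)[0].strip()  # drop comments and blank lines
        if "==" not in line:
            continue
        name, version = line.split("==", 1)
        seen[normalize(name.strip())].append(version.strip())
    return {name: versions for name, versions in seen.items() if len(versions) > 1}


if __name__ == "__main__":
    # A few lines copied from the requirements list above.
    sample = [
        "scipy==1.7.3",
        "azure-core==1.21.1",
        "#Dependencies for comlib - START",
        "azure-core==1.21.1",
        "scipy==1.7.3",
        "numpy==1.22.1",
    ]
    for name, versions in find_repeated_pins(sample).items():
        print(name, versions)
```

To check the real file, pass `open("requirements.txt").read().splitlines()` instead of the sample list.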

I am unable to add zenml 0.7.1 to the above requirements.txt because many of the existing package versions conflict with zenml's dependencies. The main conflicts are with spacy, pydantic, and Jinja2, among others.

In a simple example notebook, zenml works well, but I am not able to add it to production because of these dependencies. When I install zenml==0.7.1 after running pip install -r requirements.txt, pip uninstalls several of the pinned packages and installs different versions.

I believe this is causing unexpected errors like the one below:


INFO:     Started server process [24352]
INFO:uvicorn.error:Started server process [24352]
INFO:     Waiting for application startup.
INFO:uvicorn.error:Waiting for application startup.
INFO:     Application startup complete.
INFO:uvicorn.error:Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8081/ (Press CTRL+C to quit)
INFO:uvicorn.error:Uvicorn running on http://0.0.0.0:8081/ (Press CTRL+C to quit)
Creating run for pipeline: `inference_pipeline_crf`
Cache disabled for pipeline `inference_pipeline_crf`
import sys; print('Python %s on %s' % (sys.version, sys.platform))
 inference_pipeline.run(run_name=RUN_NAME)
Python 3.8.13 (default, Mar 28 2022, 11:38:47) 
Type 'copyright', 'credits' or 'license' for more information
IPython 8.2.0 -- An enhanced Interactive Python. Type '?' for help.
PyDev console: using IPython 8.2.0
Python 3.8.13 (default, Mar 28 2022, 11:38:47) 
[GCC 7.5.0] on linux
Using stack `default` to run pipeline `inference_pipeline_crf`...
Traceback (most recent call last):
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3369, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-f4355664dc5d>", line 1, in <cell line: 1>
    inference_pipeline.run(run_name=RUN_NAME)
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/zenml/pipelines/base_pipeline.py", line 376, in run
    return stack.deploy_pipeline(
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/zenml/stack/stack.py", line 404, in deploy_pipeline
    return_value = self.orchestrator.run_pipeline(
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/zenml/orchestrators/local/local_orchestrator.py", line 78, in run_pipeline
    tfx_pipeline: TfxPipeline = create_tfx_pipeline(pipeline, stack=stack)
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/zenml/orchestrators/utils.py", line 45, in create_tfx_pipeline
    zenml_pipeline.connect(**zenml_pipeline.steps)
  File "/home/deepak/projects/mira/mira-models/source/services/si/pipelines/inference_pipeline_crf.py", line 20, in inference_pipeline_crf
    text = get_pred_input()
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/zenml/steps/base_step.py", line 634, in __call__
    self._component = component_class(
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/tfx/dsl/component/experimental/decorators.py", line 64, in __init__
    super().__init__(spec)
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/tfx/dsl/components/base/base_component.py", line 103, in __init__
    super().__init__(
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/tfx/dsl/components/base/base_node.py", line 62, in __init__
    dsl_context_registry.get().put_node(self)
  File "/home/deepak/anaconda3/envs/mira-models/lib/python3.8/site-packages/tfx/dsl/context_managers/dsl_context_registry.py", line 158, in get
    return _registry_holder.current
AttributeError: '_thread._local' object has no attribute 'current'

I really like the MLOps and ML pipeline features of zenml. It would be a disappointment to remove the zenml-related code and go back to the old way of coding without zenml pipelines just to put the model in production. Are there any solutions? Perhaps there is a better way to manage the dependencies.
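
One way to see exactly which pins get replaced is to save the output of `pip freeze` before and after installing zenml and compare the two snapshots. The sketch below does the comparison on plain text; the function names are illustrative, and the "after" versions in the usage example are made up for demonstration, not taken from an actual zenml 0.7.1 install:

```python
def parse_freeze(text):
    """Parse `pip freeze` style text into {lowercased name: version}."""
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" in line and not line.startswith("#"):
            name, version = line.split("==", 1)
            pins[name.lower().replace("_", "-")] = version
    return pins


def changed_pins(before, after):
    """Return {name: (old version, new version)} for packages whose version changed."""
    old, new = parse_freeze(before), parse_freeze(after)
    return {
        name: (old[name], new[name])
        for name in sorted(old)
        if name in new and old[name] != new[name]
    }


if __name__ == "__main__":
    before = "pydantic==1.7.4\nspacy==3.2.1\nJinja2==3.0.3\n"
    # Hypothetical "after" snapshot, for illustration only.
    after = "pydantic==1.9.0\nspacy==3.2.1\nJinja2==3.0.3\nzenml==0.7.1\n"
    for name, (old, new) in changed_pins(before, after).items():
        print(f"{name}: {old} -> {new}")
```

Running `pip check` after the install also reports installed packages whose declared requirements are no longer satisfied, which helps connect a changed pin to the package that depends on it.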

Reproduction steps

1. 2. 3. ...

ZenML Version

0.7.1

Python Version

3.8

OS Type

No response

Relevant log output

No response


AsiaCao commented 2 years ago

I am a first-time user of zenml and encountered a similar issue (perhaps more related to the M1 chipset in particular).

What I did: I created a brand-new virtual environment and ran pip install zenml, and pip failed with dependency conflicts.

Environment: Python 3.8, Apple M1 CPU

ERROR: Cannot install zenml==0.1.0, zenml==0.1.1, zenml==0.1.2, zenml==0.1.3, zenml==0.1.4, zenml==0.1.5, zenml==0.2.0, zenml==0.3.1, zenml==0.3.2, zenml==0.3.3, zenml==0.3.4, zenml==0.3.5, zenml==0.3.6, zenml==0.3.6.1, zenml==0.3.7 and zenml==0.3.8 because these package versions have conflicting dependencies.

The conflict is caused by:
    zenml 0.3.8 depends on tensorflow==2.4.1
    zenml 0.3.7 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.6.1 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.6 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.5 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.4 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.3 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.2 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.3.1 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.2.0 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.1.5 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.1.4 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.1.3 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.1.2 depends on tensorflow<2.4.0 and >=2.3.0
    zenml 0.1.1 depends on tensorflow==2.3.0
    zenml 0.1.0 depends on tensorflow==2.3.0

To fix this you could try to:
1. loosen the range of package versions you've specified
2. remove package versions to allow pip attempt to solve the dependency conflict

ERROR: ResolutionImpossible: for help visit https://pip.pypa.io/en/latest/topics/dependency-resolution/#dealing-with-dependency-conflicts

(zenml) ➜  zenml-playground pip list
Package    Version
---------- -------
pip        22.0.4
setuptools 62.1.0
wheel      0.37.1

htahir1 commented 2 years ago

@deepakiim Your issue is perhaps related to pydantic and spacy conflicting with some core dependencies of ZenML. One way to narrow this down is for you to tell us the minimum set of dependencies you need in your environment. After that, we can try to loosen some of our internal dependencies to help out.
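
A rough way to approximate that minimum set is to list the installed packages that no other installed package declares as a requirement; everything else was probably pulled in transitively. The sketch below uses the standard-library `importlib.metadata`; the helper names are illustrative, and it deliberately over-counts requirements by ignoring environment markers and extras:

```python
import re
from importlib import metadata


def normalize(name):
    """Normalize a project name as described in PEP 503."""
    return re.sub(r"[-_.]+", "-", name).lower()


def required_names(requires):
    """Extract normalized project names from requirement strings such as 'pkg>=1.0'."""
    names = set()
    for req in requires or []:
        match = re.match(r"\s*([A-Za-z0-9][A-Za-z0-9._-]*)", req)
        if match:
            names.add(normalize(match.group(1)))
    return names


def top_level(installed):
    """Given {name: requirement strings or None}, return names nothing else requires."""
    needed = set()
    for requires in installed.values():
        # Simplification: markers and extras are ignored, so optional
        # requirements count as needed too.
        needed |= required_names(requires)
    return sorted(n for n in (normalize(k) for k in installed) if n not in needed)


def installed_packages():
    """Collect {name: requirement strings} for the current environment."""
    result = {}
    for dist in metadata.distributions():
        name = dist.metadata.get("Name")
        if name:
            result[name] = dist.requires
    return result


if __name__ == "__main__":
    print("\n".join(top_level(installed_packages())))
```

The output is only a starting point: a package that is imported directly by the service but also happens to be required by another package will not appear, so the list still needs a manual pass.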

@AsiaCao Your issue is a bit strange. We are at version 0.7.2, yet your pip does not seem to find any version above 0.3.8, and I am wondering how that is possible. You said you are on an M1 chipset, which is definitely not supported by ZenML yet. (We are trying to solve this internally, but honestly it is hard, as even bigger packages like TensorFlow don't have support yet.)
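
When pip only offers old releases, one thing worth ruling out is that it is skipping newer ones because nothing in them matches the running interpreter or platform; that is a guess here, not something confirmed in this thread. A small standard-library sketch that prints what the interpreter reports about itself, which is useful to paste into a report like the one above (the function names are illustrative):

```python
import platform


def interpreter_summary():
    """Collect the interpreter and platform facts that affect package resolution."""
    return {
        "python": platform.python_version(),
        "implementation": platform.python_implementation(),
        "system": platform.system(),
        "machine": platform.machine(),
    }


def is_apple_silicon(summary):
    """True when the interpreter runs natively on macOS arm64 (for example, M1)."""
    return summary["system"] == "Darwin" and summary["machine"] == "arm64"


if __name__ == "__main__":
    info = interpreter_summary()
    for key, value in info.items():
        print(f"{key}: {value}")
    print("apple silicon:", is_apple_silicon(info))
```

Running pip with `-v` alongside this shows which candidate files it skipped and why.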