databricks / containers

Sample base images for Databricks Container Services
Apache License 2.0

Pillow/numpy dependency conflicts in updated images #195

Open kamtingtsoi opened 3 months ago

kamtingtsoi commented 3 months ago

I am using databricksruntime/standard:11.3-LTS and regularly build new images from this base image. The images are then used to start a Databricks cluster for data processing. Starting two days ago, the cluster refuses to execute anything and fails with the error below:

Traceback (most recent call last):
  File "/databricks/python_shell/scripts/db_ipykernel_launcher.py", line 92, in <module>
    app.shell.run_line_magic('matplotlib', 'inline')
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 2407, in run_line_magic
    result = fn(*args, **kwargs)
  File "/databricks/python/lib/python3.9/site-packages/decorator.py", line 232, in fun
    return caller(func, *(extras + args), **kw)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/magic.py", line 187, in <lambda>
    call = lambda f, *a, **k: f(*a, **k)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/magics/pylab.py", line 99, in matplotlib
    gui, backend = self.shell.enable_matplotlib(args.gui.lower() if isinstance(args.gui, str) else args.gui)
  File "/databricks/python/lib/python3.9/site-packages/IPython/core/interactiveshell.py", line 3600, in enable_matplotlib
    from matplotlib_inline.backend_inline import configure_inline_support
  File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/__init__.py", line 1, in <module>
    from . import backend_inline, config  # noqa
  File "/databricks/python/lib/python3.9/site-packages/matplotlib_inline/backend_inline.py", line 6, in <module>
    import matplotlib
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/__init__.py", line 107, in <module>
    from . import _api, cbook, docstring, rcsetup
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/rcsetup.py", line 24, in <module>
    from matplotlib import _api, animation, cbook
  File "/databricks/python/lib/python3.9/site-packages/matplotlib/animation.py", line 34, in <module>
    from PIL import Image
  File "/databricks/python/lib/python3.9/site-packages/PIL/Image.py", line 68, in <module>
    from ._typing import StrOrBytesPath, TypeGuard
  File "/databricks/python/lib/python3.9/site-packages/PIL/_typing.py", line 10, in <module>
    NumpyArray = npt.NDArray[Any]
AttributeError: module 'numpy.typing' has no attribute 'NDArray'

Upon inspection, I found that pillow was updated from 10.3.0 to 10.4.0, which now requires numpy>=1.21.0 (numpy.typing.NDArray does not exist in earlier releases), while numpy in the image is still kept at 1.20.3. I could install my own pinned numpy version to override the defaults, but anyone using the vanilla image on a Databricks cluster will end up with an unusable cluster.
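For anyone who wants to confirm whether their own image is affected, here is a minimal diagnostic sketch. It relies on the fact that numpy.typing.NDArray was added in NumPy 1.21; the helper names (release_tuple, numpy_has_ndarray_alias, installed_version) are made up for illustration and are not part of any Databricks tooling.

```python
import re
from importlib import metadata
from typing import Optional, Tuple


def release_tuple(version: str) -> Tuple[int, int]:
    """Return (major, minor) parsed from a version string such as '1.20.3'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def numpy_has_ndarray_alias(numpy_version: str) -> bool:
    """numpy.typing.NDArray was introduced in NumPy 1.21."""
    return release_tuple(numpy_version) >= (1, 21)


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Print the versions relevant to this issue as found in the current environment.
    for name in ("numpy", "pillow", "matplotlib"):
        print(name, installed_version(name))

    np_version = installed_version("numpy")
    if np_version is not None and not numpy_has_ndarray_alias(np_version):
        print("numpy predates numpy.typing.NDArray; the PIL import in the traceback is expected to fail")
```

Running this with the cluster's Python interpreter (for the image above, the one under /databricks/python) should show numpy 1.20.3 next to pillow 10.4.0 if the environment matches the one described here.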

P.S. The SHA of the mentioned image is 232630f809512f5831540e072fa2a5ca3096139f11bdabe6812fbb3eeb9d78bf

xinzhao-db commented 1 week ago

@joveyuan-db

joveyuan-db commented 5 days ago

Thanks for the report. I was able to reproduce the issue, and you're correct that it seems to be caused by pillow==10.4.0. Upon further inspection, pillow is brought in by matplotlib==3.4.3 via https://github.com/matplotlib/matplotlib/blob/919145fe9849c999aa491457c6de6faede5959c3/setup.py#L309, and the new version likely got pulled in around 7/21 when we rebuilt and published the 11.3-LTS image. Let me discuss with the team how we would like to approach this.
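Until a decision is made, one possible stopgap for derived images is to pin the transitive dependency so that a rebuild cannot pick up a newer release. The sketch below renders a pip constraints file; it is an illustration, not the fix the team has chosen. The render_constraints helper is hypothetical, and the pin of pillow to 10.3.0 is an assumption based on the report that 10.3.0 was the version in use before the breakage.

```python
from typing import Dict


def render_constraints(pins: Dict[str, str]) -> str:
    """Render exact version pins in pip constraints-file syntax, one per line."""
    return "".join(f"{name}=={version}\n" for name, version in sorted(pins.items()))


# Assumption: 10.3.0 is the pillow version the report names as preceding 10.4.0.
PINS = {"pillow": "10.3.0"}

if __name__ == "__main__":
    # Redirect this output to constraints.txt and pass it to pip with -c.
    print(render_constraints(PINS), end="")
```

The resulting file can be passed to pip as `pip install -c constraints.txt <packages>`. A constraints file only limits the versions pip may select and does not install anything by itself, so it leaves the rest of the image's package set untouched.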