Kaggle / docker-python

Kaggle Python docker image
Apache License 2.0

CUML is broken on latest Kaggle environment (e.g. May 2024) #1392

Closed datancoffee closed 5 months ago

datancoffee commented 6 months ago

Repro:

Add import statement for cuml to a notebook:

import cuml

Error (short summary):

ImportError                               Traceback (most recent call last)
Cell In[1], line 1
----> 1 import cuml

ImportError: cannot import name 'is_datetime64tz_dtype' from 'pandas.core.tools.datetimes' (/opt/conda/lib/python3.10/site-packages/pandas/core/tools/datetimes.py)
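For anyone triaging the same error, a minimal sketch of a check that does not depend on cuML itself: it only asks whether the module named in the traceback still exposes the missing name. The helper `has_symbol` is hypothetical (not part of pandas or RAPIDS); the module path and symbol are copied from the error message above.

```python
import importlib


def has_symbol(module_name: str, attr: str) -> bool:
    """Return True if `module_name` imports cleanly and exposes `attr`."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        # Module is missing entirely (or fails to import).
        return False
    return hasattr(module, attr)


# Module path and symbol name are taken verbatim from the traceback.
found = has_symbol("pandas.core.tools.datetimes", "is_datetime64tz_dtype")
print("symbol present:", found)
```

If this prints `False` in an environment where pandas is installed, the installed pandas no longer provides the name at that path, which is consistent with the `ImportError` raised during `import cuml`.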

Repro notebook:

https://www.kaggle.com/code/datancoffee/repro-cuml-does-not-import-on-latest-kaggle/

Other similar bugs:

https://www.kaggle.com/discussions/product-feedback/481085

calderjo commented 6 months ago

thanks for reporting this! was able to reproduce on a new notebook, will report back once i have a lead on what's causing the issue.

fscm commented 6 months ago

Doing !pip check on a notebook with the latest environment I get:

keras-cv 0.8.2 requires keras-core, which is not installed.
keras-nlp 0.9.3 requires keras-core, which is not installed.
tensorflow-decision-forests 1.8.1 requires wurlitzer, which is not installed.
apache-beam 2.46.0 has requirement dill<0.3.2,>=0.3.1.1, but you have dill 0.3.8.
apache-beam 2.46.0 has requirement numpy<1.25.0,>=1.14.3, but you have numpy 1.26.4.
apache-beam 2.46.0 has requirement pyarrow<10.0.0,>=3.0.0, but you have pyarrow 15.0.2.
beatrix-jupyterlab 2023.128.151533 has requirement jupyterlab~=3.6.0, but you have jupyterlab 4.1.6.
boto3 1.26.100 has requirement botocore<1.30.0,>=1.29.100, but you have botocore 1.34.69.
cloud-tpu-client 0.10 has requirement google-api-python-client==1.8.0, but you have google-api-python-client 2.126.0.
conda 24.3.0 has requirement packaging>=23.0, but you have packaging 21.3.
google-cloud-aiplatform 0.6.0a1 has requirement google-api-core[grpc]<2.0.0dev,>=1.22.2, but you have google-api-core 2.11.1.
google-cloud-automl 1.0.1 has requirement google-api-core[grpc]<2.0.0dev,>=1.14.0, but you have google-api-core 2.11.1.
jupyterlab 4.1.6 has requirement jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1.
jupyterlab-lsp 5.1.0 has requirement jupyter-lsp>=2.0.0, but you have jupyter-lsp 1.5.1.
kfp 2.5.0 has requirement google-cloud-storage<3,>=2.2.1, but you have google-cloud-storage 1.44.0.
libpysal 4.9.2 has requirement packaging>=22, but you have packaging 21.3.
libpysal 4.9.2 has requirement shapely>=2.0.1, but you have shapely 1.8.5.post1.
momepy 0.7.0 has requirement shapely>=2, but you have shapely 1.8.5.post1.
osmnx 1.9.2 has requirement shapely>=2.0, but you have shapely 1.8.5.post1.
pytoolconfig 1.3.1 has requirement packaging>=23.2, but you have packaging 21.3.
spopt 0.6.0 has requirement shapely>=2.0.1, but you have shapely 1.8.5.post1.
tensorflow 2.15.0 has requirement keras<2.16,>=2.15.0, but you have keras 3.2.1.
tensorstore 0.1.56 has requirement ml-dtypes>=0.3.1, but you have ml-dtypes 0.2.0.
textblob 0.18.0.post0 has requirement nltk>=3.8, but you have nltk 3.2.4.
virtualenv 20.21.0 has requirement platformdirs<4,>=2.4, but you have platformdirs 4.2.0.
xarray 2024.3.0 has requirement packaging>=22, but you have packaging 21.3.
ydata-profiling 4.6.4 has requirement numpy<1.26,>=1.16.0, but you have numpy 1.26.4.

Maybe there are more/other packages broken (and causing the cuML error)?
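A report this long is easier to reason about once it is grouped. The sketch below parses the two line shapes that appear in the output above ("requires X, which is not installed" and "has requirement X, but you have Y") and counts which installed packages block the most dependents. The function names and the regular expressions are illustrative assumptions written against the lines shown here, not a documented pip output format.

```python
import re
from collections import Counter

# Patterns written against the two line shapes seen in the report above.
MISSING = re.compile(r"^(\S+) (\S+) requires (\S+), which is not installed\.$")
CONFLICT = re.compile(
    r"^(\S+) (\S+) has requirement (.+), but you have (\S+) (\S+)\.$"
)


def parse_pip_check(text):
    """Split `pip check` output into missing-dependency and conflict records."""
    missing, conflicts = [], []
    for line in text.splitlines():
        line = line.strip()
        m = MISSING.match(line)
        if m:
            missing.append({"package": m.group(1), "needs": m.group(3)})
            continue
        m = CONFLICT.match(line)
        if m:
            conflicts.append({
                "package": m.group(1),
                "requirement": m.group(3),
                "installed": m.group(4),
                "installed_version": m.group(5),
            })
    return missing, conflicts


def most_blocking(conflicts):
    """Count how many dependents each installed package is in conflict with."""
    return Counter(c["installed"] for c in conflicts).most_common()


# A few lines copied from the report above.
SAMPLE = """
keras-cv 0.8.2 requires keras-core, which is not installed.
conda 24.3.0 has requirement packaging>=23.0, but you have packaging 21.3.
libpysal 4.9.2 has requirement packaging>=22, but you have packaging 21.3.
libpysal 4.9.2 has requirement shapely>=2.0.1, but you have shapely 1.8.5.post1.
"""

missing, conflicts = parse_pip_check(SAMPLE)
print(most_blocking(conflicts))
```

In a notebook the input text could come from something like `subprocess.run([sys.executable, "-m", "pip", "check"], capture_output=True, text=True).stdout`. On the full report above, `packaging 21.3` and `shapely 1.8.5.post1` each show up as the installed side of several conflicts.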

calderjo commented 6 months ago

> Maybe there are more/other packages broken (and causing the cuML error)?

most definitely there are more packages that are broken, in part due to our current tf2.15 and keras 3 setup, as more packages want tf2.16. We're still waiting for a newer base image for that upgrade.

i did notice that downgrading pandas or upgrading rapids may help; will need to investigate whether these break other important packages: https://www.kaggle.com/discussions/product-feedback/503237
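Before trying either a pandas downgrade or a RAPIDS upgrade, it helps to record what is actually installed. A minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names passed to it are assumptions (pip wheels of RAPIDS may be published under suffixed names such as `cudf-cu12`, so adjust the list to match the environment).

```python
from importlib.metadata import PackageNotFoundError, version


def installed_versions(names):
    """Map each distribution name to its installed version, or None if absent."""
    found = {}
    for name in names:
        try:
            found[name] = version(name)
        except PackageNotFoundError:
            found[name] = None
    return found


# Distribution names are assumptions; adjust them to match the environment.
for name, ver in installed_versions(["pandas", "cudf", "cuml"]).items():
    print(f"{name}: {ver or 'not installed'}")
```

Comparing this output before and after a change makes it clear which package moved and whether the change broke anything else.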

calderjo commented 6 months ago

small update: we were able to get the latest v24 of rapids installed in our image, but it's incompatible with our p100 gpus, which are too old for that release: https://docs.rapids.ai/notices/rsn0034/

we're still exploring ways to get a version of rapids that works on all gpu types, thanks for your patience: https://github.com/Kaggle/docker-python/pull/1395
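The P100 problem comes down to GPU compute capability. A rough sketch of the comparison, assuming the linked notice sets the cutoff at compute capability 7.0 (Volta) for RAPIDS 24.02 and later; that threshold, the helper name `supports_rapids_24`, and the inclusion of the T4 as a comparison point are assumptions for illustration, so check the notice itself before relying on them.

```python
# Assumed minimum compute capability, based on a reading of the linked notice.
MIN_CAPABILITY = (7, 0)

# Published compute capabilities; the T4 entry is only a comparison point.
KNOWN_GPUS = {
    "Tesla P100": (6, 0),  # Pascal
    "Tesla T4": (7, 5),    # Turing
}


def supports_rapids_24(capability, minimum=MIN_CAPABILITY):
    """Return True if a (major, minor) compute capability meets the minimum."""
    return tuple(capability) >= tuple(minimum)


for gpu, capability in KNOWN_GPUS.items():
    status = "ok" if supports_rapids_24(capability) else "too old"
    print(f"{gpu}: compute capability {capability[0]}.{capability[1]} -> {status}")
```

Under that assumption the P100 falls below the cutoff while newer architectures pass, which matches the incompatibility described above.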

datancoffee commented 6 months ago

Definitely appreciate your work on this. Thank you!

calderjo commented 6 months ago

Looks like we got a fix going, we're hoping to get a new release this week!

datancoffee commented 6 months ago

hurray! i might win this competition yet!

calderjo commented 5 months ago

we released a fix for this last week, will close this out.