Kaggle / docker-python

Kaggle Python docker image
Apache License 2.0

XGBoost (accidentally?) downgraded in most recent image #911

Closed marketneutral closed 3 years ago

marketneutral commented 3 years ago

A participant in the Jane Street competition noted the most recent image has XGBoost version 0.90. The image prior to that had 1.2.1. PyPI has 1.2.1.

https://www.kaggle.com/c/jane-street-market-prediction/discussion/201835

I don't see any commit that would intentionally do this. This is a significant issue.

A short-term workaround for users is to clone a notebook from before December 3rd, wipe it clean, and then use that for your work.

rosbo commented 3 years ago

Context: For the master branch, we do builds after each commit and once every other day.

The downgrade happened on Dec 2 and is not tied to any specific commit.

Will look into which package caused the downgrade.

rosbo commented 3 years ago

Looks like dask-xgboost is causing the downgrade:

Dec 02 13:09:54 Collecting dask-xgboost
Dec 02 13:09:54   Downloading dask_xgboost-0.1.11-py2.py3-none-any.whl (13 kB)
Dec 02 13:09:54 Requirement already satisfied: dask in /opt/conda/lib/python3.7/site-packages (from dask-xgboost->dask-ml[xgboost]) (2.14.0)
...
Dec 02 13:10:04 Installing collected packages: xgboost, dask-glm, dask-xgboost, dask-ml
Dec 02 13:10:04   Attempting uninstall: xgboost
Dec 02 13:10:04     Found existing installation: xgboost 1.2.1
Dec 02 13:10:04     Uninstalling xgboost-1.2.1:
Dec 02 13:10:06       Successfully uninstalled xgboost-1.2.1
Dec 02 13:10:11 Successfully installed dask-glm-0.2.0 dask-ml-1.7.0 dask-xgboost-0.1.11 xgboost-0.90
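A silent downgrade like this can be spotted by scanning the build log. The following is only a minimal sketch, not something the image build does: it assumes pip keeps the "Found existing installation" and "Successfully installed" wording seen in the log above, and the helper names `version_key` and `find_downgrades` are hypothetical.

```python
import re

# Log lines copied from the build output above.
LOG = """\
Dec 02 13:10:04 Installing collected packages: xgboost, dask-glm, dask-xgboost, dask-ml
Dec 02 13:10:04   Attempting uninstall: xgboost
Dec 02 13:10:04     Found existing installation: xgboost 1.2.1
Dec 02 13:10:04     Uninstalling xgboost-1.2.1:
Dec 02 13:10:06       Successfully uninstalled xgboost-1.2.1
Dec 02 13:10:11 Successfully installed dask-glm-0.2.0 dask-ml-1.7.0 dask-xgboost-0.1.11 xgboost-0.90
"""


def version_key(version):
    """Turn '1.2.1' into (1, 2, 1). Handles plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def find_downgrades(log_text):
    """Return {package: (old_version, new_version)} for downgraded packages."""
    # Versions that were present before this install step replaced them.
    existing = dict(
        re.findall(r"Found existing installation: (\S+) (\S+)", log_text)
    )
    downgrades = {}
    for line in log_text.splitlines():
        if "Successfully installed " not in line:
            continue
        # Each item looks like 'name-version'; the name itself may contain dashes.
        for item in line.split("Successfully installed ", 1)[1].split():
            name, _, new = item.rpartition("-")
            old = existing.get(name)
            if old is not None and version_key(new) < version_key(old):
                downgrades[name] = (old, new)
    return downgrades


print(find_downgrades(LOG))  # {'xgboost': ('1.2.1', '0.90')}
```

The numeric comparison is deliberately simple and would fail on pre-release or local version strings.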

rosbo commented 3 years ago

dask-xgboost requires a version of xgboost <= 0.90: https://github.com/dask/dask-xgboost/blob/master/requirements.txt#L1
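To illustrate why that pin forces the downgrade, here is a simplified sketch of picking the highest version that satisfies an upper bound. It is not pip's actual resolver, which follows the full version-specifier rules. The function names are hypothetical and the candidate list contains only the two versions mentioned in this thread.

```python
def version_key(version):
    """Turn '1.2.1' into (1, 2, 1). Handles plain numeric versions only."""
    return tuple(int(part) for part in version.split("."))


def pick_version(candidates, upper_bound):
    """Return the highest candidate that is <= upper_bound, or None."""
    bound = version_key(upper_bound)
    allowed = [v for v in candidates if version_key(v) <= bound]
    return max(allowed, key=version_key) if allowed else None


# With the '<= 0.90' constraint, 1.2.1 is excluded and 0.90 is chosen.
print(pick_version(["0.90", "1.2.1"], "0.90"))  # 0.90
```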

rosbo commented 3 years ago

I will remove dask-xgboost from our image. Usage is low on Kaggle (~100 kernel runs used it over the last 90 days, versus millions for xgboost).

marketneutral commented 3 years ago

Amazingly quick response! Thanks so much!

rosbo commented 3 years ago

The new Docker image should be released to Kaggle notebooks tomorrow or Wednesday.

rosbo commented 3 years ago

The new image has been released to Kaggle notebooks.

I confirmed that a Kaggle notebook opened with the "Latest" environment has the proper xgboost version:

Name: xgboost
Version: 1.2.1
Summary: XGBoost Python Package
Home-page: https://github.com/dmlc/xgboost
Author: None
Author-email: None
License: Apache-2.0
Location: /opt/conda/lib/python3.7/site-packages
Requires: numpy, scipy
Required-by: TPOT
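The same check can be done from inside a notebook cell. This is a minimal sketch assuming Python 3.8 or newer, where `importlib.metadata` is in the standard library. The image in this thread reports Python 3.7, where `pip show xgboost` is the simpler route. The helper name `installed_version` is hypothetical.

```python
from importlib import metadata  # standard library in Python 3.8+


def installed_version(package):
    """Return the installed version string of `package`, or None if absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


# Prints the installed xgboost version, or None if xgboost is not installed.
print("xgboost:", installed_version("xgboost"))
```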