Bump xgboost from 1.5.2 to 1.6.0

Bumps xgboost from 1.5.2 to 1.6.0.

Release notes

Release 1.6.0 stable

v1.6.0 (2022 Apr 16)

After a long period of development, XGBoost v1.6.0 is packed with many new features and improvements. We summarize them in the following sections starting with an introduction to some major new features, then moving on to language binding specific changes including new features and notable bug fixes for that binding.

Development of categorical data support

This version of XGBoost features new improvements and full coverage of experimental categorical data support in the Python and C packages for tree models. The hist, approx, and gpu_hist tree methods all now support training with categorical data. In addition, this release introduces partition-based categorical splits, a split type that was first available in LightGBM in the context of gradient boosting. The previous XGBoost release supported one-hot splits, where the splitting criterion has the form x \in {c}, i.e. the categorical feature x is tested against a single candidate. The new release allows more expressive conditions of the form x \in S, where the categorical feature x is tested against multiple candidates. Moreover, it is now possible to use any of the tree algorithms (hist, approx, gpu_hist) when creating categorical splits. For more information, please see our tutorial on categorical data, along with examples linked on that page. (#7380, #7708, #7695, #7330, #7307, #7322, #7705, #7652, #7592, #7666, #7576, #7569, #7529, #7575, #7393, #7465, #7385, #7371, #7745, #7810)

In the future, we will continue to improve categorical data support with new features and optimizations. We also look forward to bringing the feature beyond the Python binding; contributions and feedback are welcome! Lastly, because the feature is experimental, its behavior is subject to change, especially the default values of related hyper-parameters.
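The difference between the two split types described above can be illustrated with a small sketch. This is plain Python, not XGBoost's implementation; the function names `one_hot_split` and `partition_split` and the sample data are assumptions made for illustration only.

```python
from typing import Hashable, List, Set


def one_hot_split(value: Hashable, candidate: Hashable) -> bool:
    """One-hot split: the condition x in {c} tests a single category."""
    return value == candidate


def partition_split(value: Hashable, subset: Set[Hashable]) -> bool:
    """Partition-based split: the condition x in S tests a set of categories."""
    return value in subset


# Hypothetical categorical feature column.
colors: List[str] = ["red", "green", "blue", "green", "yellow"]

# One-hot: only rows equal to "green" satisfy the condition.
one_hot_mask = [one_hot_split(c, "green") for c in colors]

# Partition: rows whose category is "green" or "blue" satisfy the condition.
partition_mask = [partition_split(c, {"green", "blue"}) for c in colors]

print(one_hot_mask)
print(partition_mask)
```

A one-hot split is the special case of a partition split in which S contains exactly one category, which is why the partition form is the more expressive of the two.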

Experimental support for multi-output model

XGBoost 1.6 features initial support for multi-output models, which include multi-output regression and multi-label classification. Along with this, the XGBoost classifier has proper support for base margin without the need for the user to flatten the input. In this initial support, XGBoost builds one model for each target, similar to the sklearn meta estimator. For more details, please see our quick introduction.

(#7365, #7736, #7607, #7574, #7521, #7514, #7456, #7453, #7455, #7434, #7429, #7405, #7381)
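The "one model for each target" strategy mentioned above can be sketched in a few lines of plain Python. This is a conceptual illustration only, not XGBoost's or scikit-learn's code: `MeanRegressor` is a hypothetical stand-in for a single-target model, and `OneModelPerTarget` is an assumed name for the wrapper.

```python
from statistics import mean
from typing import Callable, List, Sequence


class MeanRegressor:
    """Hypothetical single-target model: always predicts the training mean."""

    def fit(self, X: Sequence[Sequence[float]], y: Sequence[float]) -> "MeanRegressor":
        self.mean_ = mean(y)
        return self

    def predict(self, X: Sequence[Sequence[float]]) -> List[float]:
        return [self.mean_ for _ in X]


class OneModelPerTarget:
    """Fits an independent model for every column of the target matrix."""

    def __init__(self, make_model: Callable[[], MeanRegressor]) -> None:
        self.make_model = make_model
        self.models_: List[MeanRegressor] = []

    def fit(self, X, Y) -> "OneModelPerTarget":
        n_targets = len(Y[0])
        # Slice out column t of Y and train a separate model on it.
        self.models_ = [
            self.make_model().fit(X, [row[t] for row in Y])
            for t in range(n_targets)
        ]
        return self

    def predict(self, X) -> List[List[float]]:
        per_target = [m.predict(X) for m in self.models_]
        # Transpose from (n_targets, n_samples) to (n_samples, n_targets).
        return [list(row) for row in zip(*per_target)]


X = [[0.0], [1.0], [2.0]]
Y = [[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]]  # two targets per sample

wrapper = OneModelPerTarget(MeanRegressor).fit(X, Y)
predictions = wrapper.predict([[5.0], [6.0]])
print(predictions)
```

Because each target gets its own model, this approach cannot share structure across targets; it trades that for simplicity.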

External memory support

External memory support for both the approx and hist tree methods is considered feature complete in XGBoost 1.6. Building upon the iterator-based interface introduced in the previous version, both hist and approx now iterate over each batch of data during training and prediction. In previous versions, hist concatenated all the batches into an internal representation; this step is removed in this version. As a result, users can expect higher scalability in terms of data size but might experience lower performance due to disk IO. (#7531, #7320, #7638, #7372)
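To make the trade-off concrete, the following sketch contrasts concatenating all batches with iterating over them one at a time. It is a simplified illustration in plain Python, not XGBoost's external memory code: the `batches` generator, the two histogram functions, and the assumption that feature values lie in [0, 1] are all hypothetical.

```python
from typing import Iterable, Iterator, List


def batches() -> Iterator[List[float]]:
    """Hypothetical data source that yields one batch at a time, as if read from disk."""
    yield [0.1, 0.4, 0.9]
    yield [0.2, 0.6]
    yield [0.7, 0.8, 0.3]


def bin_index(value: float, n_bins: int) -> int:
    """Map a value in [0, 1] to a bin, clamping 1.0 into the last bin."""
    return min(int(value * n_bins), n_bins - 1)


def histogram_concatenated(source: Iterable[List[float]], n_bins: int) -> List[int]:
    """Materializes every batch in memory before counting."""
    all_values = [v for batch in source for v in batch]
    counts = [0] * n_bins
    for v in all_values:
        counts[bin_index(v, n_bins)] += 1
    return counts


def histogram_streaming(source: Iterable[List[float]], n_bins: int) -> List[int]:
    """Holds only one batch at a time and accumulates counts across batches."""
    counts = [0] * n_bins
    for batch in source:
        for v in batch:
            counts[bin_index(v, n_bins)] += 1
    return counts


concatenated = histogram_concatenated(batches(), n_bins=2)
streamed = histogram_streaming(batches(), n_bins=2)
print(concatenated, streamed)
```

Both functions produce the same counts, but the streaming version never needs more than one batch in memory, at the cost of re-reading the source on every pass.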

Rewritten approx

... (truncated)

Changelog

Sourced from xgboost's changelog.

v1.6.0 (2022 Apr 16)

After a long period of development, XGBoost v1.6.0 is packed with many new features and improvements. We summarize them in the following sections starting with an introduction to some major new features, then moving on to language binding specific changes including new features and notable bug fixes for that binding.

Development of categorical data support

This version of XGBoost features new improvements and full coverage of experimental categorical data support in the Python and C packages for tree models. The hist, approx, and gpu_hist tree methods all now support training with categorical data. In addition, this release introduces partition-based categorical splits, a split type that was first available in LightGBM in the context of gradient boosting. The previous XGBoost release supported one-hot splits, where the splitting criterion has the form x \in {c}, i.e. the categorical feature x is tested against a single candidate. The new release allows more expressive conditions of the form x \in S, where the categorical feature x is tested against multiple candidates. Moreover, it is now possible to use any of the tree algorithms (hist, approx, gpu_hist) when creating categorical splits. For more information, please see our tutorial on categorical data, along with examples linked on that page. (#7380, #7708, #7695, #7330, #7307, #7322, #7705, #7652, #7592, #7666, #7576, #7569, #7529, #7575, #7393, #7465, #7385, #7371, #7745, #7810)

In the future, we will continue to improve categorical data support with new features and optimizations. We also look forward to bringing the feature beyond the Python binding; contributions and feedback are welcome! Lastly, because the feature is experimental, its behavior is subject to change, especially the default values of related hyper-parameters.

Experimental support for multi-output model

XGBoost 1.6 features initial support for multi-output models, which include multi-output regression and multi-label classification. Along with this, the XGBoost classifier has proper support for base margin without the need for the user to flatten the input. In this initial support, XGBoost builds one model for each target, similar to the sklearn meta estimator. For more details, please see our quick introduction.

(#7365, #7736, #7607, #7574, #7521, #7514, #7456, #7453, #7455, #7434, #7429, #7405, #7381)

External memory support

External memory support for both the approx and hist tree methods is considered feature complete in XGBoost 1.6. Building upon the iterator-based interface introduced in the previous version, both hist and approx now iterate over each batch of data during training and prediction. In previous versions, hist concatenated all the batches into an internal representation; this step is removed in this version. As a result, users can expect higher scalability in terms of data size but might experience lower performance due to disk IO. (#7531, #7320, #7638, #7372)

Rewritten approx

The approx tree method is rewritten based on the existing hist tree method. The rewrite closes the feature gap between approx and hist and improves the performance. Now the behavior of approx should be more aligned with hist and gpu_hist. Here is a list of user-visible changes:

... (truncated)

Commits

f75c007 Make 1.6.0 release. (#7813)
816e788 [backport] #7808 #7810 (#7811)
3ee3b18 [doc] fix a typo in jvm/index.rst (#7806) [skip ci] (#7807)
ece4dc4 [backport] Backport jvm changes to 1.6. (#7803)
67298cc [backport] Backport JVM fixes and document update to 1.6 (#7792)
78d2312 [CI] Enable faulthandler to show details when 0xC0000005 error occurs (#7771)
4615fa5 Drop support for deprecated CUDA architecture. (#7767)
4bd5a33 Make rc1 release. (#7764)
9150fdb Support pandas nullable types. (#7760)
d479648 Fix failures on R hub and Win builder. (#7763)
Additional commits viewable in compare view

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.

Dependabot commands and options

You can trigger Dependabot actions by commenting on this PR:
- `@dependabot rebase` will rebase this PR
- `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
- `@dependabot merge` will merge this PR after your CI passes on it
- `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
- `@dependabot cancel merge` will cancel a previously requested merge and block automerging
- `@dependabot reopen` will reopen this PR if it is closed
- `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
- `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
- `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)

BayesWitnesses / m2cgen

Bump xgboost from 1.5.2 to 1.6.0 #514
