Include an explanation of high cardinality: why Cyclic Boosting is robust to high-cardinality data, along with an example of a high-cardinality case.
Run regression on three datasets with varying degrees of high cardinality and compare the results with LightGBM (requires LightGBM to be installed).
The results should show that Cyclic Boosting outperforms LightGBM in both MAE and SMAPE on all three datasets, with the performance gap widening as the degree of high cardinality increases.
Add three datasets with varying degrees of high cardinality under tests/high_cardinality_data/.
Add a tutorial on high-cardinality cases as an example.
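
For reviewers who want to sanity-check the comparison, here is a minimal sketch of the two metrics and a simple way to quantify cardinality. It is not the code from the tutorial. The helper names (`mae`, `smape`, `cardinality_summary`) and the synthetic `store_ids` column are hypothetical, and the SMAPE variant used here (mean of 2|y - ŷ| / (|y| + |ŷ|), in percent, with zero-denominator terms counted as 0) is an assumption, since several definitions are in use.

```python
import random
from collections import Counter


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def smape(y_true, y_pred):
    """Symmetric MAPE in percent (assumed variant, see lead-in)."""
    total = 0.0
    for t, p in zip(y_true, y_pred):
        denom = abs(t) + abs(p)
        if denom > 0:  # a 0/0 term contributes 0 by convention here
            total += 2.0 * abs(t - p) / denom
    return 100.0 * total / len(y_true)


def cardinality_summary(values):
    """Return (number of distinct categories, average rows per category)."""
    counts = Counter(values)
    return len(counts), len(values) / len(counts)


# Hypothetical high-cardinality feature: 10,000 rows drawn from up to
# 5,000 distinct store IDs, so each category has very few observations.
rng = random.Random(0)
store_ids = [f"store_{rng.randrange(5000)}" for _ in range(10000)]
n_unique, rows_per_category = cardinality_summary(store_ids)
print(f"distinct categories: {n_unique}, rows per category: {rows_per_category:.2f}")
```

A low rows-per-category value is one rough indicator of how strong the high-cardinality effect is in a given dataset; the same two metric functions can then be applied to the predictions of both models on identical test splits.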