Great work: "Why do tree-based models still outperform deep learning on tabular data?"
But can you recommend a data set with mixed continuous and categorical features for binary classification,
at a large scale, let's say 10 million rows and 80 features, where:
1. the features are not independent (for example, some features depend on several other features), and
2. the data is unbalanced, with many more NO labels than YES labels?
Something like the AMEX default prediction data: https://www.kaggle.com/competitions/amex-default-prediction/data
1st-place solution for that competition: https://github.com/jxzly/Kaggle-American-Express-Default-Prediction-1st-solution