I am running some experiments on a dataset with roughly 300 features and around 300k data points. About 50 of the features are integer columns, representing a mix of one-hot encoded, label-encoded, and ordinal numerical data.
When I convert all integer columns to float before fitting my model, I see a significant drop in predictive performance on the test set.
Can anyone shed some light on why this might happen? I can't find anything about it in a web search. I'm running XGBoost 2.0.3 via the sklearn API.
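For reference, this is a minimal sketch of the conversion step I'm describing (column names and data here are made up; the real dataset isn't shown). It converts every integer-typed column of a pandas DataFrame to float64 before the frame is handed to the model:

```python
import numpy as np
import pandas as pd

# Toy frame standing in for the real dataset: two integer columns
# (one-hot / ordinal style) plus one continuous float column.
rng = np.random.default_rng(0)
df = pd.DataFrame({
    "onehot_flag": rng.integers(0, 2, size=8),   # int64
    "ordinal_rank": rng.integers(0, 5, size=8),  # int64
    "continuous": rng.normal(size=8),            # float64
})

# The conversion in question: cast all integer columns to float
# before calling model.fit(df, y).
int_cols = df.select_dtypes(include="int").columns
df[int_cols] = df[int_cols].astype(np.float64)

# After the cast, every column is float64.
print(sorted(df.dtypes.astype(str).unique()))
```

With the float version of the frame, the fitted model scores noticeably worse on the held-out test set than with the original integer dtypes.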