ccao-data / model-res-avm

Automated valuation model for all class 200 residential properties in Cook County (except vacant land and condos)
GNU Affero General Public License v3.0

Test xgboost modeling engine #31

Open dfsnow opened 1 year ago

dfsnow commented 1 year ago

The Data Department recently benchmarked XGBoost against LightGBM (ccao-data/report-model-benchmark), comparing their run times. We found that the current version of XGBoost runs much faster than LightGBM on most machines while achieving similar predictive performance.

We should test replacing LightGBM as the primary modeling engine in both models.
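A run-time comparison like the one described above could be sketched roughly as follows. This is a minimal, standard-library-only timing harness, not the code used in the benchmark report. The names `time_engine`, `compare_engines`, and `placeholder_fit` are hypothetical, and the placeholder workload stands in for real LightGBM and XGBoost training calls, which are not shown here.

```python
import statistics
import time
from typing import Callable, Dict, List


def time_engine(fit: Callable[[], object], repeats: int = 3) -> List[float]:
    """Return the wall-clock seconds taken by each of `repeats` calls to `fit`."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fit()
        timings.append(time.perf_counter() - start)
    return timings


def compare_engines(
    engines: Dict[str, Callable[[], object]], repeats: int = 3
) -> Dict[str, float]:
    """Map each engine name to its median fit time in seconds."""
    return {
        name: statistics.median(time_engine(fit, repeats))
        for name, fit in engines.items()
    }


def placeholder_fit() -> int:
    """Hypothetical stand-in workload; swap in a real training call per engine."""
    return sum(i * i for i in range(100_000))


if __name__ == "__main__":
    # Both entries use the same dummy workload, so the printed timings say
    # nothing about the real engines. Replace each with an actual fit call.
    medians = compare_engines(
        {"lightgbm": placeholder_fit, "xgboost": placeholder_fit},
        repeats=3,
    )
    for name, seconds in sorted(medians.items(), key=lambda kv: kv[1]):
        print(f"{name}: {seconds:.4f}s")
```

Using the median over several repeats keeps a single slow run (for example, a cold cache) from dominating the comparison. To compare accuracy as well as speed, each fit function would also need to return predictions on a shared holdout set.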

LightGBM

Pros

Cons

XGBoost

Pros

Cons

dfsnow commented 9 months ago

Definitely not going to happen this year. The XGBoost R package is in heavy development right now and, unlike the Python package, still lacks native categorical support. May be worth picking up in the spring. In untuned benchmarking, performance was extremely similar between the two engines.