OHDSI / PatientLevelPrediction

An R package for performing patient level prediction in an observational database in the OMOP Common Data Model.
https://ohdsi.github.io/PatientLevelPrediction

How about changing the default gradient boosting algorithm to lightGBM? #340

Open choi328328 opened 1 year ago

choi328328 commented 1 year ago

Hello, I'm Jin Choi from Ajou University.

Currently, the default gradient boosting machine algorithm is xgboost. I think we could make lightGBM the default instead; it was developed by Microsoft and has been cited more than 5,000 times. (https://proceedings.neurips.cc/paper/2017/hash/6449f44a102fde848669bdd9eb6b76fa-Abstract.html)

LightGBM is known for training up to 10x faster than xgboost while maintaining almost the same model performance. You can find comparisons between xgboost and lightGBM at the link below. (https://neptune.ai/blog/xgboost-vs-lightgbm)

Also, LightGBM has an official R package. (https://lightgbm.readthedocs.io/en/latest/R/index.html)

I think this change could accelerate model development with limited computational resources.
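For anyone who wants to try the official R package directly, a minimal sketch of training and predicting with lightGBM looks like this (the toy data and hyper-parameter values are illustrative, not tuned defaults):

```r
library(lightgbm)

# Toy binary-classification data (illustrative only)
x <- matrix(rnorm(200 * 5), ncol = 5)
y <- rbinom(200, size = 1, prob = 0.5)

# lgb.Dataset wraps the feature matrix and labels for training
dtrain <- lgb.Dataset(data = x, label = y)

# Illustrative hyper-parameters; sensible package defaults would need tuning
params <- list(
  objective = "binary",
  learning_rate = 0.1,
  num_leaves = 31
)

model <- lgb.train(params = params, data = dtrain, nrounds = 50, verbose = -1)

# Predicted probabilities for new rows
preds <- predict(model, x[1:5, , drop = FALSE])
```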

jreps commented 1 year ago

Hi Jin Choi, I love the idea of adding in lightGBM!

Did you want to help add it in? (No worries if not, but I wanted to see whether I could support you if you do.) To add a new classifier, we need a set function that specifies the hyper-parameters users can search over and the seed, a fit function that takes the settings and data and fits the model, and possibly a predict function that takes new data and the fitted model and returns the predictions. More details can be found here: https://ohdsi.github.io/PatientLevelPrediction/articles/AddingCustomModels.html
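Following the custom-model pattern described above, the set function might look roughly like this sketch. All names here (`setLightGBM`, `fitLightGBM`, the hyper-parameter grid) are illustrative assumptions, not the package's actual API; the fit function would still need to be written separately:

```r
# Hypothetical settings function for lightGBM, following the
# set/fit/predict pattern from the AddingCustomModels vignette.
setLightGBM <- function(numLeaves = c(31, 127),
                        learningRate = c(0.05, 0.1),
                        nrounds = c(100, 500),
                        seed = NULL) {
  # Hyper-parameter grid the user can search over
  param <- expand.grid(numLeaves = numLeaves,
                       learningRate = learningRate,
                       nrounds = nrounds)

  result <- list(
    fitFunction = "fitLightGBM",  # hypothetical fit function, implemented separately
    param = param,
    name = "LightGBM",
    seed = seed
  )
  class(result) <- "modelSettings"
  result
}
```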

choi328328 commented 1 year ago

Of course. I will upload the code after developing it.

ChungsooKim commented 1 year ago

The commits have been submitted for merging. The next steps are:

  1. Finding the best hyper-parameter set to use as the default values
  2. Testing with Strategus