feedzai / feedzai-openml-java

Implementations for Feedzai's OpenML APIs to allow for usage of machine learning models in the Java programming language.
https://www.feedzai.com
Apache License 2.0

FairGBM openml API implementation #116

Closed AndreFCruz closed 2 years ago

AndreFCruz commented 2 years ago

Implementation of the Java openml API for the C++ FairGBM algorithm (to be open-sourced soon).

Since FairGBM is built on top of LightGBM, I reused the existing LightGBM openml API implementation and added a new algorithm descriptor for the FairGBM-specific parameters.
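
To illustrate the "descriptor on top of a descriptor" idea, here is a minimal sketch in which a FairGBM parameter list is built by extending a LightGBM one. It is not the code from this PR: the class `DescriptorSketch` and its methods are hypothetical, and the fairness parameter names (`constraint_type`, `multiplier_learning_rate`) are assumptions used only for illustration.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical sketch: these names are illustrative and are not part of the OpenML API.
final class DescriptorSketch {

    private DescriptorSketch() {
    }

    /** Parameters shared by both algorithms (a small subset, for illustration). */
    static List<String> lightGbmParams() {
        final List<String> params = new ArrayList<>();
        params.add("num_iterations");
        params.add("learning_rate");
        params.add("num_leaves");
        return Collections.unmodifiableList(params);
    }

    /** FairGBM reuses every LightGBM parameter and appends its own (assumed names). */
    static List<String> fairGbmParams() {
        final List<String> params = new ArrayList<>(lightGbmParams());
        params.add("constraint_type");          // assumption: fairness constraint selector
        params.add("multiplier_learning_rate"); // assumption: step size for the constraint multipliers
        return Collections.unmodifiableList(params);
    }
}
```

With this layout, any parameter added to the LightGBM list automatically shows up in the FairGBM one, which is the main appeal of reusing the existing implementation.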

Other considerations:

AndreFCruz commented 2 years ago

Note: this CI build failure is caused by the FairGBM repo not being public yet; I'll have to change the make-lightgbm submodule to allow building from a local repo or using a token.

AndreFCruz commented 2 years ago

Also, backwards compatibility seems to hold when loading old LightGBM model files. We needed to test this because the model.txt representation changed with the LightGBM version bump.

Tested it using the following notebook: test-lightgbm-retrocompatible-model-loading.zip

AndreFCruz commented 2 years ago

I'll review all this feedback today.

One other thing: we were asked to list FairGBM first among the Pulse UI model choices, followed by LightGBM and then every other model.
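
One way to express that ordering is a stable sort with a fixed priority list. This is only a sketch: `ModelOrderSketch` is a hypothetical helper, how the Pulse UI actually obtains and orders its model list is not shown in this thread, and the "other" model names used in the usage note are placeholders.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

// Hypothetical sketch: not the Pulse UI or OpenML code, just the ordering rule.
final class ModelOrderSketch {

    /** Models that must come first, in this exact order. */
    private static final List<String> PRIORITY = Arrays.asList("FairGBM", "LightGBM");

    private ModelOrderSketch() {
    }

    /** Returns a new list with prioritized models first; all others keep their relative order. */
    static List<String> ordered(final List<String> names) {
        final List<String> result = new ArrayList<>(names);
        // List.sort is stable, so models with equal rank are not reordered.
        result.sort(Comparator.comparingInt(ModelOrderSketch::rank));
        return result;
    }

    /** Index in the priority list, or one past its end for every other model. */
    private static int rank(final String name) {
        final int idx = PRIORITY.indexOf(name);
        return idx >= 0 ? idx : PRIORITY.size();
    }
}
```

For example, `ModelOrderSketch.ordered(Arrays.asList("OtherModelA", "LightGBM", "OtherModelB", "FairGBM"))` yields `[FairGBM, LightGBM, OtherModelA, OtherModelB]`.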

codecov[bot] commented 2 years ago

Codecov Report

Merging #116 (05165d9) into master (de77c62) will increase coverage by 0.30%. The diff coverage is 85.18%.

@@             Coverage Diff              @@
##             master     #116      +/-   ##
============================================
+ Coverage     79.90%   80.21%   +0.30%     
- Complexity      428      465      +37     
============================================
  Files            43       47       +4     
  Lines          1498     1607     +109     
  Branches        138      157      +19     
============================================
+ Hits           1197     1289      +92     
- Misses          224      231       +7     
- Partials         77       87      +10     
| Impacted Files | Coverage Δ | |
|---|---|---|
| ...enml/provider/lightgbm/LightGBMDescriptorUtil.java | 96.77% <0.00%> (-0.20%) | :arrow_down: |
| ...ai/openml/provider/lightgbm/FairGBMMLProvider.java | 40.00% <40.00%> (ø) | |
| ...i/openml/provider/lightgbm/LightGBMMLProvider.java | 40.00% <50.00%> (+40.00%) | :arrow_up: |
| ...i/openml/provider/lightgbm/AlgoDescriptorUtil.java | 66.66% <66.66%> (ø) | |
| ...tgbm/LightGBMBinaryClassificationModelTrainer.java | 83.72% <82.69%> (-3.15%) | :arrow_down: |
| ...eedzai/openml/provider/lightgbm/SWIGTrainData.java | 84.74% <84.00%> (-4.45%) | :arrow_down: |
| ...penml/provider/lightgbm/FairGBMDescriptorUtil.java | 93.75% <93.75%> (ø) | |
| ...enml/provider/lightgbm/FairGBMParamParserUtil.java | 100.00% <100.00%> (ø) | |
| ...i/openml/provider/lightgbm/LightGBMAlgorithms.java | 100.00% <100.00%> (ø) | |
| ...openml/provider/lightgbm/LightGBMModelCreator.java | 87.23% <100.00%> (ø) | |
| ... and 4 more | | |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data. Last update de77c62...05165d9.

AndreFCruz commented 2 years ago

Are we ready to merge this? :tada:

gandola commented 2 years ago

LGTM! :1st_place_medal:

Cheers