jpmml / jpmml-lightgbm

Java library and command-line application for converting LightGBM models to PMML
GNU Affero General Public License v3.0

NPE for a combination of categorical features and empty trees #44

Closed: yuzhixing closed this issue 3 years ago

yuzhixing commented 3 years ago

The real model contains some invalid trees; for example, a tree that does not contain any split_feature values. As a result, an NPE occurs when converting the GBDT model to PMML. Specifically, the NPE is thrown in "org.jpmml.lightgbm.Tree.isBinary(int feature)" at line 319.

vruusmann commented 3 years ago

Related to #41.

The org.jpmml.lightgbm.Tree class exposes some public API methods that directly access nullable fields.

These methods are not invoked with the Iris dataset (all continuous features). The integration testing suite should include a test case for a mixed schema (continuous plus categorical features), where the LightGBM model itself is badly over-fitted so that there are empty trees present in the LightGBM file.
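The failure mode described above can be sketched as follows. This is a minimal, hypothetical illustration and not the actual org.jpmml.lightgbm.Tree source: the class name SketchTree, its single field, and both method names are assumptions made for this example. It contrasts an accessor that dereferences a nullable split array directly with one that checks for an empty tree first.

```java
import java.util.Arrays;

// Hypothetical stand-in for a parsed LightGBM tree; not the library's real class.
public class SketchTree {

    // Assumed to be null when the model file lists no split_feature values (an empty tree).
    private final int[] splitFeature;

    public SketchTree(int[] splitFeature) {
        this.splitFeature = splitFeature;
    }

    // Unguarded accessor: iterating a null array throws NullPointerException.
    public boolean usesFeatureUnsafe(int feature) {
        for (int f : splitFeature) {
            if (f == feature) {
                return true;
            }
        }
        return false;
    }

    // A tree with no recorded splits is treated as empty.
    public boolean isEmpty() {
        return splitFeature == null || splitFeature.length == 0;
    }

    // Guarded accessor: an empty tree uses no features, so answer false instead of failing.
    public boolean usesFeature(int feature) {
        if (isEmpty()) {
            return false;
        }
        return Arrays.stream(splitFeature).anyMatch(f -> f == feature);
    }

    public static void main(String[] args) {
        SketchTree empty = new SketchTree(null);
        SketchTree regular = new SketchTree(new int[] {0, 2});

        System.out.println(empty.usesFeature(0));
        System.out.println(regular.usesFeature(2));
    }
}
```

Whether an empty tree should be skipped silently or rejected with a descriptive exception is a design choice; the sketch only shows that the null check has to happen before the field is read.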

vruusmann commented 3 years ago

@yuzhixing You should re-train your model so that it does not contain any empty trees!

Empty trees are a sign of an over-fitted model. You don't want to be deploying over-fitted models in real-world application scenarios.
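One way to check a model for empty trees before conversion is to scan the LightGBM text file. The sketch below is hedged: it assumes that each tree section starts with a "Tree=<index>" line followed by a "num_leaves=<count>" line, and that an empty tree is one with a single leaf. The class name EmptyTreeScan and the method findEmptyTrees are hypothetical and are not part of JPMML-LightGBM or LightGBM.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper; assumes the "Tree=" / "num_leaves=" layout of LightGBM text model files.
public class EmptyTreeScan {

    // Returns the indices of trees whose num_leaves entry is 1 or less (no splits).
    public static List<Integer> findEmptyTrees(List<String> lines) {
        List<Integer> empty = new ArrayList<>();
        int currentTree = -1; // -1 until the first "Tree=" line, so header lines are ignored

        for (String raw : lines) {
            String line = raw.trim();
            if (line.startsWith("Tree=")) {
                currentTree = Integer.parseInt(line.substring("Tree=".length()));
            } else if (currentTree >= 0 && line.startsWith("num_leaves=")) {
                int numLeaves = Integer.parseInt(line.substring("num_leaves=".length()));
                if (numLeaves <= 1) {
                    empty.add(currentTree);
                }
            }
        }
        return empty;
    }

    public static void main(String[] args) throws IOException {
        // args[0] is the path to a LightGBM model saved in text format.
        List<String> lines = Files.readAllLines(Paths.get(args[0]));
        System.out.println("Empty trees: " + findEmptyTrees(lines));
    }
}
```

If the scan reports any indices, that suggests re-training with different settings (for example fewer boosting rounds) before deploying the model.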

In a sense, this JPMML-LightGBM library bug just helped you to avoid a potentially very costly mistake!

yuzhixing commented 3 years ago

Thanks, you are right.