jpmml / jpmml-lightgbm

Java library and command-line application for converting LightGBM models to PMML
GNU Affero General Public License v3.0

Fail to convert lightgbm to pmml (NumberFormatException) #20

Closed by azaaza0319 5 years ago

azaaza0319 commented 5 years ago

I tried to run java -jar jpmml-lightgbm-executable-1.2-SNAPSHOT.jar --lgbm-input lightgbm.txt --pmml-output output.pmml to convert a LightGBM model to PMML format, but encountered the following exception:

Exception in thread "main" java.lang.NumberFormatException: null
    at java.lang.Integer.parseInt(Integer.java:542)
    at java.lang.Integer.parseInt(Integer.java:615)
    at org.jpmml.lightgbm.Section.getInt(Section.java:51)
    at org.jpmml.lightgbm.Tree.load(Tree.java:75)
    at org.jpmml.lightgbm.GBDT.load(GBDT.java:111)
    at org.jpmml.lightgbm.LightGBMUtil.loadGBDT(LightGBMUtil.java:59)
    at org.jpmml.lightgbm.LightGBMUtil.loadGBDT(LightGBMUtil.java:51)
    at org.jpmml.lightgbm.Main.run(Main.java:124)
    at org.jpmml.lightgbm.Main.main(Main.java:117)

Attached is the LightGBM model file: lightgbm.txt

lightgbm version is 2.0.2.

Can someone please kindly help? Thanks much!
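
For readers hitting the same trace: Integer.parseInt raises NumberFormatException when it is handed a null string, so the frames above suggest that Section.getInt looked up a key that one of the tree sections does not contain. That is an inference from the stack trace, not something confirmed against the JPMML-LightGBM source. Below is a minimal Python sketch for finding which tree sections lack a given key. It assumes the text model consists of "Tree=<index>" headers, each followed by key=value lines and ended by a blank line. The helper names parse_tree_sections and missing_keys are hypothetical.

```python
from typing import Dict, List, Optional


def parse_tree_sections(model_text: str) -> List[Dict[str, str]]:
    """Split a LightGBM text model into per-tree dicts of key -> raw value."""
    trees: List[Dict[str, str]] = []
    current: Optional[Dict[str, str]] = None
    for raw_line in model_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Tree="):
            # A "Tree=<index>" line opens a new tree section.
            current = {"Tree": line.split("=", 1)[1]}
            trees.append(current)
        elif current is not None:
            if line and "=" in line:
                key, value = line.split("=", 1)
                current[key] = value
            else:
                # Assumption: a blank or non key=value line ends the section.
                current = None
    return trees


def missing_keys(trees: List[Dict[str, str]], required: List[str]) -> Dict[str, List[str]]:
    """Map tree index -> required keys that the tree section does not define."""
    report: Dict[str, List[str]] = {}
    for tree in trees:
        absent = [key for key in required if key not in tree]
        if absent:
            report[tree["Tree"]] = absent
    return report
```

To use it, read the model file into a string and pass the result of parse_tree_sections, together with the key names to check, to missing_keys. Which keys the converter actually requires is for the reader to determine, for example from the method named in the stack trace.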

vruusmann commented 5 years ago

> lightgbm version is 2.0.2.

LightGBM 2.0.2 is a very outdated version (nearly two years old). Have you tried any newer LightGBM versions, such as 2.1.2 or 2.2.2?

Also, how was the model trained? Using the standalone LightGBM API/command-line application, or using the Scikit-Learn wrapper? There might be differences between the frontends.

Anyway, JPMML-LightGBM includes integration tests, and they are all passing cleanly: https://github.com/jpmml/jpmml-lightgbm/blob/master/src/test/resources/main.py

What are you doing differently?

azaaza0319 commented 5 years ago

Thanks @vruusmann. The model is trained and maintained by others, so it is unlikely to be re-trained with a newer LightGBM version. The model was trained using the Python Training API (not the Scikit-Learn wrapper).

Additionally, I tried to add num_cat=0 for each tree and re-converted it, but got another error:

Exception in thread "main" java.lang.NullPointerException
    at org.jpmml.lightgbm.Tree.selectValues(Tree.java:225)
    at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:151)
    at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:186)
    at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:187)
    at org.jpmml.lightgbm.Tree.encodeNode(Tree.java:186)
    at org.jpmml.lightgbm.Tree.encodeTreeModel(Tree.java:94)
    at org.jpmml.lightgbm.ObjectiveFunction.createMiningModel(ObjectiveFunction.java:66)
    at org.jpmml.lightgbm.BinomialLogisticRegression.encodeMiningModel(BinomialLogisticRegression.java:49)
    at org.jpmml.lightgbm.GBDT.encodeMiningModel(GBDT.java:287)
    at org.jpmml.lightgbm.GBDT.encodePMML(GBDT.java:276)
    at org.jpmml.lightgbm.Main.run(Main.java:131)
    at org.jpmml.lightgbm.Main.main(Main.java:117)

Do you have any suggestions? Thanks!

vruusmann commented 5 years ago

> I tried to add num_cat=0 for each tree and re-converted it,

The expression num_cat = 0 suggests that the model does not specify any categorical splits, but as the stack trace shows, this suggestion is wrong.

I would advise checking out some older JPMML-LightGBM version (something from April-May 2017, such as tags 1.0.7, 1.0.8 or 1.0.9), and trying to "hack" it. The idea is that the current codebase follows the conventions of LightGBM 2.2.X, and is too complicated for early models.
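
The point about num_cat = 0 can be checked against the model file before patching it. Below is a rough Python sketch that counts categorical splits per tree. It rests on an assumption that is not confirmed in this thread: each tree section carries a "decision_type=" line of space-separated integers, and a value with its lowest bit set marks a categorical split. The function name categorical_split_counts is hypothetical.

```python
from typing import Dict, Optional


def categorical_split_counts(model_text: str) -> Dict[str, int]:
    """Count splits per tree whose decision_type has the lowest bit set.

    Assumption (not verified here): decision_type holds space-separated
    integers and an odd value marks a categorical split. A non-integer
    value raises ValueError, which signals that the assumption is wrong
    for this model file.
    """
    counts: Dict[str, int] = {}
    tree_id: Optional[str] = None
    for raw_line in model_text.splitlines():
        line = raw_line.strip()
        if line.startswith("Tree="):
            # Start counting for a new tree section.
            tree_id = line.split("=", 1)[1]
            counts[tree_id] = 0
        elif tree_id is not None and line.startswith("decision_type="):
            values = line.split("=", 1)[1].split()
            counts[tree_id] = sum(1 for value in values if int(value) & 1)
    return counts
```

If the assumption holds, any tree with a non-zero count has categorical splits, so writing num_cat=0 into it contradicts the rest of that section, which is consistent with the NullPointerException in Tree.selectValues shown earlier.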

azaaza0319 commented 5 years ago

Got it. Thanks! Will take a look at the previous versions.