Closed csrookie-zoe closed 1 month ago
The LightGBM model file holds feature information using two attributes: `feature_names` and `feature_infos`.
If a feature is present in the training dataset but is not actually used by the model, then its `feature_infos` entry is set to `none` (all features that are used have non-`none` values).
If the JPMML-LightGBM converter finds a `none` value, then it leaves this model schema entry blank. By convention, this is represented by a Java `null` reference.
This NPE can only happen if there is a conflict between the `feature_infos` attribute and the actual model content. The former says that a feature is not used by the model, but the latter actually attempts to generate a split based on this feature.
> How to solve this error?

I have never seen this kind of NPE myself.
My first intuition is that the LightGBM model file is corrupted. Did you modify it manually in any way? Also, the model fragment that you pasted above looks incorrect, because there is no `feature_infos` attribute at all.
I'd need to see a complete LightGBM model file to give more meaningful answers.
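A quick way to check the model file for `none` entries without sharing it is a small script along these lines. This is a minimal sketch, assuming the header stores `feature_names` and `feature_infos` as whitespace-separated lists on single `key=value` lines; the function names and the sample text are made up for illustration.

```python
def parse_header(model_text):
    """Collect key=value lines from the part of the file before the first tree."""
    header = {}
    for line in model_text.splitlines():
        if line.startswith("Tree="):
            break
        key, sep, value = line.partition("=")
        if sep:
            header[key.strip()] = value.strip()
    return header


def unused_features(model_text):
    """Return (index, name) pairs whose feature_infos entry is 'none'."""
    header = parse_header(model_text)
    # Assumption: both lists are whitespace-separated and aligned by position.
    names = header["feature_names"].split()
    infos = header["feature_infos"].split()
    if len(names) != len(infos):
        raise ValueError("feature_names and feature_infos differ in length")
    return [(i, name) for i, (name, info) in enumerate(zip(names, infos))
            if info == "none"]


# Hypothetical model fragment, not taken from a real file.
SAMPLE = """tree
version=v3
feature_names=age income city
feature_infos=[-999:65] none 0:1:2

Tree=0
num_leaves=2
split_feature=0
"""

print(unused_features(SAMPLE))
```

To run it on a real file, read the file contents into a string and pass that string to `unused_features`.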
Sorry, for data sensitivity reasons, I can't provide the full txt file, I can only tell you verbally what I'm having trouble with. I went to check the specifics within the txt file and there is content within the feature_infos, it is something like `feature_infos=[-999:65][-999:40]......`, and there is no `none` value in feature_infos. I don't know what else to check, but I need your help very much, thanks!
> Sorry, for data sensitivity reasons, I can't provide the full txt file,

Can you reproduce this NPE using some public (toy) dataset?
> I went to check the specifics within the txt file and there is content within the feature_infos, it is something like `feature_infos=[-999:65][-999:40]......`

The only way for this NPE to happen is that there are `null` elements inside the `Schema#getFeatures()` feature list. And the only way `null` elements can get in there is by having `none` elements in the `feature_infos` attribute.
Specifically, see this: https://github.com/jpmml/jpmml-lightgbm/blob/1.5.4/pmml-lightgbm/src/main/java/org/jpmml/lightgbm/GBDT.java#L205-L206
> I don't know what else to check

You have full access to the JPMML-LightGBM source code on GitHub. And it looks to me that you've already successfully downloaded and built a binary version of it (because you're using a `1.5-SNAPSHOT` snapshot version, not any of my pre-built point release versions).
Now, feel free to insert `System.out.println(...)` statements into it in order to pinpoint the location where a `null` element gets inserted into the `Schema#getFeatures()` list.
> I need your help very much, thanks!

I can't help you without a LightGBM model file.
You either reproduce the issue with a new dataset that can be shared, or you debug it locally using `System.out.println(...)` statements.
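The conflict described above (a feature marked `none` that a tree still splits on) can also be looked for directly in the model text, before touching the Java code. The following is a rough sketch, assuming each tree section starts with a `Tree=<n>` line and lists its split features as whitespace-separated indices on a `split_feature=` line, and that those indices point into the same list as `feature_infos`. The function name and the sample text are hypothetical.

```python
def none_feature_splits(model_text):
    """Return (tree_id, feature_index, feature_name) for splits on 'none' features."""
    names, infos = [], []
    conflicts = []
    tree_id = None
    for line in model_text.splitlines():
        key, sep, value = line.partition("=")
        if not sep:
            continue
        key = key.strip()
        if key == "feature_names":
            names = value.split()
        elif key == "feature_infos":
            infos = value.split()
        elif key == "Tree":
            tree_id = int(value)
        elif key == "split_feature" and tree_id is not None:
            # Assumption: indices refer to positions in feature_names/feature_infos.
            for index in map(int, value.split()):
                if index < len(infos) and infos[index] == "none":
                    name = names[index] if index < len(names) else "?"
                    conflicts.append((tree_id, index, name))
    return conflicts


# Hypothetical model fragment: the second tree splits on the 'none' feature.
SAMPLE = """feature_names=age income city
feature_infos=[-999:65] none [0:1]

Tree=0
num_leaves=3
split_feature=0 2

Tree=1
num_leaves=3
split_feature=1 0
"""

print(none_feature_splits(SAMPLE))
```

An empty result would suggest that the file itself is consistent and that the misalignment arises somewhere else.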
Thank you so much for your advice! I rechecked my txt file, and I really did find a 'none' element in feature_infos! Only one 'none' element in it. What can I do next, can I remove this 'none' element from feature_infos?
> I rechecked my txt file, and I really did find a 'none' element in feature_infos!

TOLD YOU SO!
> What can I do next, can I remove this 'none' element from feature_infos?

You must not delete it (because that would mess up the indexing of the `feature_names` and `feature_infos` attributes).
Instead, find out the name of this feature (`feature_names` and `feature_infos` are two lists with an equal number of elements), and then choose one action:
`none` element with a meaningful feature specification. Pay attention to the operational type (categorical vs continuous).
It is still interesting that a column whose LightGBM feature info is `none` is referenced by the LightGBM model during tree splitting. This should never happen.
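If the `none` entry is to receive a feature specification, the edit has to keep both lists aligned, so the entry is overwritten in place rather than removed. Below is a minimal sketch, assuming whitespace-separated lists on the `feature_names=` and `feature_infos=` lines. The function name, the sample text, and the replacement value `[0:100000]` are made up for illustration, and whether a hand-edited file is accepted downstream has not been verified here, so keep a backup of the original.

```python
def replace_feature_info(model_text, feature_name, new_info):
    """Overwrite the 'none' entry of one feature without shifting any index."""
    lines = model_text.split("\n")
    names_at = next(i for i, l in enumerate(lines) if l.startswith("feature_names="))
    infos_at = next(i for i, l in enumerate(lines) if l.startswith("feature_infos="))
    names = lines[names_at].split("=", 1)[1].split()
    infos = lines[infos_at].split("=", 1)[1].split()
    if len(names) != len(infos):
        raise ValueError("feature_names and feature_infos differ in length")
    position = names.index(feature_name)  # raises ValueError for unknown names
    if infos[position] != "none":
        raise ValueError(f"{feature_name} already has feature info {infos[position]}")
    infos[position] = new_info
    lines[infos_at] = "feature_infos=" + " ".join(infos)
    return "\n".join(lines)


# Hypothetical header; '[0:100000]' stands for a continuous value range.
SAMPLE = "feature_names=age income city\nfeature_infos=[-999:65] none 0:1:2\n"

print(replace_feature_info(SAMPLE, "income", "[0:100000]"))
```

The number of entries stays the same, which is the property that deleting the `none` element would break.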
If anyone can reproduce this issue (i.e. an NPE), and share a model, I'd be very much interested!
Decided to reopen this issue, because it is very unprofessional to have my library raise an NPE.
Will replace the `null` element with an `org.jpmml.lightgbm.NullFeature` object, which will then raise a proper, meaningful error when an attempt is made to use it for tree splitting.
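The idea of a placeholder object that fails with a descriptive message, instead of a bare `null`, can be illustrated roughly as follows. This is a Python sketch of the general pattern only; it is not the Java `NullFeature` class, and `Feature`, `split_predicate` and `build_schema` are hypothetical names.

```python
class Feature:
    """A feature that the model actually uses."""

    def __init__(self, name):
        self.name = name

    def split_predicate(self, threshold):
        return f"{self.name} <= {threshold}"


class NullFeature(Feature):
    """Stand-in for a feature whose feature info is 'none'."""

    def split_predicate(self, threshold):
        # Fail with the feature name instead of an anonymous null dereference.
        raise ValueError(
            f"Feature '{self.name}' is marked as unused (feature info 'none') "
            "but a tree attempts to split on it"
        )


def build_schema(names, infos):
    """Keep one schema entry per feature so that indices stay aligned."""
    return [NullFeature(n) if info == "none" else Feature(n)
            for n, info in zip(names, infos)]


schema = build_schema(["age", "income"], ["[-999:65]", "none"])
print(schema[0].split_predicate(30))
```

Calling `schema[1].split_predicate(...)` would raise a `ValueError` that names the offending feature, which is the kind of error message the placeholder is meant to provide.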
For some reason, this issue reminds me of https://github.com/jpmml/jpmml-lightgbm/issues/63
A `null` element should never be hit during tree splitting. There is something extra happening, which causes a "misalignment" of feature accesses.
Perhaps the `none` feature info represents a categorical feature? If so, the value of the `pandasCategoryIndex` variable could be off by one in JPMML-LightGBM versions that are older than 1.5.4.
I trained a LightGBM model and saved it as a .txt file. Then I used a command to convert this txt file to a PMML file, but it failed.
Command as follows:
ERROR INFO:
But in my txt model file, the feature is not NULL; the content is:
How to solve this error?