Closed solutionjh closed 1 month ago
Use Case
- I did not use a feature map when training the model, so the model.json file contains "feature_names":[],"feature_types":[].
- The model uses 40 features.
These are two XGBoost model object states:
- No feature_names and feature_types fields defined at all. In JPMML-XGBoost, this state is mapped to feature_names = null and feature_types = null. This state happens with XGBoost 1.0 -- 1.5, if I remember correctly.
- feature_names = [] and feature_types = [].
Debugging result
- Learner.java:355 only checks for null
- feature_names and feature_types are empty arrays, so they pass this check
- --fmap-input command-line option omitted
In case of incomplete embedded XGBoost model schema information, it is your responsibility to provide it externally, using the --fmap-input command-line option.
Alternatively, you may edit the XGBoost model file programmatically, and set the values of the feature_names and feature_types fields to a non-null/non-empty state. If I understand you correctly, then there are supposed to be 40 elements in each of them.
JPMML-XGBoost does not make any attempts to "guess" the model schema for you.
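A minimal sketch of the "edit the model file programmatically" route, in Python. It assumes the JSON model keeps both fields under a top-level learner object, that placeholder names f0..f39 are acceptable, and that every feature is of type "float"; all three are assumptions for illustration, so check them against your own model.json before relying on this.

```python
import json


def set_feature_schema(model, names, types):
    """Fill in learner.feature_names / learner.feature_types on a loaded model dict."""
    if len(names) != len(types):
        raise ValueError("names and types must have the same length")
    learner = model["learner"]  # assumed location of the two fields
    learner["feature_names"] = list(names)
    learner["feature_types"] = list(types)
    return model


# Hypothetical stand-in for: model = json.load(open("model.json"))
model = {"learner": {"feature_names": [], "feature_types": []}}

names = [f"f{i}" for i in range(40)]  # placeholder names, one per feature
types = ["float"] * 40                # assumed type for every feature

patched = set_feature_schema(model, names, types)
text = json.dumps(patched)            # write this back to a model file
```

The helper rejects lists of different lengths, because a mismatch between names and types would only move the failure to a later conversion step.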
@solutionjh Please elaborate: what do you expect the JPMML-XGBoost converter to do instead of throwing an IOOBE?
How do you know that there are supposed to be 40 features? Why is this information not included in the XGBoost model file, and why is it kept separately?
@vruusmann Thank you for your response.
I think Learner.java:355 needs an additional check such as this.feature_names.length == 0 || this.feature_types.length == 0, so that the user can supply the missing information with the --fmap-input option and an fmap file.
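The proposed check can be sketched as follows. This is an illustration of the "missing or empty" logic in Python, not the actual JPMML-XGBoost code; the function names and the fallback order are hypothetical.

```python
def has_embedded_schema(feature_names, feature_types):
    """True only if both lists are present (not None) and non-empty."""
    return bool(feature_names) and bool(feature_types)


def resolve_schema(feature_names, feature_types, fmap):
    """Prefer the embedded schema; fall back to a user-provided feature map."""
    if has_embedded_schema(feature_names, feature_types):
        return list(zip(feature_names, feature_types))
    if fmap:
        return list(fmap)
    raise ValueError(
        "No feature information: embedded fields are missing or empty "
        "and no feature map was provided"
    )
```

With this logic, None and [] are handled identically, and the case with no usable information fails with an explicit message instead of an index error.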
In my case, I deleted feature_names=[] and feature_types=[] from the model JSON, and used the --fmap-input option to supply the information.
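A rough Python sketch of this workaround. It assumes the two fields sit under a top-level learner object in the model JSON, and that the feature map uses the usual XGBoost text layout of one tab-separated index, name and type code per line, with "q" standing for a quantitative feature. The helper names and the placeholder feature names are hypothetical.

```python
def drop_empty_schema(model):
    """Remove feature_names / feature_types from the learner if they are empty."""
    learner = model["learner"]  # assumed location of the two fields
    for key in ("feature_names", "feature_types"):
        if key in learner and not learner[key]:
            del learner[key]
    return model


def fmap_lines(names, type_code="q"):
    """Build feature map lines: <index> TAB <name> TAB <type code>."""
    return [f"{i}\t{name}\t{type_code}" for i, name in enumerate(names)]


# Hypothetical stand-in for a model loaded with json.load(...)
model = drop_empty_schema({"learner": {"feature_names": [], "feature_types": []}})

# Placeholder names for the 40 features; join with "\n" and save as the fmap file
lines = fmap_lines([f"f{i}" for i in range(40)])
```

The resulting text file is what would be passed through the --fmap-input option, while the cleaned model dict is written back with json.dump.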
Thanks for a great bridge module from Python ML to Java applications!
then user can use --fmap-input option using fmap file for update information.
Updated the title of this issue accordingly - the problem is that the --fmap-input option does not have any effect (when feature_names = [] and feature_types = [])?
Perhaps there should be an additional command-line flag for stating "ignore the embedded FMap, only use the user-provided FMap".
Updated the title of this issue accordingly - the problem is that the --fmap-input does not have any effect (when feature_names = [] and feature_types = [])?
Sure, XGBoost works well when feature_names and feature_types are empty.
An additional flag or an empty check would give a good user experience.
Moreover, it would be good to show the user a message when feature_names and feature_types are empty.
Have a nice time~~
jpmml-xgboost version: 1.8.5
XGBoost Version and Model
Use Case
"feature_names":[],"feature_types":[]
Error Stack Trace
Debugging result
--fmap-input command-line option omitted