Closed — makquel closed this issue 2 years ago
Hi @makquel, thank you for your interest in LightGBM. The features are saved in the same order they were used for training, i.e.:
import lightgbm as lgb
import numpy as np

X = np.random.rand(100, 3)
y = np.random.rand(100)

# names are applied positionally: 'x2' labels column 0, 'x0' column 1, 'x1' column 2
ds = lgb.Dataset(X, y, feature_name=['x2', 'x0', 'x1'])
bst = lgb.train({'num_leaves': 3, 'verbose': -1}, ds, num_boost_round=1)
print(bst.dump_model()['feature_names'])
# ['x2', 'x0', 'x1']
Are you able to provide a minimal reproducible example where this isn't the case?
This issue has been automatically closed because it has been awaiting a response for too long. When you have time to work with the maintainers to resolve this issue, please post a new comment and it will be re-opened. If the issue has been locked for editing by the time you return to it, please open a new issue and reference this one. Thank you for taking the time to improve LightGBM!
This issue has been automatically locked since there has not been any recent activity since it was closed. To start a new related discussion, open a new issue at https://github.com/microsoft/LightGBM/issues including a reference to this.
Summary
When dumping a native LightGBM model (command below), the resulting model object labels the columns with generic ordinal names instead of the feature names used for training.
The JSON produced by dumping the model looks like this:
It would be very useful to have a native method to override the feature names so that they match exactly the ones used to train the model.
Motivation
Exporting the native model in a PMML-like format could facilitate integration with other platforms, and would also help those who implement a custom
.predict
function.