Closed Currie32 closed 5 months ago
Metarank's export format depends on the model used (so it differs between XGBoost and LightGBM), and in practice it tries to match each library's quirks. The SVM format is used for XGBoost, and the issue is that the SVM file format assumes that a feature is identified by a numeric index, not a name: https://stats.stackexchange.com/questions/61328/libsvm-data-format
If we change the format, then XGBoost won't be able to load the exported data without an extra transformation.
But if you switch from XGBoost to LightGBM, then the format will switch to CSV, which includes column names.
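To illustrate the point about indices, here is a minimal Python sketch of reading one LibSVM-style row and attaching names to it afterwards. This is not Metarank's code: the sample row, the `qid` token, the feature names, and the JSON mapping are all made up for the example, and the helpers `parse_libsvm_line` and `name_features` are hypothetical.

```python
import json


def parse_libsvm_line(line):
    """Split one LibSVM-style row into (label, qid, {index: value}).

    The format only carries integer feature indices, so names have to
    come from somewhere else.
    """
    tokens = line.split()
    label = float(tokens[0])
    qid = None
    features = {}
    for token in tokens[1:]:
        key, _, value = token.partition(":")
        if key == "qid":
            # Query/group id, as used by ranking-style LibSVM files (assumed here).
            qid = value
        else:
            features[int(key)] = float(value)
    return label, qid, features


def name_features(features, index_to_name):
    """Replace indices with names, falling back to 'f<index>' when unknown."""
    return {index_to_name.get(i, f"f{i}"): v for i, v in features.items()}


# Hypothetical mapping file content: index (as a JSON string key) -> feature name.
mapping = json.loads('{"0": "popularity", "1": "price"}')
index_to_name = {int(k): v for k, v in mapping.items()}

# Made-up sample row, not taken from a real export.
label, qid, feats = parse_libsvm_line("1 qid:10 0:0.5 1:12.0")
named = name_features(feats, index_to_name)
print(label, qid, named)
```

The upside of keeping the names in a side file is that the exported data stays loadable by XGBoost as-is, while a reader can still translate indices back to names.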
I'm hoping that you can add a mapping to the feature names when I use the dataset export command.
When I created a new model, I could map the feature indices to the feature names since the order was the same as in the config file. For example, the first row of train.svm looks like:
However, when I retrained this model and changed the features in the model and the training data, the train.svm file looked more like:
Given that the index of a feature no longer corresponds to the feature's position in the config file, I'm finding it difficult to identify each feature.
Ideally, train.svm would look like:
But I'd also be happy with something like a JSON file that has the mapping: