Add feature names to the dataset export command

metarank / metarank

A low code Machine Learning personalized ranking service for articles, listings, search results, recommendations that boosts user engagement. A friendly Learn-to-Rank engine

Apache License 2.0


I'm hoping that the dataset export command can also output a mapping from feature indices to feature names.

When I created a new model, I could map the feature indices to the feature names since the order was the same as in the config file. For example, the first row of train.svm looks like:

0 qid:123456 1:1.0 2:41.0 3:3.0 4:61.0 5:1.0

However, when I retrained this model and changed the features in the model and the training data, the train.svm file looked more like:

0 qid:123456 1:1.0 2:41.0 6:22.0 7:1.0 9:21.0

Given that the index of a feature no longer corresponds to the feature's position in the config file, I'm finding it difficult to identify each feature.
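
To make the mismatch concrete, here is a minimal Python sketch that extracts the feature indices from the two rows above. It is not part of Metarank; the helper name `feature_indices` is made up for illustration, and it assumes the whitespace-separated `index:value` layout shown in the examples.

```python
def feature_indices(row: str) -> list[int]:
    """Return the feature indices found in one LibSVM-style row."""
    indices = []
    for token in row.split()[1:]:  # skip the leading label
        key, _, _value = token.partition(":")
        if key == "qid":  # the query id is not a feature
            continue
        indices.append(int(key))
    return indices


before = "0 qid:123456 1:1.0 2:41.0 3:3.0 4:61.0 5:1.0"
after = "0 qid:123456 1:1.0 2:41.0 6:22.0 7:1.0 9:21.0"

print(feature_indices(before))  # [1, 2, 3, 4, 5]
print(feature_indices(after))   # [1, 2, 6, 7, 9]
```

The first row uses contiguous indices, so counting down the config file is enough to name them. The second row skips indices, so that positional lookup no longer works.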

Ideally, train.svm would look like:

0 qid:123456 feature_a:1.0 feature_b:41.0 feature_c:22.0 feature_d:1.0 feature_e:21.0

But I'd also be happy with something like a json file that has the mapping:

{
  "1": "feature_a",
  "2": "feature_b",
  "6": "feature_c",
  "7": "feature_d",
  "9": "feature_e"
}
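
If such a mapping file existed, turning an exported row into the named form would be straightforward. The following is a rough Python sketch under that assumption: the mapping content is the hypothetical example from this issue, and `name_features` is an illustrative helper, not a Metarank function.

```python
import json

# Hypothetical mapping, as it might appear in an exported JSON file.
MAPPING_JSON = """
{"1": "feature_a", "2": "feature_b", "6": "feature_c",
 "7": "feature_d", "9": "feature_e"}
"""


def name_features(row: str, mapping: dict[str, str]) -> str:
    """Replace numeric feature indices in a row with their names."""
    label, *tokens = row.split()
    out = [label]
    for token in tokens:
        key, sep, value = token.partition(":")
        # "qid" and any index missing from the mapping pass through unchanged.
        out.append(f"{mapping.get(key, key)}{sep}{value}")
    return " ".join(out)


mapping = json.loads(MAPPING_JSON)
row = "0 qid:123456 1:1.0 2:41.0 6:22.0 7:1.0 9:21.0"
print(name_features(row, mapping))
# 0 qid:123456 feature_a:1.0 feature_b:41.0 feature_c:22.0 feature_d:1.0 feature_e:21.0
```

Note that LibSVM readers generally expect integer indices, so a row rewritten this way is useful for inspection rather than as training input. That is one argument for the separate mapping file over changing train.svm itself.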

Add feature names to the dataset export command #1312