qe-team / marmot

MARMOT - the open source framework for feature extraction and machine learning, designed to estimate the quality of Machine Translation output
ISC License

how to add feature names for extractors which return a variable number of features? #9

Open chrishokamp opened 9 years ago

chrishokamp commented 9 years ago

Some feature extractors return a variable number of features depending on how they are initialized. An example would be a vocabulary feature extractor which returns a binary vector of size |V|, where |V| is the size of the vocabulary. The number of features in this vector therefore depends on the vocabulary the extractor is initialized with.

For these cases, get_feature_names should generate the feature names dynamically. Alternatively (and probably preferably), the names could be generated once at initialization time, to avoid the overhead of recomputing them every time the extractor is called.
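A rough sketch of the initialization-time option, in case it helps the discussion. The class name, the `vocab_` name prefix, and the single-token `get_features` argument are all assumptions made for illustration; only `get_features` and `get_feature_names` come from this issue, and this is not MARMOT's actual extractor interface.

```python
class VocabularyFeatureExtractor:
    """Hypothetical extractor: one binary feature per vocabulary word."""

    def __init__(self, vocabulary):
        # Sort and deduplicate so that the feature order is deterministic.
        self.vocabulary = sorted(set(vocabulary))
        self.index = {word: i for i, word in enumerate(self.vocabulary)}
        # The names are computed once here, not on every call.
        self.feature_names = ["vocab_" + word for word in self.vocabulary]

    def get_features(self, token):
        # Simplified signature (assumption): takes a single token string.
        vector = [0] * len(self.vocabulary)
        if token in self.index:
            vector[self.index[token]] = 1
        return vector

    def get_feature_names(self):
        # Always has the same length as the vector from get_features.
        return self.feature_names


extractor = VocabularyFeatureExtractor(["the", "cat", "dog"])
print(extractor.get_feature_names())  # ['vocab_cat', 'vocab_dog', 'vocab_the']
print(extractor.get_features("dog"))  # [0, 1, 0]
```

Because the names and the index are built from the same sorted vocabulary, the i-th name always describes the i-th position of the vector.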

varvara-l commented 9 years ago

There should be a collection of feature names that is filled during initialisation, if possible, or otherwise in the get_features function. If there is no way to initialise it beforehand (because the number of features depends on the parameters passed to get_features), get_feature_names should raise an error if it is called before get_features, i.e. if the collection of names hasn't been initialised yet.
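A minimal sketch of the second case, where the names can only be filled inside get_features. Everything apart from the two method names is hypothetical: the class, the `window` parameter, the `<pad>` placeholder, the name format, and the choice of `RuntimeError` are assumptions, not existing MARMOT code.

```python
class ContextWindowFeatureExtractor:
    """Hypothetical extractor whose feature count depends on a get_features argument."""

    def __init__(self):
        # Unknown until get_features has been called at least once.
        self.feature_names = None

    def get_features(self, tokens, position, window):
        features = []
        names = []
        for offset in range(-window, window + 1):
            if offset == 0:
                continue  # skip the target token itself
            i = position + offset
            features.append(tokens[i] if 0 <= i < len(tokens) else "<pad>")
            names.append("token_at_offset_%+d" % offset)
        # Fill the collection of names as a side effect of extraction.
        self.feature_names = names
        return features

    def get_feature_names(self):
        if self.feature_names is None:
            raise RuntimeError("get_feature_names called before get_features")
        return self.feature_names


extractor = ContextWindowFeatureExtractor()
try:
    extractor.get_feature_names()
except RuntimeError as error:
    print("error:", error)
print(extractor.get_features(["the", "cat", "sat"], 1, 1))  # ['the', 'sat']
print(extractor.get_feature_names())  # ['token_at_offset_-1', 'token_at_offset_+1']
```

One caveat with this design: the names reflect the most recent get_features call, so calling it with a different `window` changes what get_feature_names returns.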