dkpro / dkpro-tc

UIMA-based text classification framework built on top of DKPro Core and DKPro Lab.
https://dkpro.github.io/dkpro-tc/

Introduce managed FeatureSet for speeding up processing #501

Closed Horsmann closed 6 years ago

Horsmann commented 6 years ago

The sparse feature mode could be sped up a bit by making the feature set smarter: it would internally distinguish the features that are actually set to a non-default value. This would shorten processing time somewhat for larger datasets. The change modifies the FeatureExtractor signature, which no longer uses a Set<Feature>.
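To illustrate the idea, here is a minimal sketch of such a managed feature set. It is not the code from the actual commit: the class names SimpleFeature and ManagedFeatureSet, the isDefault flag, and all method names are hypothetical and chosen only for this example. The sketch assumes each extractor can tell on insertion whether a value equals its default, so a sparse writer can later iterate over the non-default features alone instead of filtering the full set.

```java
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical stand-in for a feature; NOT the DKPro TC Feature class.
final class SimpleFeature {
    final String name;
    final Object value;
    final boolean isDefault; // assumption: the extractor knows its own default value

    SimpleFeature(String name, Object value, boolean isDefault) {
        this.name = name;
        this.value = value;
        this.isDefault = isDefault;
    }
}

// Hypothetical managed set: features are keyed by name, and the non-default
// ones are tracked separately at insertion time.
final class ManagedFeatureSet {
    private final Map<String, SimpleFeature> all = new LinkedHashMap<>();
    private final Map<String, SimpleFeature> nonDefault = new LinkedHashMap<>();

    // Adds or replaces a feature and keeps the non-default view in sync.
    void add(SimpleFeature f) {
        all.put(f.name, f);
        if (f.isDefault) {
            nonDefault.remove(f.name);
        } else {
            nonDefault.put(f.name, f);
        }
    }

    // Every feature, as a dense writer would need it.
    Collection<SimpleFeature> allFeatures() {
        return Collections.unmodifiableCollection(all.values());
    }

    // Only the features a sparse writer has to emit.
    Collection<SimpleFeature> nonDefaultFeatures() {
        return Collections.unmodifiableCollection(nonDefault.values());
    }

    int size() {
        return all.size();
    }
}
```

The trade-off is the one raised in this issue: extractors would have to return this container instead of a plain Set<Feature>, which changes the interface every extractor implements.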

Horsmann commented 6 years ago

The changes can be bulk reverted with git revert ba3de18e85d35c. It still needs testing whether this is really faster in larger settings. Changing the feature extractor interface is a serious drawback of this change, and it is unclear whether processing really gets faster in practice.

Horsmann commented 6 years ago

The speed evaluation showed that the gain does not justify changing the feature extractor signature. Closed.