Challenge: ORB and BRISK are binary features, and it is unclear how to cluster them in order to construct a bag of words.
(I tried k-median, which does not work well: in the end, the differences between the words and their member features are quite large...)
Question: Why do we want a bag of words? (With the Hamming distance, matching is so fast that words do not seem necessary.)
Answer: A bag of words has the advantage over plain features that we can generalize and therefore learn useful statistics (e.g. tf, idf).
Idea: Use word descriptors that are twice as long as the feature descriptors. Then each bit of the feature descriptor is represented by 2 bits in the word descriptor (00: 0%, 01: 33%, 10: 67%, 11: 100%).
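The 2-bit encoding can be sketched as follows. This is only an illustration under assumptions that the note does not spell out: the percentages are read as the fraction of a word's member features that have the bit set, that fraction is rounded to the nearest of the four levels, and the names `encode_word_levels` and `pack_levels` are hypothetical helpers, not part of ScaViSLAM.

```python
import numpy as np

def encode_word_levels(members):
    """Quantize per-bit frequencies of a cluster to levels 0..3.

    members: (n, d) array of 0/1 values, one row per member feature.
    Level k stands for roughly k/3 of the members having that bit set
    (0: 0%, 1: 33%, 2: 67%, 3: 100%). Assumption: nearest-level rounding.
    """
    members = np.asarray(members, dtype=np.float64)
    frac_ones = members.mean(axis=0)            # fraction of ones per bit
    return np.rint(frac_ones * 3).astype(np.uint8)

def pack_levels(levels):
    """Expand each level into its two bits (high bit first).

    The result has 2*d entries, i.e. it is twice as long as a feature
    descriptor with d bits.
    """
    levels = np.asarray(levels, dtype=np.uint8)
    return np.stack([levels >> 1, levels & 1], axis=1).reshape(-1)

# Three 4-bit member features of one hypothetical word.
members = [[1, 0, 1, 0],
           [1, 0, 0, 0],
           [1, 0, 1, 1]]
levels = encode_word_levels(members)
word_bits = pack_levels(levels)
```

Here the per-bit fractions are 1, 0, 2/3 and 1/3, so the levels come out as 3, 0, 2, 1 and the word descriptor has 8 bits for a 4-bit feature.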
A distance between a feature descriptor and such a word descriptor can be defined easily: https://github.com/strasdat/ScaViSLAM/wiki/Distance-for-Brief-type-features
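One plausible distance, sketched here as an assumption and not necessarily the definition given on the linked page, treats a feature bit as level 0 or 3 and sums the absolute level differences to the word. The function name `word_feature_distance` is hypothetical.

```python
import numpy as np

def word_feature_distance(feature_bits, word_levels):
    """Sum of per-bit level differences, measured in thirds.

    feature_bits: d values in {0, 1} (a binary descriptor, unpacked).
    word_levels:  d values in {0, 1, 2, 3} (the 2-bit codes of a word,
                  where 0 means 0% and 3 means 100%).
    Assumption: an L1 distance on levels; a feature bit b maps to level 3*b.
    """
    f = np.asarray(feature_bits, dtype=np.int64) * 3
    w = np.asarray(word_levels, dtype=np.int64)
    return int(np.abs(f - w).sum())

feature = [1, 0, 1, 0]
close_word = [3, 0, 2, 1]   # agrees with the feature on most bits
far_word = [0, 3, 0, 3]     # disagrees on every bit

d_close = word_feature_distance(feature, close_word)
d_far = word_feature_distance(feature, far_word)
```

When every word level is 0 or 3, this reduces to three times the ordinary Hamming distance, so plain binary descriptors remain a special case.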