patrickfrey / strus

Library implementing the storage and the query evaluation for a text search engine. It uses a key-value store database interface to store its data. Currently there is an implementation based on the Google LevelDB library.
http://www.project-strus.net
Mozilla Public License 2.0

how to add markers in forward index #59

Closed andreasbaumann closed 7 years ago

andreasbaumann commented 8 years ago

Having spans of meta features with a start and an end (a sequence of tokens), it would be nice if the forward index could store:

word begin_marker word word sign word end_marker word

Now the problem is the token positions, because some tokens (begin_marker, end_marker) should have the same position as the first and the last word of the span.
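
To make the requested position assignment concrete, here is a minimal sketch in plain Python. It does not use the strus analyzer API; the function name `assign_positions` is hypothetical, and it assumes that every non-marker token (including `sign`) advances the position counter by one and that positions start at 1.

```python
from typing import List, Tuple


def assign_positions(tokens: List[str]) -> List[Tuple[int, str]]:
    """Pair each token with a position (hypothetical helper, not strus API).

    Ordinary tokens get consecutive positions starting at 1.
    begin_marker borrows the position of the word that follows it,
    end_marker borrows the position of the word that precedes it.
    """
    result: List[Tuple[int, str]] = []
    pos = 0  # position of the last ordinary token seen so far
    for tok in tokens:
        if tok == "begin_marker":
            # Same position as the first word of the span (the next token).
            result.append((pos + 1, tok))
        elif tok == "end_marker":
            # Same position as the last word of the span (the previous token).
            result.append((pos, tok))
        else:
            pos += 1
            result.append((pos, tok))
    return result


tokens = "word begin_marker word word sign word end_marker word".split()
positioned = assign_positions(tokens)
for position, token in positioned:
    print(position, token)
```

With this policy the markers do not consume positions of their own, so the distance between the words before and after the span stays the same as in the unmarked text.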

andreasbaumann commented 8 years ago

Is this possible in the forward index? Can I configure this in the analyzer (the position assignment policy)?

andreasbaumann commented 8 years ago

An alternative representation of the problem is:

word marker word

We can replace the sequence of tokens with the meta feature marker and encode the length of the span in tokens into the feature value, so we have a chance to highlight the span. The search index would assume we only search for the meta feature (and we recognize it somehow in the query); we cannot search for the meta feature AND, in parallel, for tokens within the meta feature.
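
A rough sketch of this alternative in plain Python, again independent of the strus API: the span is collapsed into one `marker` entry whose value holds the span length in tokens. The names `collapse_spans` and `highlight_range` and the `(position, type, value)` tuple layout are assumptions made for illustration, and the input is assumed to contain only well-formed, non-nested begin/end pairs.

```python
from typing import List, Optional, Tuple

Entry = Tuple[int, str, str]  # (position, feature type, feature value)


def collapse_spans(tokens: List[str]) -> List[Entry]:
    """Replace each marked span by one 'marker' entry carrying its length."""
    out: List[Entry] = []
    pos = 0  # position of the last ordinary token seen so far
    span_start: Optional[int] = None
    for tok in tokens:
        if tok == "begin_marker":
            span_start = pos + 1  # the span starts at the next token
        elif tok == "end_marker":
            length = pos - span_start + 1  # number of tokens inside the span
            out.append((span_start, "marker", str(length)))
            span_start = None
        else:
            pos += 1
            if span_start is None:
                # Tokens outside a span are stored as usual.
                out.append((pos, "word", tok))
            # Tokens inside a span are dropped; only the length survives.
    return out


def highlight_range(entry: Entry) -> range:
    """Positions covered by a 'marker' entry, e.g. for highlighting."""
    start, _, value = entry
    return range(start, start + int(value))


tokens = "word begin_marker word word sign word end_marker word".split()
entries = collapse_spans(tokens)
print(entries)
print(list(highlight_range(entries[1])))
```

The sketch also shows the trade-off described above: the tokens inside the span are no longer stored, so they cannot be matched on their own alongside the meta feature.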

patrickfrey commented 7 years ago

I do not see the point of having structures built into the forward index. Structures such as spans are defined by elements in the search index or as a length attribute of features. The forward index is just there to pick elements. It would be far too inefficient to model structures in the forward index.