Library for document analysis (segmentation, tokenization, normalization, aggregation) whose goal is to produce a set of items that can be inserted into a strus storage. It also provides some functions for analyzing tokens or phrases of a strus query.
If you want to mark up a document with matching patterns, you have to either declare the patterns as exclusive (%MATCHER exclusive) or rely on matches of lower priority being ousted correctly by matches of higher priority. The latter mechanism, as currently implemented, does not work: overlapping matches in the content are not marked up correctly, and the elimination of lower-level markup in areas covered by higher-level areas does not work either.
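To make the intended behavior concrete, here is a minimal sketch of what ousting lower-priority matches could look like. It is not strus code and does not use any strus API: the `Match` type, the `oust_lower_priority` function and the convention that a larger `priority` value wins are all assumptions made for illustration.

```python
from typing import List, NamedTuple


class Match(NamedTuple):
    """Hypothetical pattern match; not a strus data structure."""
    start: int      # inclusive start offset in the content
    end: int        # exclusive end offset in the content
    priority: int   # assumption: a larger value means higher priority
    name: str       # illustrative markup tag name


def overlaps(a: Match, b: Match) -> bool:
    """True if the two half-open ranges share at least one position."""
    return a.start < b.end and b.start < a.end


def oust_lower_priority(matches: List[Match]) -> List[Match]:
    """Keep a match only if no already kept match overlaps it.

    Matches are visited from highest to lowest priority, so a match is
    dropped whenever a kept match of higher priority covers part of it.
    Among overlapping matches of equal priority, the one that starts
    first (and, on a tie, the longer one) is kept.
    """
    kept: List[Match] = []
    order = sorted(
        matches,
        key=lambda m: (-m.priority, m.start, -(m.end - m.start)),
    )
    for m in order:
        if not any(overlaps(m, k) for k in kept):
            kept.append(m)
    # Return the survivors in document order, ready for markup.
    return sorted(kept, key=lambda m: m.start)


matches = [
    Match(0, 10, 1, "phrase"),
    Match(3, 6, 2, "name"),
    Match(12, 15, 1, "date"),
]
result = oust_lower_priority(matches)
```

In this example the low-priority `phrase` match is dropped because the higher-priority `name` match overlaps it, while `date` is kept because nothing overlaps it. Note that in this sketch a match is only ousted by matches that were themselves kept; whether that corresponds to the behavior the library intends is an open assumption.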