Library for document analysis (segmentation, tokenization, normalization, aggregation), with the goal of producing a set of items that can be inserted into a strus storage. Some functions for analysing tokens or phrases of a strus query are also provided.
This would be cool because I can then set something like:
docid: bigxmlfile.xml/17
with bigxmlfile.xml containing: