KorAP / KorAP-XML-Krill

Merge KorapXML data and create Krill documents
BSD 2-Clause "Simplified" License

Enumeration of sentences/paragraphs in base #2

Open Akron opened 8 years ago

Akron commented 8 years ago

To mimic the extension behaviour of Rabbid (see screenshot), Krill needs to enumerate snippets, so that the next or the previous snippet of a matching snippet can be retrieved via the API. This is not a problem with the current default settings of Kalamar, where snippets are retrieved as token ranges. When snippets are bounded by sentences or paragraphs, however, the snippet elements need to be enumerated separately. This is possible by adding attributes to the sentence and paragraph spans in base. It does require, though, that the snippet elements leave no gaps.

[Screenshot: extension]
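
A minimal Python sketch of the idea, not the Krill or KorAP-XML-Krill implementation: sentence spans are sorted by token position, each one receives a running number, and the neighbours of a matching snippet are looked up through that number. The `Span` class, the attribute name `n`, and both function names are hypothetical and chosen only for illustration. Spans are assumed to be half-open token ranges.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Span:
    start: int                # first token position (inclusive)
    end: int                  # token position after the last token (exclusive)
    n: Optional[int] = None   # hypothetical enumeration attribute


def enumerate_spans(spans: List[Span]) -> List[Span]:
    """Sort spans by position and number them from 0.

    Raises ValueError if two consecutive spans do not meet exactly,
    i.e. if there is a gap (or an overlap) between them.
    """
    ordered = sorted(spans, key=lambda s: (s.start, s.end))
    for i, span in enumerate(ordered):
        if i > 0 and ordered[i - 1].end != span.start:
            raise ValueError(
                f"span {i} starts at {span.start}, "
                f"but span {i - 1} ends at {ordered[i - 1].end}"
            )
        span.n = i
    return ordered


def neighbours(ordered: List[Span], n: int) -> Tuple[Optional[Span], Optional[Span]]:
    """Return the (previous, next) span for the span numbered n."""
    prev_span = ordered[n - 1] if n > 0 else None
    next_span = ordered[n + 1] if n + 1 < len(ordered) else None
    return prev_span, next_span
```

With the numbers stored as span attributes, extending a match by one sentence in either direction becomes a lookup of `n - 1` or `n + 1`. That lookup is only meaningful when the spans are contiguous, which is why the sketch rejects gaps.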

Akron commented 7 years ago

In addition, this requires that boundary elements are not nested. Interestingly, these requirements may be identical to those for elements in distance spans (distances mimicking the behaviour of C2 distances).
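
The nesting requirement could be checked along these lines. This is again only a sketch in Python with a hypothetical function name, assuming boundary elements are given as half-open `(start, end)` token ranges. It reports elements that lie completely inside an earlier element. Exact duplicates are reported as nested, and partial overlaps are not reported.

```python
from typing import List, Tuple

TokenRange = Tuple[int, int]  # (start inclusive, end exclusive)


def find_nested(spans: List[TokenRange]) -> List[Tuple[TokenRange, TokenRange]]:
    """Return (outer, inner) pairs where inner lies completely within outer."""
    # Sort by start ascending and end descending, so that an enclosing
    # span always comes before the spans it contains.
    ordered = sorted(spans, key=lambda s: (s[0], -s[1]))
    nested = []
    outer = None  # span with the largest end position seen so far
    for span in ordered:
        if outer is not None and span[1] <= outer[1]:
            # span starts at or after outer (sort order) and ends
            # at or before it, so it is nested in outer
            nested.append((outer, span))
        else:
            outer = span
    return nested
```

An empty result means the boundary elements can be numbered linearly, for example `find_nested([(0, 5), (5, 9)])` returns an empty list, while a paragraph-like span `(0, 10)` enclosing `(2, 5)` is reported as one pair.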