GateNLP / python-gatenlp

Python text processing, pattern matching, and NLP framework
https://gatenlp.github.io/python-gatenlp/
Apache License 2.0

Add util func: virtual text based on ann features #18

Open johann-petrak opened 4 years ago

johann-petrak commented 4 years ago

Like the StringAnnotation plugin, but more flexible.

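It should be able to produce one or more texts, split at splitting annotations; optionally insert separator characters between pieces; insert text conditionally; and insert text computed by a lambda.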

johann-petrak commented 4 years ago

This should return the text and the offset mapping. It should be possible to add placeholder text for specific annotation types, so instead of retrieving the text from the feature or underlying document, just add some constant text (e.g. a single space) if the annotation is encountered.
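A minimal sketch of that idea, under stated assumptions: `Ann` is a hypothetical stand-in for an annotation (not the gatenlp class), and `virtual_text`, `feature` and `placeholders` are illustrative names, not an existing API. Characters taken from the document map one-to-one; characters from a feature or a placeholder all map to the start of their annotation.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass
class Ann:
    """Hypothetical stand-in for an annotation (not the gatenlp class)."""
    start: int
    end: int
    type: str
    features: Dict[str, str] = field(default_factory=dict)


def virtual_text(
    doc_text: str,
    anns: List[Ann],
    feature: Optional[str] = None,
    placeholders: Optional[Dict[str, str]] = None,
) -> Tuple[str, List[int]]:
    """Build a virtual text from annotations, plus one document offset per character."""
    placeholders = placeholders or {}
    parts: List[str] = []
    offsets: List[int] = []
    for ann in sorted(anns, key=lambda a: (a.start, a.end)):
        if ann.type in placeholders:
            # Constant text: every inserted character maps to the annotation start.
            piece = placeholders[ann.type]
            piece_offsets = [ann.start] * len(piece)
        elif feature is not None and feature in ann.features:
            # Feature text may differ in length from the covered span,
            # so all of its characters map to the annotation start.
            piece = str(ann.features[feature])
            piece_offsets = [ann.start] * len(piece)
        else:
            # Fall back to the underlying document text: one-to-one offsets.
            piece = doc_text[ann.start:ann.end]
            piece_offsets = list(range(ann.start, ann.start + len(piece)))
        parts.append(piece)
        offsets.extend(piece_offsets)
    return "".join(parts), offsets


doc_text = "Cats  were running"
anns = [
    Ann(0, 4, "Token", {"lemma": "cat"}),
    Ann(4, 6, "Space"),
    Ann(6, 10, "Token"),
    Ann(10, 11, "Space"),
    Ann(11, 18, "Token", {"lemma": "run"}),
]
text, offsetmap = virtual_text(doc_text, anns, feature="lemma", placeholders={"Space": " "})
print(text)       # cat were run
print(offsetmap)  # [0, 0, 0, 4, 6, 7, 8, 9, 10, 11, 11, 11]
```

Here the two-character `Space` annotation collapses to a single space, and tokens without a `lemma` feature fall back to the document text.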

johann-petrak commented 3 years ago

Possible signature: text, offsetmap = Document.virtualtext(...) with parameters:

The offset map is just an array[int] with the same length as the returned text, containing the document offset for each text offset.
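To illustrate how such an offset map could be used, here is a small sketch that maps a regex match in the virtual text back to document offsets. The helper name `to_document_span` is hypothetical, and the example assumes the virtual text was built by collapsing a run of three spaces into one.

```python
import re
from typing import List, Tuple


def to_document_span(offsetmap: List[int], start: int, end: int) -> Tuple[int, int]:
    """Map a half-open span [start, end) in the virtual text to document offsets."""
    if not 0 <= start < end <= len(offsetmap):
        raise ValueError("span must be non-empty and inside the virtual text")
    # The end is exclusive: take the offset of the last covered character, plus one.
    return offsetmap[start], offsetmap[end - 1] + 1


doc_text = "cat   sat"                 # three spaces between the words
text = "cat sat"                       # virtual text with a single space
offsetmap = [0, 1, 2, 3, 6, 7, 8]      # one document offset per virtual character

match = re.search(r"sat", text)
doc_start, doc_end = to_document_span(offsetmap, match.start(), match.end())
print(doc_start, doc_end)              # 6 9
print(doc_text[doc_start:doc_end])     # sat
```

When a piece of the virtual text comes from a feature or a placeholder rather than from the document, several virtual characters share one document offset, so the mapped end offset is only an approximation for those pieces.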