Closed margaretha closed 3 months ago
I checked and there are actually two limits: one is character-based, one is token-based. The match has a token-based limit, which makes sense for annotation data retrieval, so we may want to have maxMatchTokenSize. And we have context limits, which are character-based, which may make sense as well. So for this, possibly maxContextCharSize?
Thanks for checking, Nils! Does maxMatchTokenSize not include the contexts, just the matches themselves, whereas maxContextCharSize includes match and context altogether?
No, maxMatchTokenSize doesn't include the context, and maxContextCharSize is a maximum value for the left and right context. There is no maximum snippet size, as we allow matches to be cut while the context can still be viewed. I can't remember the concrete reason; it may have just been simpler to implement. But it also has some advantages.
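To make the distinction concrete, here is a minimal sketch of how the two limits could act independently on one snippet. It is not taken from the actual implementation: the `Limits` class, the `build_snippet` function, and the whitespace tokenization are all hypothetical, and it assumes that the character limit applies to each context side separately and that the right context starts after the original (uncut) match.

```python
from dataclasses import dataclass


@dataclass
class Limits:
    # Hypothetical names mirroring the proposed parameters.
    max_match_token_size: int   # token-based, applies to the match only
    max_context_char_size: int  # character-based, applies to each context side


def build_snippet(tokens, start, end, limits):
    """Return (left_context, match, right_context) for the match tokens[start:end]."""
    # Token-based limit: the match itself is cut after N tokens.
    match_end = min(end, start + limits.max_match_token_size)
    match = " ".join(tokens[start:match_end])

    # Character-based limit: each context side is trimmed to N characters.
    # Assumption: the right context begins after the original match end.
    n = limits.max_context_char_size
    left = " ".join(tokens[:start])
    right = " ".join(tokens[end:])
    left = left[-n:] if n > 0 else ""
    right = right[:n]
    return left, match, right


tokens = ["the", "quick", "brown", "fox", "jumps", "over", "the", "lazy", "dog"]
print(build_snippet(tokens, 2, 6, Limits(2, 5)))
```

In this sketch the match is cut to two tokens while both contexts remain visible, which reflects the point that there is no single limit on the whole snippet.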
Krawfish will have some difficulties with character sizes in contexts, but I think we can keep this feature.
So - do you want to have the context adjustable only?
Thanks for the clarification! We have discussed this over Slack and agree to allow adjustment for both variables.
Please add a new parameter to allow Kustvakt to change the snippet size beyond the limit. This is necessary to support a larger match context for a group of users (see https://github.com/KorAP/Kustvakt/issues/745).
The parameter should be exclusive for Kustvakt and not adjustable by users.