KorAP / Krill

A Corpus Data Retrieval Index using Lucene for Look-Ups
BSD 2-Clause "Simplified" License

getMatchInfo may return empty contexts #64

Status: Closed (margaretha closed this issue 4 years ago)

margaretha commented 4 years ago

getMatchInfo sometimes returns empty left and right contexts, although search results requested with the context base/s:s do include them.

Search: http://10.0.10.51:9000/api/v1.0/search?q=[tt%3Alemma%3D%22fein%22]&ql=fcsql&v=2.0&context=sentence&count=25&offset=0

Match info with some contexts: http://10.0.10.51:9000/api/v1.0/corpus/WUD17/E97/71900/p1590-1591/matchInfo?foundry=*&spans=false

Match info with empty contexts: http://10.0.10.51:9000/api/v1.0/corpus/WUD17/C94/39360/p395-396/matchInfo?foundry=*&spans=false
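
For readers comparing the two match info requests, the following is a minimal sketch of how such a URL could be taken apart to recover the text identifier and the match range. The function name `parse_match_info_url` and the dictionary keys are hypothetical, and reading `p395-396` as a start and end position is an assumption based on the URLs in this thread, not a description of the actual API contract.

```python
import re
from urllib.parse import urlsplit, parse_qs

# Assumed path layout, inferred only from the URLs quoted in this thread:
#   .../corpus/<corpus>/<doc>/<text>/p<start>-<end>/matchInfo
_MATCH_PATH = re.compile(
    r"/corpus/([^/]+)/([^/]+)/([^/]+)/p(\d+)-(\d+)/matchInfo$"
)


def parse_match_info_url(url: str) -> dict:
    """Split a match info URL into text identifier, positions, and query."""
    parts = urlsplit(url)
    found = _MATCH_PATH.search(parts.path)
    if found is None:
        raise ValueError(f"not a match info URL: {url}")
    corpus, doc, text, start, end = found.groups()
    return {
        "text_id": f"{corpus}/{doc}/{text}",  # hypothetical key name
        "start_pos": int(start),
        "end_pos": int(end),
        "query": parse_qs(parts.query),
    }


info = parse_match_info_url(
    "http://10.0.10.51:9000/api/v1.0/corpus/WUD17/C94/39360/"
    "p395-396/matchInfo?foundry=*&spans=false"
)
print(info["text_id"], info["start_pos"], info["end_pos"])
```

The extracted range (395 to 396 for the second request) is the one that has to be compared against the sentence spans stored in the index.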

Akron commented 4 years ago

At least in the JSON file, WUD17/C94/39360 actually has no sentence annotation between 362 and 411. So from Krill's perspective, this seems to be acceptable behavior.

margaretha commented 4 years ago

But there is a sentence annotation from 219 to 407:

<>:base/s:s$<b>64<i>219<i>407<i>53<b>2

Akron commented 4 years ago

Those values (219 and 407) are character offsets, not token positions. In token positions, the span starts at 32 and ends at 53, so it does not cover the match at p395-396.
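
To make the difference between character offsets and token positions concrete, here is a minimal sketch that reads the term quoted above and checks it against the match range. It assumes, based only on this thread, that the first two `<i>` values are the character offsets and the third is the end token position; the two `<b>` values are kept uninterpreted. The start position 32 is not part of the term string and is taken from the comment above. `SpanTerm`, `parse_span_term`, and `covers` are hypothetical names, not part of Krill.

```python
import re
from typing import NamedTuple, Tuple


class SpanTerm(NamedTuple):
    key: str
    start_char: int          # character offset (assumed: first <i> value)
    end_char: int            # character offset (assumed: second <i> value)
    end_pos: int             # end token position (assumed: third <i> value)
    raw_bytes: Tuple[int, ...]  # <b> values, left uninterpreted here


_FIELD = re.compile(r"<([bi])>(\d+)")


def parse_span_term(term: str) -> SpanTerm:
    """Split a span term into its key and typed payload values."""
    key, _, payload = term.partition("$")
    fields = [(kind, int(value)) for kind, value in _FIELD.findall(payload)]
    ints = [value for kind, value in fields if kind == "i"]
    raw = tuple(value for kind, value in fields if kind == "b")
    if len(ints) != 3:
        raise ValueError(f"expected three <i> values, got {len(ints)}")
    return SpanTerm(key, ints[0], ints[1], ints[2], raw)


def covers(span_start: int, span_end: int, match_start: int, match_end: int) -> bool:
    """True if the match range lies inside the span (token positions)."""
    return span_start <= match_start and match_end <= span_end


term = parse_span_term("<>:base/s:s$<b>64<i>219<i>407<i>53<b>2")

# Wrong reading: treating the character offsets 219..407 as token positions.
print(covers(term.start_char, term.end_char, 395, 396))  # True, but misleading

# Reading from the thread: token positions 32..53 (32 comes from the comment).
print(covers(32, term.end_pos, 395, 396))  # False
```

Whether the end position is inclusive or exclusive is not stated in the thread; for a match at 395 to 396 the result is the same either way.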

Akron commented 4 years ago

I'm closing this, as it is not a bug in Krill. A test was added in https://github.com/KorAP/Krill/commit/64079c9a709ae2623ecdae11ce74355f7cb7fe04 .