Open margaretha opened 1 year ago
It's interesting that this shows up in the KQ-Viewer. The type:text is an index type and was introduced to help the VC Builder show the allowed operators. With this issue: do you mean this shouldn't show up in the serialization, or is there a bigger issue?
Yes, it shouldn't show up in the serialization and it shouldn't be used in general. There should be no problem with that in the backend since Kalamar only sends the corpus query, not KoralQuery.
Could you please check what request Kalamar actually sends to Kustvakt? I don't get any results when sending the example as a direct API request with an OAuth2 token and VPN, while Kalamar shows some results as reported in https://github.com/KorAP/Krill/issues/86.
Well - it is used by the corpus builder and it is used for indexing - so what do you mean by "it shouldn't be used in general"? Yes it is not helpful in a corpus request, but that is not happening.
I am not sure which query you are referring to.
> Well - it is used by the corpus builder and it is used for indexing - so what do you mean by "it shouldn't be used in general"? Yes it is not helpful in a corpus request, but that is not happening.
I suppose it shouldn't be used since it is not part of the KoralQuery doc and not supported in the backend. Why is it used by the corpus builder and for indexing?
> I am not sure which query you are referring to.
Sorry for not being clear. I mean the query in https://github.com/KorAP/Krill/issues/86 or the one I wrote above: https://korap.ids-mannheim.de/instance/test/api/v1.0/search?q=ich&cq=availability+%3D+%2FCC-BY.*%2F+%26+docTitle+%3D+%22gingko%22&ql=poliqarp&cutoff=1&state=&pipe= but sent through Kalamar instead of as a direct API request.
The KoralQuery doc currently only covers the request and error reporting, neither the indexing nor the response data format. Krill supports it for indexing (see index/FieldDocument) and for responses (see response/MetaFieldsObj). type:text means the field is indexed tokenized, so single words can be searched for (as in a title) and a whole-string match works as well. This obviously means that the operators offered in the visual corpus builder should differ.
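To illustrate why the operators would differ, here is a minimal sketch of the distinction between a field matched only as a whole string and a tokenized field. It is not Krill's implementation: the function names, the whitespace tokenization, and the example title are assumptions made for illustration only.

```python
# Hypothetical illustration only; Krill's actual analysis and matching are
# not shown in this thread and will differ in detail.

def matches_string(field_value: str, query: str) -> bool:
    """Untokenized field: only a whole-string match succeeds."""
    return field_value == query


def matches_text(field_value: str, query: str) -> bool:
    """Tokenized field: a single word matches, and so does the whole string."""
    tokens = field_value.lower().split()  # naive whitespace tokenization (assumption)
    return query.lower() in tokens or field_value == query


title = "Der Name der Rose"  # made-up example value

print(matches_string(title, "Rose"))  # whole-string comparison fails
print(matches_text(title, "Rose"))    # word-level comparison succeeds
print(matches_text(title, title))     # whole-string match still works
```

Under these assumptions, a "contains word" operator only makes sense for the tokenized field, which is the kind of difference the corpus builder would need to know about.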
That query doesn't show results to me. The request is:
https://korap.ids-mannheim.de/instance/test/api/v1.0/search?context=40-t%2C40-t&count=25&cq=availability+%3D+%2FCC-BY.*%2F+%26+docTitle+%3D+%22gingko%22&cutoff=true&offset=0&q=ich&ql=poliqarp
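One way to check whether the two requests really differ is to decode both query strings and compare them parameter by parameter. The sketch below uses only Python's standard library and the two URLs quoted in this thread; the helper name `query_params` is made up for illustration.

```python
from urllib.parse import urlsplit, parse_qs

# The direct API request and the request Kalamar sends, as quoted in this thread.
DIRECT_URL = (
    "https://korap.ids-mannheim.de/instance/test/api/v1.0/search"
    "?q=ich&cq=availability+%3D+%2FCC-BY.*%2F+%26+docTitle+%3D+%22gingko%22"
    "&ql=poliqarp&cutoff=1&state=&pipe="
)
KALAMAR_URL = (
    "https://korap.ids-mannheim.de/instance/test/api/v1.0/search"
    "?context=40-t%2C40-t&count=25"
    "&cq=availability+%3D+%2FCC-BY.*%2F+%26+docTitle+%3D+%22gingko%22"
    "&cutoff=true&offset=0&q=ich&ql=poliqarp"
)


def query_params(url: str) -> dict:
    """Decode a URL's query string; blank parameters are dropped by parse_qs."""
    return {key: values[0] for key, values in parse_qs(urlsplit(url).query).items()}


direct = query_params(DIRECT_URL)
kalamar = query_params(KALAMAR_URL)

# Parameters that are missing on one side or carry different values.
differing = sorted(
    key for key in direct.keys() | kalamar.keys()
    if direct.get(key) != kalamar.get(key)
)

print(direct["cq"])
print(kalamar["cq"])
print(differing)
```

Both requests decode to the same `cq` and `q`; they differ only in paging and context parameters and in the spelling of `cutoff`.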
Thanks for your explanation.
The query should show results with OAuth2 token and VPN since the Gingko corpus is restricted.
But the VC is limited to CC-BY.*
Sorry, you are right. The request shouldn't be restricted to CC-BY.*. Besides, I made a mistake with the URL encoding of diacritics etc.
For the following query
https://korap.ids-mannheim.de/instance/test?q=Z%C3%BCndkerze&cq=corpusTitle+%3D+%22gingko%22&ql=poliqarp&cutoff=1&state=&pipe=
Kalamar would send the query below to Kustvakt, right?
curl -v -H "Authorization: Bearer token" 'https://korap.ids-mannheim.de/instance/test/api/v1.0/search?q=Z%C3%BCndkerze&cq=corpusTitle+%3D+%22gingko%22&ql=poliqarp&cutoff=1&state=&pipe='
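To avoid hand-encoding mistakes with diacritics, the query string can be built programmatically. This is a minimal sketch using Python's standard `urllib.parse.urlencode`; the helper name `build_search_url` and its defaults are assumptions chosen to reproduce the URL in the curl command above, not part of any KorAP client.

```python
from urllib.parse import urlencode

# Endpoint taken from the curl command above.
BASE_URL = "https://korap.ids-mannheim.de/instance/test/api/v1.0/search"


def build_search_url(q: str, cq: str, ql: str = "poliqarp", cutoff: str = "1") -> str:
    """Build the search URL; urlencode percent-encodes non-ASCII text as UTF-8."""
    params = {
        "q": q,
        "cq": cq,
        "ql": ql,
        "cutoff": cutoff,
        "state": "",  # empty parameters kept to mirror the URL in the thread
        "pipe": "",
    }
    return BASE_URL + "?" + urlencode(params)


url = build_search_url("Zündkerze", 'corpusTitle = "gingko"')
print(url)
```

The resulting URL can be passed to curl together with the `Authorization: Bearer` header, since the Gingko corpus is restricted.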
This doesn't seem to be a problem with Kalamar and isn't related to type:text, so I suppose we should discuss it in https://github.com/KorAP/Krill/issues/86 instead.
Yes, this is unrelated. Regarding this topic: I think the corpus assistant shouldn't alter the query serialized by the KoralQuery helper - but I think that's the only problem there is and it's a minor one, not affecting any functionality of the platform.
While investigating https://github.com/KorAP/Krill/issues/86, I found that corpusTitle eq gingko is serialized with type:text, whilst type:text is not a type supported according to the KoralQuery doc, and in practice it is also not supported in Krill. The type is not added by any query rewrite, as it is not added when sending a direct API request:
https://korap.ids-mannheim.de/instance/test/api/v1.0/search?q=ich&cq=availability+%3D+%2FCC-BY.*%2F+%26+docTitle+%3D+%22gingko%22&ql=poliqarp&cutoff=1&state=&pipe=
Could it be that Kalamar adds the type?