margaretha closed this issue 5 years ago
I guess this requires a change to Krill as well as a change to Kustvakt, right?
Yes, I can adapt the code in Kustvakt when the function in Krill is ready. It can be a separate API in addition to the normal search.
In Krawfish, a snippet is a separate function that can "enrich" a match in the same sense as, e.g., fields. I think that's preferable. In the long run, we may want to follow that approach, because it would allow enrichment-specific parameters (like context for snippets). I haven't thought of a REST API to implement this yet, but it may be beneficial. Because the response would be similar, I wouldn't introduce a new API. Maybe for the moment a "no-snippet" parameter would suffice.
Actually, I meant a new API in Kustvakt because it doesn't involve user authentication, but I think it can be handled in the existing API too. I'll look into it when the function in Krill is ready.
I thought the rewrite detector may just not take effect if no snippet was requested.
After a brief discussion with @kupietz regarding the required user authentication, we may need to introduce a query parameter like rewrite=false that will not rewrite the query (or the VC, to be more specific, so we may want to have two parameters: corpus-rewrite and query-rewrite) but fail in case the user requests information only an authorized user can retrieve. If, for example, a user requests snippets or any other "protected" field, the query will fail. If the user requests open metadata, the query will run and return all metadata, ignoring that the user may not have access to the specific corpora.
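To make the proposed behaviour concrete, here is a minimal sketch of the decision logic, not the actual Kustvakt code. The names `SearchRequest`, `decide`, `PROTECTED_FIELDS`, and the returned action strings are all hypothetical, and a single `rewrite` flag stands in for the possible corpus-rewrite / query-rewrite pair.

```python
from dataclasses import dataclass, field

# Hypothetical classification: which response fields need authorization.
PROTECTED_FIELDS = {"snippet"}


class AccessDenied(Exception):
    """Raised when protected information is requested without rewriting."""


@dataclass
class SearchRequest:
    fields: set = field(default_factory=set)  # requested response fields
    rewrite: bool = True                      # assumed default: rewriting is on


def decide(request: SearchRequest) -> str:
    """Return the action to take for a request from an unauthorized user."""
    if request.rewrite:
        # Normal path: the query/VC is rewritten to what the user may access.
        return "rewrite"
    protected = request.fields & PROTECTED_FIELDS
    if protected:
        # No rewriting allowed, so the request has to fail instead.
        raise AccessDenied(f"protected fields requested: {sorted(protected)}")
    # Only open metadata requested: run on all corpora, unrewritten.
    return "run_unrewritten"
```

With `rewrite=False`, a request for open metadata only would run on all data, while a request that includes snippets would raise `AccessDenied`.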
We should also handle VC references. I would suggest that references to system VCs should be allowed, but references to VCs owned by users should not. With corpus-rewrite=false, VC reference rewrite would however be disabled (https://github.com/KorAP/Kustvakt/issues/11).
That's a very good point. Otherwise information about the content of a private VC could be leaked. Maybe the name access-rewrite would be better and would refer to corpus and query? Regarding private vs. public VCs we could handle it as with public vs. private metadata or snippets: When something private is requested, the query fails.
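The VC rule suggested here could look roughly like the following sketch. `VirtualCorpus`, the `"system"` owner value, and `check_vc_reference` are assumptions made for illustration; they do not mirror Kustvakt's data model.

```python
from dataclasses import dataclass

SYSTEM_OWNER = "system"  # assumed marker for system-defined VCs


@dataclass(frozen=True)
class VirtualCorpus:
    name: str
    owner: str


def check_vc_reference(vc: VirtualCorpus) -> str:
    """Validate a VC reference when access rewriting is switched off.

    System VCs may be referenced; a user-owned VC makes the query fail,
    so nothing about the content of a private VC can leak.
    """
    if vc.owner != SYSTEM_OWNER:
        raise PermissionError(f"reference to private VC '{vc.name}' is not allowed")
    return vc.name
```

This treats a private VC the same way as a protected field: requesting it without authorization fails rather than being silently rewritten.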
This issue should partially be moved to Kustvakt.
Krill should support search queries returning only all metadata without match snippets, thus allowing search on all data without license restrictions.
Metadata should be returned for every match, regardless of redundancy.
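As a rough illustration of the requested result shape (not Krill's actual response format), the sketch below keeps only open metadata per match and does not deduplicate. The field names `docId`, `title`, and `snippet` and the function `metadata_only` are hypothetical.

```python
def metadata_only(matches, open_fields):
    """Return one metadata record per match, dropping all other fields.

    Records are intentionally not deduplicated: two matches in the same
    document yield two identical metadata entries.
    """
    return [
        {key: value for key, value in match.items() if key in open_fields}
        for match in matches
    ]


# Two matches in the same document, each carrying a (protected) snippet.
example_matches = [
    {"docId": "d1", "title": "A", "snippet": "... first hit ..."},
    {"docId": "d1", "title": "A", "snippet": "... second hit ..."},
]
example_result = metadata_only(example_matches, {"docId", "title"})
```

Here `example_result` contains two entries, one per match, and neither contains a snippet.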