KorAP / Krill

A Corpus Data Retrieval Index using Lucene for Look-Ups
BSD 2-Clause "Simplified" License

Provide an API to expose index version #62

Status: Closed (closed by Akron 4 years ago)

Akron commented 4 years ago

Currently, there is no way to tell the state of an index. To let clients see whether an index has changed since their last visit (documents added, deleted or modified), the index could respond with a version code. This would help with caching in clients such as the RKorAPClient. To keep this feature stable in a distributed environment, it may be enough to return a hash value of arbitrary length instead of an incremental version number.
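To illustrate why a hash can work where an incremental counter cannot, here is a minimal sketch of an order-independent fingerprint. It is not Krill's implementation: the class name `IndexFingerprintSketch`, the method `fingerprint` and the input (a map from document ID to a per-document revision marker) are assumptions made for this example. Two nodes that hold the same documents get the same value, regardless of the order in which they indexed them.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not part of Krill.
final class IndexFingerprintSketch {

    // docRevisions: document ID -> revision marker of that document (assumed input).
    static String fingerprint(Map<String, String> docRevisions) {
        try {
            MessageDigest md = MessageDigest.getInstance("SHA-256");
            // Sorting by document ID makes the result independent of indexing order.
            for (Map.Entry<String, String> e : new TreeMap<>(docRevisions).entrySet()) {
                update(md, e.getKey());
                update(md, e.getValue());
            }
            StringBuilder hex = new StringBuilder();
            for (byte b : md.digest()) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException ex) {
            // SHA-256 is required to be available on every Java platform.
            throw new IllegalStateException(ex);
        }
    }

    // Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
    private static void update(MessageDigest md, String value) {
        byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
        md.update(ByteBuffer.allocate(4).putInt(bytes.length).array());
        md.update(bytes);
    }
}
```

Adding, deleting or modifying a document changes the input map and therefore the fingerprint, which is all a client needs in order to detect that its cache is stale.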

Once this is implemented, it needs to be served by Kustvakt.

This feature was suggested by @kupietz.

kupietz commented 4 years ago

From the API point of view, statistics could return an informal, human-readable revision string, leaving it open whether it contains a corpus revision name, a date, a PID, a description, or all of these.

Clients would have to make their cache hash calculations depend on it, or invalidate existing caches once this string changes.
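The second option (invalidate everything once the string changes) could look roughly like the sketch below. The class `RevisionAwareCache` and its methods are hypothetical names for this example. The revision is treated as an opaque string and is only compared for equality, so it does not matter whether it holds a name, a date or a hash.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.function.Function;

// Hypothetical client-side cache; not taken from RKorAPClient.
final class RevisionAwareCache<V> {
    private final Map<String, V> entries = new HashMap<>();
    private String knownRevision; // null until the first revision has been seen

    // Returns the cached result for a query, loading it if necessary.
    // All entries are dropped when the reported revision differs from the last one seen.
    synchronized V get(String revision, String query, Function<String, V> loader) {
        if (!Objects.equals(knownRevision, revision)) {
            entries.clear();
            knownRevision = revision;
        }
        return entries.computeIfAbsent(query, loader);
    }

    synchronized int size() {
        return entries.size();
    }
}
```

The alternative mentioned above, making the cache key depend on the revision, would keep results for older revisions around instead of discarding them.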

Akron commented 4 years ago

Realising this with either an ETag or a Last-Modified header and, in the long run, respecting If-None-Match and If-Modified-Since seems to be a reasonable approach.
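As a rough sketch of what respecting If-None-Match would involve on the server side, the decision between 200 and 304 can be reduced to one comparison. The class and method names are hypothetical, and the parsing is simplified: it splits the header on commas, which would break for entity tags that contain a comma.

```java
// Hypothetical helper illustrating the If-None-Match check; not Kustvakt code.
final class ConditionalGetSketch {

    // ifNoneMatch: raw request header value, or null if the header is absent.
    // currentEtag: the current strong entity tag including quotes, e.g. "\"abc\"".
    // Returns 304 if the client's copy is still current, otherwise 200.
    static int statusFor(String ifNoneMatch, String currentEtag) {
        if (ifNoneMatch == null) {
            return 200;
        }
        for (String candidate : ifNoneMatch.split(",")) {
            String tag = candidate.trim();
            if (tag.equals("*")) {
                return 304; // "*" matches any existing representation
            }
            if (tag.startsWith("W/")) {
                tag = tag.substring(2); // weak comparison ignores the W/ prefix
            }
            if (tag.equals(currentEtag)) {
                return 304;
            }
        }
        return 200;
    }
}
```

With an index fingerprint as the entity tag, a client that sends the last value it saw would get 304 Not Modified until the index changes.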

Akron commented 4 years ago

https://korap.ids-mannheim.de/gerrit/c/KorAP/Krill/+/2746 addresses this issue and introduces a fingerprint method to KrillIndex. I am open to suggestions about how to make this public. On second thought, ETag is probably not a good idea.

kupietz commented 4 years ago

Maybe we can start by simply exporting the API under an appropriate name. We could then already use it with RKorAPClient and have time to think about an HTTP header integration.

Akron commented 4 years ago

https://korap.ids-mannheim.de/gerrit/c/KorAP/Kustvakt/+/2752 adds an experimental X-Index-Revision header to statistics.
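On the client side, picking up this header with the JDK's `java.net.http` API could look like the sketch below. Only the header name comes from this thread. The class `IndexRevisionHeader` is a hypothetical name, and since the header is experimental, the value is treated as an opaque string that may be missing.

```java
import java.net.http.HttpHeaders;
import java.util.Optional;

// Hypothetical client helper for the experimental header.
final class IndexRevisionHeader {
    static final String NAME = "X-Index-Revision";

    // Returns the revision reported by the server, or empty if the header
    // is absent or blank. Header name lookup is case-insensitive.
    static Optional<String> read(HttpHeaders headers) {
        return headers.firstValue(NAME)
                .map(String::trim)
                .filter(value -> !value.isEmpty());
    }
}
```

Given an `HttpResponse<?> response` from a statistics request, `IndexRevisionHeader.read(response.headers())` yields the value to compare against the one seen previously.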