ActiveTriples / linked-data-fragments

Basic linked data fragments endpoint.
Creative Commons Zero v1.0 Universal

Configure Cache Control #44

Open · no-reply opened this issue 7 years ago

no-reply commented 7 years ago

From https://github.com/curationexperts/chf-sufia/issues/27#issuecomment-277555860:

Approaches for cache control:

In any of these three cases, questions that need to be answered include:

  • What are the pre-warming needs?
    • Do search needs for an authority require a full pre-load of that authority?
    • Or can search be dependent on an external service?
  • What happens when the cache "invalidates"?
    • do we clear data from the backend,
    • or simply require re-fetch on the next retrieval if the remote is available?
  • What is an acceptable TTL?
  • Does the client ever need to manually invalidate the cache?

I think that's the general shape of the problem.
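To make the knobs concrete, here is a rough sketch of how these questions could surface as settings. None of these keys exist in the gem today; the names and values are purely illustrative.

```ruby
# Hypothetical settings sketch -- these keys are placeholders, not the gem's API.
CACHE_CONTROL_DEFAULTS = {
  cache_ttl: 7 * 24 * 60 * 60,         # acceptable TTL: one week, in seconds
  invalidation_strategy: :refetch,     # keep stale data, re-fetch lazily (vs. :purge to clear the backend)
  prewarm_authorities: [:local_names], # small vocabularies to load in full up front
  allow_manual_invalidation: true      # let clients force a re-fetch explicitly
}.freeze
```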

no-reply commented 7 years ago

A first hack at answering some of these:

I don't have any answers about pre-warming. cc: @hackmastera.

hackartisan commented 7 years ago

@no-reply your answers sound good to me.

no-reply commented 7 years ago

@hackmastera:

At CHF we have agreed we will rely on the external service for search; i.e., availability and retrieval time are less of a concern during cataloging than at display time. I am a little nervous about the possibility of multi-day downtime, though, especially w/r/t LC.

:+1: The way I'm thinking of this is that the qa-ldf bridge will support a default search drawing only from items already in the cache. For smaller vocabularies, we would have the option to handle search entirely internally via pre-warming. For larger datasets we would lean on the external service, but have the option to provide search over the cached items during downtime.

I think for most users with a mature repository, having the capacity to search the cache would be enough to keep cataloging moving. Other workarounds could be discussed, but I think they are beyond project scope.
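As a sketch of what I mean by falling back to the cache (the class and method names here are assumptions about how the wiring could look, not the actual qa-ldf interface):

```ruby
require "timeout"

# Illustrative only: a search that prefers the external authority service and
# falls back to already-cached terms when the remote is unreachable.
class CachedAuthoritySearch
  def initialize(remote:, cache:)
    @remote = remote # external authority search service
    @cache  = cache  # local store of terms already fetched into the cache
  end

  # Try the external service first; if it is down, search the cached terms so
  # cataloging can keep moving during an upstream outage.
  def search(query)
    @remote.search(query)
  rescue Timeout::Error, SocketError, Errno::ECONNREFUSED => e
    warn "remote authority unavailable (#{e.class}); searching cache only"
    @cache.search(query)
  end
end
```

For smaller vocabularies the cache side could be fully pre-warmed, in which case the fallback search is effectively equivalent to a normal search.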

hackartisan commented 7 years ago

I think for most users with a mature repository, having the capacity to search the cache would be enough to keep cataloging moving. Other workarounds could be discussed, but I think they are beyond project scope.

I'm not sure about this assumption. For example, an archive moves from collection to collection over time, and each collection will require a new set of vocabulary terms, even if they fall within broadly related areas. That is especially true for personal names, but for subjects as well.

But I guess if you assume that cataloging an entirely new collection is relatively uncommon, perhaps it holds up. Maybe @catlu could weigh in.

no-reply commented 7 years ago

But I guess if you assume that cataloging an entirely new collection is relatively uncommon, perhaps it holds up.

Yeah, this is basically my assumption, or at least I think it holds true often enough to be of value. Upstream outages may be frustrating and prevent folks from working on their highest priorities, but at least they wouldn't necessarily halt work altogether.