karpetrosyan / hishel

An elegant HTTP Cache implementation for HTTPX and HTTP Core.
https://hishel.com
BSD 3-Clause "New" or "Revised" License
158 stars 19 forks source link

Request extensions are not passed through cache revalidation #247

Closed hartym closed 2 months ago

hartym commented 2 months ago

I notice that cached request extensions are not passed through to the revalidation request.

For example, my original request contained {'timeout': {'connect': 5.0, 'read': 5.0, 'write': 5.0, 'pool': 5.0}} (which are, I believe, set by httpx according to arguments I did set), and also a few custom extensions that an intermediate layer will use for processing (but this is a detail specific to my use, let's ignore that point for now). The request triggering revalidation contains the same extensions.

I wonder if not passing through original request's extensions or the new-request-that-triggers-the-revalidation's extensions to the revalidation request is a conscious choice for a good reason or just an omission.

One of my elements of thinking is that if for whatever reason I set the timeout to 5 secs for the request, there is no good reason (I believe) that the revalidation request gets whatever default timeout it finds.

In my case (specific), I also pass some metadata through extensions that I expect to use in an intermediate transport layer that I added inbetween hishel's AsyncCacheTransport and httpx' AsyncTransport, to be able to manipulate either the request or the response when it goes through cache. Getting the metadata from extensions is kind of important, otherwise I miss a lot of identification / tracing informations there.

Thanks for your thoughts on this (of course, as before, if this is something that you think we do want in hishel, I can work on implementation and tests).

karpetrosyan commented 2 months ago

Thanks for raising these questions. It is something I thought needed some tidying up. I think passing the extensions to the revalidation request would be a user experience improvement, though we may encounter some challenges here, for example:

Though I fully support using the original request extensions for the revalidation request, that makes sense.

hartym commented 2 months ago

There are a few different things here, I'll write a few thoughts :

Which extensions should be stored in the cache ? It looks like as of today, extensions stored in the cache based on allowance lists (KNOWN_REQUEST_EXTENSIONS and KNOWN_RESPONSE_EXTENSIONS) used by the serializers. It would be quite easy to allow serializers to override this on a per-instance basis, keeping the current default as a reasonable one.

Using the original request extensions for the revalidation request : do we agree that the extensions we should use for the revalidation request are the one from the request that triggered the revalidation, not the one from cache ? Let's take an example :

GET /foo (timeout 5sec) -> miss -> HTTP 200 -> store
...
GET /foo (timeout 10sec) -> stale, revalidate -> HTTP 200 -> store

Do we agree that the revalidation request should use the 10seconds timeout ?

I can work on implementing both points, I'll work on them separately though as they are unrelated. Let me know if you're ok.

karpetrosyan commented 2 months ago

Do we agree that the revalidation request should use the 10seconds timeout ?

I think yes, it's more predictable.

hartym commented 2 months ago

PR attached (see comment within).