Open jan-molak opened 2 weeks ago
Hey @jan-molak
Agreed, that is an interesting idea. So, you mean the following scenario:
For example, I enable caching for GET /api/cats and set TTL = 1 hour.
On the first request, the ETag is saved to headers.json.
Then, all subsequent requests during that hour are served from the local cache (as they already are today).
After 1 hour, each request performs a preliminary HEAD request and, if the received ETag differs from the saved one, performs a real GET request and updates the cache.
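The scenario above could be sketched as two small decisions. This is only an illustration under assumptions, not the library's implementation: `CachedEntry`, `planRequest`, and `afterRevalidation` are hypothetical names, and the HEAD request itself is left to the caller.

```typescript
// Hypothetical shape of what would be stored next to the cached body.
interface CachedEntry {
  etag: string | null; // ETag saved from the first response, if any
  savedAt: number;     // epoch milliseconds when the cache was written
}

type Plan = 'serve-cache' | 'revalidate' | 'fetch';

// Step 1: within the TTL, serve from cache without touching the network.
// After the TTL, ask for a HEAD revalidation (or a full GET if no ETag was saved).
function planRequest(entry: CachedEntry | null, ttlMinutes: number, now: number): Plan {
  if (!entry) return 'fetch';
  const ageMs = now - entry.savedAt;
  if (ageMs < ttlMinutes * 60 * 1000) return 'serve-cache';
  return entry.etag ? 'revalidate' : 'fetch';
}

// Step 2: once the HEAD response has arrived, compare its ETag with the saved one.
function afterRevalidation(entry: CachedEntry, headEtag: string | null): 'serve-cache' | 'fetch' {
  return headEtag !== null && headEtag === entry.etag ? 'serve-cache' : 'fetch';
}
```

With `ttlMinutes: 60`, a request 30 minutes after the cache was written yields `'serve-cache'`, and one at 61 minutes yields `'revalidate'`.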
I also thought about the cache-control header; it can be utilized as well to prolong cache time.
Actually, it is the re-implementation of caching in browser, but why not =)
I'd suggest letting the user explicitly enable that behavior (to keep things straightforward by default):
await cacheRoute.GET('/api/cats', {
ttlMinutes: 60,
respectETag: true, // <- keeps cache until ETag changes
});
> Then, all subsequent requests during 1 hour are served from local cache (as it already does currently). After 1 hour, each request performs preliminary HEAD request and if received ETag differs from the saved one, performs a real GET request and updates cache.
I was thinking that maybe we could use ETags to expire the cache sooner than the TTL would require. So for example, you set the TTL to a "long time" such as a day or a week. If the option to respect ETags is enabled, then every request performs a preliminary HEAD call to see if the ETag has changed. If it has, a "real" request is made; if not, the response is retrieved from the cache.
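A rough sketch of this "ETag first" variant, where the TTL acts only as an upper bound. The names `etagsMatch` and `decideEtagFirst` are hypothetical, and `headEtag` is assumed to come from the preliminary HEAD call, which is not shown. The comparison ignores the `W/` weak-validator prefix, in the spirit of the weak comparison defined for HTTP entity tags.

```typescript
// Weak comparison: the W/ prefix is ignored, only the opaque tags are compared.
function etagsMatch(saved: string | null, current: string | null): boolean {
  if (!saved || !current) return false;
  const opaque = (tag: string): string => tag.trim().replace(/^W\//, '');
  return opaque(saved) === opaque(current);
}

// Hypothetical decision for the "ETag first" variant.
// `headEtag` is the ETag returned by the preliminary HEAD call (null if absent).
function decideEtagFirst(
  savedEtag: string | null,
  savedAt: number,      // epoch milliseconds when the cache was written
  ttlMinutes: number,
  now: number,
  headEtag: string | null,
): 'serve-cache' | 'fetch' {
  const ttlExpired = now - savedAt >= ttlMinutes * 60 * 1000;
  // The TTL stays the upper bound; a real implementation could skip the HEAD call here.
  if (ttlExpired) return 'fetch';
  return etagsMatch(savedEtag, headEtag) ? 'serve-cache' : 'fetch';
}
```

The trade-off is the one raised in this thread: every request pays for one HEAD round trip in exchange for never serving a body the server has already replaced.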
> I was thinking that maybe we could use ETags to expire the cache sooner than the TTL would require.
Hmm, that's the opposite one. But making these preliminary requests in every test: wouldn't it slow them down? Even though it's HEAD, it's a network roundtrip anyway, and I suppose many APIs will take significant time to respond to HEAD as well.
> wouldn't it slow them down?
I think it depends on the use case. In our case, we're building a simple Playwright Test and Serenity/JS-based website crawler. We are exploring using playwright-network-cache to cache API responses and static assets to avoid loading them unless changed. The APIs correctly handle ETags and HEAD requests, so while making a HEAD request incurs a network cost, it's still significantly faster than making a GET request since the response body can be large.
Of course, not all APIs correctly handle ETags and HEAD requests, so having an explicit setting such as respectETag, as you proposed, is a good approach.
> to cache API responses and static assets to avoid loading them unless changed
Maybe in that case TTL should not be set at all? You just use cached data until ETag changes?
I've compared it with HTTP caching in different situations. Your suggestion is more like cache-control: no-cache, which means data is cached but must be re-validated before each use. Having a TTL set is more like cache-control: max-age=3600: during that period the browser uses the cache without contacting the server.
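To make the comparison concrete, here is a minimal sketch that maps a Cache-Control header value onto the two behaviours discussed. `CachePolicy` and `parseCacheControl` are hypothetical names, only the `no-cache` and `max-age` directives are handled, and nothing here is part of the library.

```typescript
interface CachePolicy {
  revalidateEveryTime: boolean;   // cache-control: no-cache
  freshForSeconds: number | null; // cache-control: max-age=N
}

// Parses only the two directives discussed here; everything else is ignored.
function parseCacheControl(header: string | null): CachePolicy {
  const policy: CachePolicy = { revalidateEveryTime: false, freshForSeconds: null };
  if (!header) return policy;
  for (const part of header.split(',')) {
    const pieces = part.trim().split('=');
    const name = (pieces[0] ?? '').toLowerCase();
    const rawValue = pieces[1];
    if (name === 'no-cache') policy.revalidateEveryTime = true;
    if (name === 'max-age' && rawValue !== undefined) {
      const seconds = parseInt(rawValue.replace(/"/g, ''), 10);
      if (!isNaN(seconds) && seconds >= 0) policy.freshForSeconds = seconds;
    }
  }
  return policy;
}
```

Under this reading, `no-cache` corresponds to checking the ETag before every use, while `max-age=3600` corresponds to a 60-minute TTL during which no request is made.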
Hi @vitalets! Thanks for your work on playwright-network-cache; it looks very promising already!
Have you considered making the caching mechanism aware of the ETag header? This would allow consumers to specify a longer TTL that could be reduced should the target content change before the TTL has expired.
I think this could be accomplished by making the CacheRouteHandler check if the existing local cache has expired using the current algorithm and, if not, make a HEAD request to the original URL and check if the ETag header has changed since the last time the local cache was populated and the local headers.json file was created (that file should also contain the previous version of the ETag).
I think that conceptually it would be similar to the isUpdated check, but done before the "real" request is made.
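As an illustration of the "previous version of the ETag" part, the saved value could be read back roughly like this. The sketch assumes headers.json holds a flat JSON object of header names to string values, which is an assumption about the file format rather than something confirmed here; `readSavedEtag` is a hypothetical helper.

```typescript
// Assumed file content, e.g.: {"content-type": "application/json", "etag": "\"abc\""}
function readSavedEtag(headersJson: string): string | null {
  let parsed: unknown;
  try {
    parsed = JSON.parse(headersJson);
  } catch {
    return null; // unreadable file: treat as "no ETag saved"
  }
  if (typeof parsed !== 'object' || parsed === null) return null;
  const record = parsed as Record<string, unknown>;
  // Header names are case-insensitive, so look for "etag" in any casing.
  for (const name of Object.keys(record)) {
    const value = record[name];
    if (name.toLowerCase() === 'etag' && typeof value === 'string') return value;
  }
  return null;
}
```

The returned value would then be compared with the ETag from the HEAD response before deciding whether the "real" request is needed.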