caddyserver / cache-handler

Distributed HTTP caching module for Caddy
Apache License 2.0

Don't Cache Logged In Users #34

Closed PythonLinks closed 1 year ago

PythonLinks commented 2 years ago

I am building a mostly read-only website: a few people edit content and many read it (well, hopefully someday many will). I would like anonymous users to see the cached version, while logged-in users should see the most recent version.

In Caddy 1, my application server marked images and anonymous-user pages as public, and I set the cache key to include the JWT token. That way, a logged-in user had that token and no one else would see their cached version of the page. It was simple and worked great.

How do I achieve the same effect with cache-handler? Reading through the options, it does not seem possible.

darkweak commented 2 years ago

Hello @PythonLinks, according to the RFC, responses to authenticated requests shouldn't be cached by the proxy. That's a bug in the library; I'll open a PR on the Souin repository and write the fix ASAP.

Thank you for this report! 👍

PythonLinks commented 2 years ago

Great. That would make my life so much easier. I am glad I had some positive impact.

So there are a few obvious questions. How does it determine who is authenticated? In my case I used a JWT cookie: if it existed, the person was authenticated and the response should not be cached.

Of course it is a bit more complicated than that. There are really two different concepts here: when a page is cached, and when it is served. In Caddy 1 all pages got cached, and I used the JWT cookie as the cache key so no one else would see those pages.

If the RFC specifies what gets cached and what does not, that is fantastic. Best would be an option to cache a given page or not; then the application server could issue the correct command.

In Caddy 2 Cache Manager you have this great concept of conditional caches, meaning they are only invoked if the matcher matches. What would be brilliant would be a conditional saver, meaning responses only get saved to the cache if they match.

In particular, I do want to cache the images seen by logged-in users, but not their content. As for the JavaScript of logged-in users, which contains their private data, sometimes I want to cache it and sometimes I do not.

darkweak commented 2 years ago

Resolved by https://github.com/darkweak/souin/pull/245/commits/8352f4925432abb2a524c3b62713d680897d01c4 (not available in the cache-handler repository yet)

darkweak commented 2 years ago

What would be brilliant would be a conditional saver. Meaning they only get saved to the cache if they match.

I thought about rewriting the cache system configuration in relation to this issue, https://github.com/darkweak/souin/issues/66, to have a conditional cache depending on a header or cookie. It could also be good to add a list of headers to the cache_key tweaking configuration (https://github.com/darkweak/souin/blob/b6066c9d1f4150b4cd82fc237979a21b0f6ab92a/configurationtypes/types.go#L113) so that they can be part of the key name.

e.g.

cache_keys:
  '.+\.css$':
    headers:
      - Authorization
      - Accept

This would generate the key GET-domain.com-/{-HEADERS-}Authorization:Bearer%20ey;Accept:*/*

francislavoie commented 2 years ago

If it's just a question of "does the cache handler run or not", then Caddy's built-in request matchers make way more sense to use here than trying to build a matching solution inside of the cache module itself. No need to re-invent the wheel.

If you move some of this logic to the Caddy plugin itself (instead of in souin) then you could embed a MatcherSet into the handler itself to make config decisions. If you look at the code for my PR https://github.com/caddyserver/caddy/pull/4691 you can get an idea of how it looks to embed a matcher inside of another handler. Here the skip_log handler takes a matcher as config and provisions it, then when it runs it calls the matcher to get a boolean result and uses it to decide whether to skip logging that request or not.

Essentially what I'm trying to get at is since this is a plugin under the caddyserver org, it should definitely feel natural to use alongside the other built-in Caddy modules.

PythonLinks commented 2 years ago

I tried again and gave up. Caddy 1 and certbot on Linux work great. I cannot get either Caddy 2 or certbot to work on FreeBSD. After a solid week or two or more of trying, and delaying for most of a year, I think I am giving up.

There is too much complexity. There is pkg install and /usr/ports/security. There is py-certbot and py-certbot-nginx.

And then there are the py38 and py39 versions of both of those.

I tried doing it all using pkg install, and I tried using ports, and could get nothing to work. With pkg install I could not find the executables anywhere. I am trying:

$> find / -name py39-certbot &
$> find / -name py-certbot &

And then using ports, I get errors like missing py-openssl and missing py-cryptography:

===> py38-acme-1.21.0,1 depends on package: py38-cryptography>=1.2.3 - not found

and even when I switch to

/usr/ports/security/py-cryptography

and try to make them, still no luck.

In contrast, Debian was a reasonable experience.

darkweak commented 2 years ago

Resolved in Souin v1.6.19.