Closed ddeboer closed 10 years ago
that could be a useful thing indeed.
if the security is using custom voters that are not based on the user but something else (e.g. the symfony cmf loading some content and then deciding individually if you have the right to see this), then i think its not possible. just saying - means we need to keep that optional and still allow the plain thing.
in terms of priorization, i would push that post 1.0 unless you have a concrete need for it.
Well, it wouldn't hurt to always return a user hash in the HEAD response, even if Varnish isn't configured to vary on it.
Authorization with custom voters wouldn't be supported, no. Is there a Symfony2 security event that fires after all voters have been processed that we can listen to? Though it would make the listener much less efficient, of course.
This feature can wait until after 1.0.
with ez unbundling their user hash tool (see #26) i think the best option would be to simply drop this thing here. it was a quick shot in the LiipCacheControlBundle but the solution by ez sounds a lot better.
we can throw the listener into a gist somewhere or tell people to copy it from the LiipCacheControlBundle if they need it and ez is not ready.
/cc @lolautruche
Well, I think it makes sense to combine insights with the ezPublish guys and and include the functionality in this bundle (if they agree). After all, we're offering a generic caching solution for Symfony2 users here, and properly caching authenticated content is an important issue.
I personally have no objection that we merge our solution in this bundle. However, it means license change since we're in GPL-2.0 and this bundle is MIT, so we need to evaluate the impact. At worst, we could propose a separate bundle that would be complementary to this one (or a dependency).
ping @andrerom @bdunogier
Maybe the separate,depending bundle would make it easier. As long as we move forward together, it shouldn't be a problem. But no strong preference here, to be honest.
how much code is it in the end? a listener for the curl requests from varnish, and another one to append information to the responses? or is it more involved? if its a small thing, it would make more sense to keep in here, but if its complex and big, its making things harder to understand. can you point us to the current ez code for this?
@dbu You can figure this out here : https://github.com/ezsystems/ezpublish-kernel/pull/368 (initial PR for the feature). Complementary PR: https://github.com/ezsystems/ezpublish-kernel/pull/720
Let’s remember this bundle is not only about invalidation, so user-specific cache fits fine.
I agree with @dbu that if the user hashing requires much code, it may be better to do it somewhere else. As stated above, it would be good to experiment a bit with a solution that doesn’t require the the Varnish cURL module. If we can leave that out, the solution becomes simpler.
I'm open to any suggestion :-) Le 25 févr. 2014 12:54, "David de Boer" notifications@github.com a écrit :
As stated abovehttps://github.com/ddeboer/FOSHttpCacheBundle/issues/30#issue-26320943, it would be good to experiment a bit with a solution that doesn't require the the Varnish cURL module.
Reply to this email directly or view it on GitHubhttps://github.com/ddeboer/FOSHttpCacheBundle/issues/30#issuecomment-36000195 .
I’ll come back to you about this. Need to do some experimenting with Varnish first. My hope is that plain restart
will be enough. ;)
thanks for the links. i think the listener part is not too complicated and might still fit (maybe use a response listener instead of setting the vary header in the controller...) but the whole role + identity stuff becomes quite involved. i think if we do any part in this bundle, it should be a more generic and simple thing that ez can then plug into with its user management. user identity goes deep into a project, its not something you happily change just for the sake of one component. (not saying ez is doing this wrong, just that we should have clear boundaries).
maybe we can even do something more generic than a user hash - it could depend on other factors than the user, in some system. like a custom vary header and a custom check to get the right variant hash for a request in varnish.
@dbu As for the user hash generation, the solution is generic. It's just that we have our own implementation relying on our roles. What can be possible is to provide a very generic UserIdentity, based on native named roles for instance. eZ Publish can then add its own rules.
i guess that is what i meant, we should only have a generic interface and leave the implementation to ez and other projects.
I’ll come back to you about this. Need to do some experimenting with Varnish first. My hope is that plain restart will be enough. ;)
All right, I've been experimenting with user hashing with plain Varnish restart
, so without requiring the user to install the Varnish cURL mod. It seems to work. As I’ve used the FOSHttpCache functionality for testing caching headers, you can find my work in a branch on that repo.
The user hash should be definable by the user. Perhaps from configuration (
include_role or something?) but preferably by throwing an event (build_user_hash or something) that the user can add a listener for. This bundle then has its own listener with some sane defaults.
Would that make sense? We could rename the event to generate_hash
to be more general. Or have a configuration option hash_generator_service
that the bundle user can override.
interesting. but the HEAD request will be needed for every request with cookie, right? could we also cache the hash based on the cookie by the backend? (vary on cookie, of course) a service we configure would be more in line with what we do elsewhere in this bundle. on the other hand, an event sounds interesting too, as several listeners could try to generate a hash then. what about combined hashes? that is, i have two things that differ? when would the service/event be triggered? on kernel response i guess?
We could rename the event to generate_hash to be more general. Or have a configuration option hash_generator_service that the bundle user can override.
We chose the hash generator approach, with a service tag, which seems more relevant than an event approach here. The hash generator is called by a request listener, on a very specific request (AUTHENTICATE
, with X-User-Hash
as header).
@ddeboer I looked at your code on Varnish restart
, but there is something I don't understand. You seem to send pure HEAD
requests, but these need to be triggered by an incoming GET
request, goal being to enrich it with a X-User-Hash
header so that one can make the response vary on it. Furthermore, why do you remove the cookie ? We definitely need it to figure out the user identity...
Or am I missing something ?
could we also cache the hash based on the cookie by the backend? (vary on cookie, of course)
Yes, we can make Varnish cache those.
a service we configure would be more in line with what we do elsewhere in this bundle. We chose the hash generator approach, with a service tag
Maybe the service is indeed the easiest solution.
Furthermore, why do you remove the cookie ?
To have Varnish cache the response. By default, Varnish doesn’t cache responses with a cookie. We can also change the VCL to have Varnish cache whether there’s a cookie or not.
You seem to send pure HEAD requests, but these need to be triggered by an incoming GET request,
They are trigged. It's like this:
Client GET (with cookie) Varnish
\/
Varnish HEAD to backend
\/
backend responds with hash calculated based on the cookie
\/
Varnish puts hash on the original GET request, and restarts that request
\/
if in cache, return to client;
if not, fetch from backend first.
@ddeboer OK, perfect. Even better than our solution if the user hash can be cached by Varnish. However, be aware that we had a bunch of issues with caching with the whole cookie string as key, since several services (like analytics) can alter the cookie a lot, making the cache useless. Therefore we improved by only using the session ID as key. Would it be possible with Varnish as well ?
when using varnish, you want to remove cookies like that of google analytics anyways. what we usually do is something like this:
# Cleaning up cookies https://www.varnish-cache.org/trac/wiki/VCLExampleRemovingSomeCookies#RemovingallBUTsomecookies
if (req.http.Cookie) {
set req.http.Cookie = ";" + req.http.Cookie;
set req.http.Cookie = regsuball(req.http.Cookie, "; +", ";");
set req.http.Cookie = regsuball(req.http.Cookie, ";(PHPSESSID)=", "; \1=");
set req.http.Cookie = regsuball(req.http.Cookie, ";[^ ][^;]*", "");
set req.http.Cookie = regsuball(req.http.Cookie, "^[; ]+|[; ]+$", "");
if (req.http.Cookie == "") {
remove req.http.Cookie;
}
}
This is straight from varnish doc, looks rather involved though.
Somebody having a custom logic that differentiates on other cookies will have to adjust this.
And of course the backend must send the x-hash with the correct livetime for the cookie(s) in question, to not keep a user authenticated after his login cookie is no longer valid. (We will then have a new HEAD request to the backend for the hash, and if the cookie/session is no longer valid, will get a login failure i would assume)
Hi,
We have to handle this use case in our current project, and the solution proposed seems perfect i just want to make a summary to be sure that the current workflow it's ok with everyone before proposing a PR.
To allow caching of this part we need to add the value of session_id in the hash
A HashGenerator service is used to generate a hash base on whatever
Each service tag with hash_identifier must implement an IdentityCacheInterface which allow to add a key/value to the cache, (for example we can tag the acl service to add a role: ROLE_USER
to the identifiy)
This hash is returned in the content of the response, if no service is tag, response will be empty
(Like ezPublish is doing actually)
Furthermore the AuthorizationListener will be able to be configured to send the response to HEAD Request with a TTL to allow caching for the 2. part.
Is that ok ?
@joelwurtz More or less, yes. We can still discuss details, whether it’s better to use vcl_hash
or the Vary
header.
One thing I’m not yet happy about is what you described as steps 2–4. If I’m not mistaken, with our current plan, every GET by the client requires a roundtrip to the (Symfony2) backend to calculate the user hash based on the cookie. Can we solve this by having the backend set a cookie containing the user hash (e.g., user+admin
) that Varnish can check for subsequent requests instead of doing a HEAD to the backend? Faking that cookie, however, would be very easy, of course.
with the Vary header, the backend needs to know the name of the bit that varnish is using. ideally only one part (either varnish or symfony) knows of names of such things, so that as little as possible can go wrong because of mismatch.
for the request, can't the header listener simply set cache headers and then varnish will cache that and see that its the same thing when using the same session?
can't the header listener simply set cache headers
Of course, have Varnish cache the user hash based on cookie! I guess that is what @joelwurtz meant with setting a TTL.
at work, we pondered using a memcache client plugin to store a session to authorization group mapping, but that feels somehow weird. i mention this more as a reminder that what we do should not prevent other paths if possible. (if not possible, one can just deactivate the listener and do ones own stuff when needed)
@ddeboer Yes that's what i mean with TTL :)
But this will only work for the same url, pages which are not been seen cannot have a HEAD cached (as it's impossible to know for varnish if the user as the rights, unless it's having a security flaw or logic inside VCL)
I'm not a fan of a Vary on Cookie (As we need to remove the others ones etc ...) Would it be better to extract the session_id in a X-FOSHttpCache-SessionId header and make the vary on it ?
I agree that adding yet another component to the mix (Memcache/Redis) wouldn't make much sense by default. We could make the setting of caching headers in our listener optional, but I would opt for a very simple listener that does good things by default and can be enabled/disabled depending on the user's needs.
Would it be better to extract the session_id in a X-FOSHttpCache-SessionId header and make the vary on it ?
That's exactly how I intended that we use the Vary
header.
But this will only work for the same url, pages which are not been seen cannot have a HEAD cached
Good point. We can also have a controller with route instead of a listener in this bundle. What if Varnish did a HEAD against that controller (so always the same URL)? I guess this works as the user hash is the same for all GET URLs (/articles
, /articles/123
). Or it may be easier to just keep the listener and do the HEAD request always to the same URL (/
).
Good point. We can also have a controller with route instead of a listener in this bundle. What if Varnish did a HEAD against that controller (so always the same URL)? I guess this works as the user hash is the same for all GET URLs (/articles, /articles/123). Or it may be easier to just keep the listener and do the HEAD request always to the same URL (/).
Yes but we only check Authentication in that case and not Authorization, as maybe the user have to right to see /articles/123 but not for /articles/456
IMO The only way to do this is to be able to set different url depending on the hash generated and make a mapping on Varnish but i think this is an overkill
Ooops forget what i say, it's only the hash which is cached and will be different so the next request should handle authorization
(I mess up with naming thing and cache invalidation, a little bit too complex :s)
(I mess up with naming thing and cache invalidation, a little bit too complex :s)
Our conversations about it were full of misunderstandings at eZ too. Complex domain :) Le 29 avr. 2014 18:37, "Joel Wurtz" notifications@github.com a écrit :
Ooops forget what i say, it's only the hash which is cached and will be different so the next request should handle authorization
(I mess up with naming thing and cache invalidation, a little bit too complex :s)
— Reply to this email directly or view it on GitHubhttps://github.com/ddeboer/FOSHttpCacheBundle/issues/30#issuecomment-41699890 .
Hey, i just added a first work of it (based on @ddeboer first attempt) with tests and vcl on this PR https://github.com/FriendsOfSymfony/FOSHttpCache/pull/55
If it seems ok to you, i will write the documentation and the implementation in this Bundle
Implementation available in this PR #57 (in case of you don't have notifications on this project)
Varying cache by user role is now available through #57 and #71. Thanks to everyone involved for working on this one!
The CacheAuthorizationListener listener returns immediately after Symfony's security check for HEAD requests. Currently, it returns an empty response. We can make the listener more useful, though, by returning a hash based on the user's roles. That hash can then be set as a custom header. If the caching proxy is then configured to vary on that header, different responses for different user roles could be cached. Additionally, you'd still be sure each user only sees content that he has access to (because of Symfony's security check).
The user hash should be definable by the user. Perhaps from configuration (
include_role
or something?) but preferably by throwing an event (build_user_hash
or something) that the user can add a listener for. This bundle then has its own listener with some sane defaults.This idea is inspired by the very nice context aware caching in ez Publish. But whereas their solution requires a Varnish module to be installed (CURL), I think our solution will work with plain VCL
restart
.