contao / core-bundle

[READ-ONLY] Contao Core Bundle
GNU Lesser General Public License v3.0
123 stars 58 forks source link

Strip x-cache-tags header #1642

Closed leofeyer closed 6 years ago

leofeyer commented 6 years ago

I think the x-cache-tags header should be stripped:

Toflar commented 6 years ago

PR to allow for that is ready https://github.com/FriendsOfSymfony/FOSHttpCache/pull/422, we'll see how fast the new version might be released, otherwise we can just provide our own listener until it's released. But let's give @dbu a few days to review what I did first πŸ˜„

leofeyer commented 6 years ago

This is a bigger problem than I first though. It will also lead to upstream sent too big header while reading response header from upstream errors with the default Nginx settings:

fastcgi_buffers 8 4k
fastcgi_buffer_size 4k

I had to double those values to make it work:

fastcgi_buffers 16 8k
fastcgi_buffer_size 8k

The home page of the COD sends the following x-cache-tags header:

contao.db.tl_module.23,contao.db.tl_module.24,contao.db.tl_module.47,contao.db.tl_module.1,contao.db.tl_module.36,contao.db.tl_article.38,contao.db.tl_content.158,contao.db.tl_content.160,contao.db.tl_content.315,contao.db.tl_content.316,contao.db.tl_content.159,contao.db.tl_module.38,contao.db.tl_article.1,contao.db.tl_content.1,contao.db.tl_content.46,contao.db.tl_module.1,contao.db.tl_article.37,contao.db.tl_content.47,contao.db.tl_content.40,contao.db.tl_content.41,contao.db.tl_content.42,contao.db.tl_content.116,contao.db.tl_content.44,contao.db.tl_content.45,contao.db.tl_module.4,contao.db.tl_module.37,contao.db.tl_module.61,contao.db.tl_news.7,contao.db.tl_news_archive.1,contao.db.tl_news.3,contao.db.tl_news_archive.1,contao.db.tl_news.8,contao.db.tl_news_archive.1,contao.db.tl_module.5,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.5,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.8,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.7,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_calendar_events.6,contao.db.tl_calendar.3,contao.db.tl_module.35,contao.db.tl_module.38,contao.db.tl_article.30,contao.db.tl_content.91,contao.db.tl_module.25,contao.db.tl_module.30,contao.db.tl_article.50,contao.db.tl_content.118,contao.db.tl_module.50

Toflar commented 6 years ago

Sure, there are many tags πŸ˜„

leofeyer commented 6 years ago

Can we remove the contao.db. prefix to decrease the header size?

Toflar commented 6 years ago

That makes no sense at all. For us the header length wonβ€˜t be a problem. For Varnish youβ€˜d go with multiple xkey headers anyway and then thereβ€˜s still gzip which would reduce the size of your example to like 0.24kb :)

leofeyer commented 6 years ago

But the prefix is meaningless, isn't it? It could as well be contao. or nothing at all.

Toflar commented 6 years ago

The prefix is important so it does not collide with tags third party bundles will add. And again, because they all use the same prefix, when gzipping it wouldn't make any difference.

leofeyer commented 6 years ago

Ok, I still think a shorter prefix would do as well, but whatever. πŸ˜„

dbu commented 6 years ago

i just reviewed the PR on the library and asked for some improvements on wording, but fully agree we need to solve this and am happy with the solution @Toflar found. i can tag a new minor version after merging so that you can start using this.

Toflar commented 6 years ago

@dbu thank you so much for helping us out on this! I just looked at the header @leofeyer got again and I see a lot of duplicate tags. Shouldn't the response tagger also do an array_unique() here:

https://github.com/FriendsOfSymfony/FOSHttpCache/blob/master/src/ResponseTagger.php#L126

leofeyer commented 6 years ago

Thank you @dbu and @Toflar. Your help is much appreciated.

leofeyer commented 6 years ago

After merging contao/manager-bundle#80 the list of tags is much shorter thanks to array_unique(). @Toflar Is it correct that the x-cache-tags header is still visible in dev mode?

Toflar commented 6 years ago

Well...correct...define what's "correct" πŸ˜„ In dev mode we have no HttpCache so none of the HttpCache listeners is executed (= no tags removed and the tags are not handled anyway). It allows you to debug the tags that are sent. But it might still cause the header too big. Not sure what to do here, really. Either we don't tag the responses in dev mode (= cannot debug anymore) or we leave it as is (= risking header too big) or we do even more complicated stuff like moving the tags away from the response into the SF profiler. But well...

leofeyer commented 6 years ago

Moving the debug information into the Sf profiler sounds reasonable. But I guess that's a Contao 4.7 task, isn't it?

Toflar commented 6 years ago

Certainly not something I can do right now. And also, that sounds like it would be a nice addition to the HttpCacheBundle anyway and not Contao itself. The bundle already extends the library ResponseTagger. We could extend that with an option to configure the behaviour and add a data collector to the webprofiler for the HttpCacheBundle.

leofeyer commented 6 years ago

I have added this to the agenda of the next developer's meeting.

dbu commented 6 years ago

not sure if we should add that to FOSHttpCacheBundle. it will only help when using the symfony cache kernel, but not with varnish.

for tag invalidation, we have a mechanism to send multiple requests if the list of tags is too long. but on the response, i don't think we can split that header. or could we add multiple headers with the same name?

if single tags are excessively long, we could also see about hashing them to 8 characters.

Toflar commented 6 years ago

not sure if we should add that to FOSHttpCacheBundle. it will only help when using the symfony cache kernel, but not with varnish.

Why? It would allow you to easily debug what tags are sent on the response? Or would you do that using std.syslog()?

or could we add multiple headers with the same name?

According to the RFCs that must be allowed. But I'm pretty sure Symfony doesn't support that (there's some special handling for Set-Cookie but it doesn't support it in general).

dbu commented 6 years ago

Why? It would allow you to easily debug what tags are sent on the response? Or would you do that using std.syslog()?

i would simply look at the header of the response to see what tags are on the response. i don't see what a web profiler plugin would add over that. but maybe i am missing something and there would be more useful information that can be made available?

Toflar commented 6 years ago

The problem is you cannot look at the header if it is too long because your server will just fail saying the header value is too long πŸ˜„

aschempp commented 6 years ago

I think this is what @dbu meant. The Symfony profiler already shows all response headers, so we don't need anything for the cache headers.

bildschirmfoto 2018-08-07 um 17 14 57

leofeyer commented 6 years ago

Then we should try to strip the headers in dev mode as well and check if they still appear inside the debug console.

Toflar commented 6 years ago

Ah true, I see. Also, Symfony's Response does support multiple headers so we could indeed split them into multiple headers in the ResponseTagger.

Stripping them in dev mode does not work, there's nowhere you can do that (except for editing the app_dev.php before $response->send();).

leofeyer commented 6 years ago

But even if the tags were split into multiple headers, they would still be sent in dev mode, wouldn't they? So it seems we need a permanent solution to strip the header (or headers) before sending the response.

Toflar commented 6 years ago

No, we want them to be sent in dev mode?

leofeyer commented 6 years ago

No, we want them to be shown in the Symfony debug bar but we do not want them to be sent. See https://github.com/contao/manager-bundle/commit/3759c0176ebaf174f7bb0ac6739e60d6564eba9a

Toflar commented 6 years ago

No that's not correct. We want them to always be present.

leofeyer commented 6 years ago

What's the point of https://github.com/contao/manager-bundle/pull/80 then?

Toflar commented 6 years ago

In production, they need to be removed because we have our own proxy. But the dev mode is equal to the prod mode without HttpCache for e.g. Varnish. The tags must be sent so you can use the headers.

leofeyer commented 6 years ago

so you can use the headers

For what?

Toflar commented 6 years ago

Cache tagging?

leofeyer commented 6 years ago

Then why do we strip them in contao/manager-bundle#80?

Toflar commented 6 years ago

I don't understand what's so complicated about that. Again:

leofeyer commented 6 years ago

it makes absolutely no sense to keep this solution once splitting is implemented.

I agree. And I also don't see any reason why we need it before splitting is implemented. Do you?

Toflar commented 6 years ago

Because otherwise the header is too long? Can we have a call? This is leading nowhere πŸ˜„

leofeyer commented 6 years ago

No, its fine. This is actually the answer to my question from an hour ago. πŸ˜„

But IMHO we should revert the changes then. The header size has already decreased dramatically thanks to array_unique() and stripping the header means it will not work with Varnish. Do you agree?

Toflar commented 6 years ago

We only strip it in dev mode, which will never be used for Varnish πŸ˜„ And stripping them just makes sure it never crashes. Just leave it as is, it's fine. We can remove that again once FOSHttpCache supports splitting up the header into multiple ones. πŸ˜„

leofeyer commented 6 years ago

Ok.