Closed argiepiano closed 2 years ago
A quick update: I'm not able to reproduce this issue in a clean install (despite what I said earlier).
So this may actually be an interaction with an installed module.
Is turning on the error display essential for the steps, or does the duplication happen regardless?
Also, a list of installed contrib modules might help to further investigate.
If that happens only on specific pages - which blocks does that layout contain?
I was able to reproduce on my test site (which is hosted on the same host, using LiteSpeed Web Server, as my live site) following the steps exactly. Here are installed and enabled modules:
Must be a better way of getting modules
The list on the page admin/reports/updates
is much shorter. No Core modules, and it doesn't show every sub module.
@yorkshire-pudding thanks so much for testing. In a way, it's a good thing that you are able to reproduce - it tells me it's not just my site or my hosting company.
@indigoxela: since the bug is throwing an error, I wanted to see if anyone testing would get the same errors. Turning on the error display doesn't make a difference in the bug. But if you turn off error display, you have to go to the watchdog log to see the errors.
In a nutshell, this is what I THINK is happening: when an anonymous user visits a "stale" cached page (in the cache_page
table), and when background fetch is on, the server sends both the stale page AND the newly generated page in the same request. Supposedly, (I believe) background fetching should be sending only the stale page right away, and in the background, cache a new version of the page. This was done to speed up responses when there are long processes involved in generating a page (e.g. when the site runs Cron before the response).
This Backdrop addition was created a few years ago. Apparently it works fine in Apache. This may be a LiteSpeed bug, but I don't know. More time and research is needed... which I don't have right now. So, for now, I have opted for turning background fetching off.
[EDIT] This is the list of modules in my site: Address Field 1.x-1.3.1 Backup and Migrate 1.x-1.0.20 Configurable Block Style 1.x-1.0.6 Content Access 1.x-1.2.1 Devel 1.x-1.8.1 Entity Plus 1.x-1.0.15 Entity Reference 1.x-1.7.0 Entity Tokens 1.x-2.0.0-rc1 Feeds 1.x-1.1.2 FlexSlider 1.x-2.0.1 Font Awesome 1.x-1.2.5 HMS Field 1.x-1.0.0 HTML5 upload 1.x-1.1.2 Job Scheduler 1.x-1.1.4 Mail System 1.x-3.0.6 Mime Mail 1.x-1.1.4 PhotoSwipe 1.x-2.1.3 Profile 1.x-1.1.0 SMTP Authentication Support 1.x-1.3.4 Special Menu Items 1.x-2.1.2 Telephone 1.x-1.0.1 Views Aggregator Plus 1.x-1.0.4 Views Data Export 1.x-3.2.0 Webform 1.x-4.24.2 WebForm 2 PDF 1.x-4.1.0 Webform Calculator 1.x-2.1.0 Webform Views Select 1.x-1.0.1
@indigoxela , it turns out you are right: turning error display IS actually essential to reproduce the error! If you disable all error/warning message display, the error doesn't happen.
So, it looks like the issue happens because PHP throws a warning, and that somehow triggers both the cached page and the new fetched page to be sent together. Those warnings are never logged into watchdog - so, you never know that they happened.
I have also now been able to reproduce this behavior in a clean install. (Yesterday I said I hadn't - but I had error message display off, which is why they did not happen).
It'd be interesting to see if this is also the case with Apache servers. Anyone care to test? What you'd need to do:
I'm not able to reproduce with Apache. I think, I followed the steps exactly:
And error display turned on, of course.
Very odd.
It'd be interesting to see if this is also the case with Apache servers. Anyone care to test?
@argiepiano this is easy to test with our demo sandboxes, since they run on Apache (v2.4.38 currently). Here's what I did:
/admin/config/development/performance
and:
/admin/config/development/logging
, and set all messages to be displayed.No errors, nothing logged in the dblog. So this seems to be limited to the LightSpeed server it seems.
Since we seem to have pretty much narrowed this issue down to a specific web server issue, would a temporary solution be to disable background fetch if LiteSpeed server is detected? (with a @todo
added to further investigate this in the future)
FTR, the background fetch feature serves the purpose of addressing a very specific, low-impact annoyance, which is a slow page load in low-traffic sites when the site has not been visited by anyone in a while. This annoying thing only happens to the first site visitor that comes to the site after a while, and IMO it is not worth serving "broken pages" on account of that.
The problem only manifests with that enabled and errors set to display. As most production sites should have errors disabled from display, please can we only have background fetch disabled if:
webServer = LiteSpeed && errorMessageDisplay != 'None'
@yorkshire-pudding yes 👍🏼 ...I meant to introduce a backdrop_is_litespeed()
function to be used in #5524 and elsewhere anyway 😉
I still wonder, what makes LiteSpeed choke on messages + background fetch.
I don't see an obvious reason for that.
I agree - my host said it should be a drop-in replacement, so there must be a very subtle difference.
I still wonder, what makes LiteSpeed choke on messages + background fetch.
I don't see an obvious reason for that.
I spent a bit of time on Saturday trying to figure that out. It's very difficult to debug, unless you have LiteSpeed installed in your local. For example, watchdog doesn't have any effect at that point (probably not bootstrapped yet to that level). Same thing with dpm.
I did find out a couple of things in the duplication: the first elements are sent from cache (I manually modified it to test), while the duplication is generated on the fly. It seems like the error we saw in my original post always happens in LiteSpeed, but it affects the response only when you have the error display on.
All of what follow is something I can't confirm or verify (since this is so hard to debug), but it looks like, when the error happens and the error display is on, Backdrop decides to generate the page and send it again, including headers (despite the fact that it had already sent it from cache). This may actually be "expected" behavior when you have an error (sending a newly generated page instead of using the cache) - but I haven't verified that. So, there may be a bug after all, except that in Apache that error is never generated, and this situation is never encountered.
Again all of this is speculation and it may not be accurate. One thing that may help is to find out a way to debug without outputting stuff to the browser. I thought of writing a little function that saved stuff to a file instead, but I have found myself very short on time with a few deadlines these days. Any thoughts on this?
Another piece of weird info: the errors that you see in my screenshot in my original post: they are themed by Backdrop, but they are never logged into the dblog/watchdog
The main location of Backdrop's background fetch cache code is in _backdrop_bootstrap_page_cache()
. The central concept is that when a cached page expires, Backdrop serves one stale copy back to the browser, but keeps the PHP process open to generate a fresh cache for the next request.
This section of code in particular is responsible for closing the connection between the web server and PHP. Which is intended to let PHP continue processing while the web server returns the cached page to the browser:
// Close the connection immediately if we are about to serve an expired
// cache entry. This allows for background processing without holding up
// the browser.
if ($cache_expired) {
header('Connection: close');
}
// End the request and send the response to the browser. This call has
// error suppression on it as some PHP versions may require it (PHP 5.3)
// but other PHP versions throw a warning (PHP 5.5).
@ob_end_flush();
// Flushing for PHP-FPM based installations.
if (function_exists('fastcgi_finish_request')) {
fastcgi_finish_request();
}
// Flushing for Apache mod_php installations.
else {
flush();
}
I'm guessing what's happening here is that LiteSpeed needs a specialized way of having PHP return HTML and close the connection to the web server. If the connection fails to be closed, you would end up with the same page rendered twice (one the old cached copy, and a second fresh copy). I did some searching and it looks like Symfony ran into the same problem recently: https://github.com/symfony/symfony/issues/42293. There apparently is a very poorly documented function litespeed_finish_request()
that serves the same purpose for LiteSpeed servers.
Separately, I'm not sure this really counts as a bug report, since we only officially support Apache and nginx (see Requirements). If we add this (add the corresponding backdrop_is_litespeed()
function), we may be indicating official support.
I don't have a LiteSpeed server set up to test, but I think this PR should solve the fundamental problem: https://github.com/backdrop/backdrop/pull/3981.
I've reviewed and tested on my LiteSpeed server, and this patch solves the issue. LGTM 👍🏽 Thanks, @quicksketch!
I'm not sure how to mark this issue if it's not considered a bug. Feature request? Task?
I'm not sure this really counts as a bug report, since we only officially support Apache and nginx ... we may be indicating official support.
Fair enough. Then it's (sort of) a new feature. I think, we should support LiteSpeed, as its usage seems to increase.
I've tested on my LiteSpeed Server and it fixes the issue. I agree that we should support LiteSpeed going forward; my host, Kualo, has moved to it and more and more are using it. If we want to badge Backdrop as being suitable for shared hosting, then we should probably support this as more shared hosts implement it. There is an open source version (Open LiteSpeed).
Thanks @argiepiano and @yorkshire-pudding! Could either (or both) of you take a look at the other LiteSpeed issues we have open?
I merged https://github.com/backdrop/backdrop/pull/3981 into 1.x and 1.21.x. "Bug" or "Feature", I don't think there's any harm in getting this into the next bugfix release.
Thanks @quicksketch - I've looked at all those: I've tested #5524 and #5150 and both work. I've added comments to #3584 and #3386 to confirm that I do not experience these issues on LiteSpeed. I think they are/were limited just to OpenLiteSpeed. It may be that these are now sorted, but I'm not sure how to test that as setting up a test server outside lando is a bit beyond me.
Thank you @yorkshire-pudding! I didn't realize there would be a difference between LiteSpeed and OpenLiteSpeed. I appreciate your checking on each of them.
Description of the bug
When using "background fetch" (see admin/config/development/performance) of stale anonymous cached pages, Backdrop (or the LiteSpeed server?) seems to send headers and content twice.
I've been able to reproduce in a LiteSpeed production server with different sites,
including a clean Backdrop install. Have NOT been able to reproduce on a local Apache one.This issue causes duplication of content and a lot of problems when rendering the page (see attached image), among them also
Warning: session_set_save_handler(): Cannot change save handler when headers already sent in backdrop_session_initialize() (line 245 of /home/tenutowe/public_html/backdrop/core/includes/session.inc).
Steps To Reproduce
These are very specific steps. Use two browsers, or one browser window on "normal" mode and another on "incognito" . Again, this is an issue in LiteSpeed server as far as I know.
On a different browser (browser B):
Upon logout you will be redirected to the Home page. This version of the Home page is retrieved from a stale entry in the cache_page table, which caches anonymous pages. You should see duplication of elements and the Warning I printed above.
You can verify that the response came from the Backdrop cache by opening the Developer Tools of the browser, going to "Network", selecting the document, and clicking "Header" to see the response headers received. They should indicate a cached page on the Response header
x-backdrop-cache: HIT
(if you see MISS, you are not seeing a cached page).Actual behavior
See above
Expected behavior
The page should load without any issues.
Additional information
ANOTHER SNAPSHOT, FROM A CLEAN INSTALL: