modxbot / migrate

A testground for migrating issues and other such fun

Blank startpage, somehow caused by cache #3111

Closed mindeffects closed 12 years ago

mindeffects commented 13 years ago

mindeffects created Redmine issue ID 3111

I have a strange problem with two different websites running MODx Revo 2.0.4-pl2 and 2.0.5-pl: The start page (id=1) "forgets" the value of "[ [publishedon] ]" from time to time. The field is just empty. I have not checked the actual MySQL value, because the problem does not occur very often (but often enough). This makes the whole website disappear, since MODx thinks that the page has not yet been published and does not create any output. Just a blank white page. And this is funny, because ALL other pages work fine. It's just the start pages (id=1) that are affected.

The bad thing is: I am not alone! Look here: http://modxcms.com/forums/index.php/topic,58532.0.html

Clearing the cache usually helps, but only if you know that the page is "down" and revive it manually. Since there is nothing, not even a menu, to be found on the blank page, the visitor has no chance to go to one of the other existing pages and leaves, tries again tomorrow, only to get the white page again, and NEVER comes back. And that sucks a lot.

Update: The "white ghost" hit again! This time I got the chance to dig a little in the database and found: nothing! The date fields had the usual entries (and this time a value was visible in the field "publishedon" (damn)). The only thing I saw was that "created by" had the value "0" (not "1"), which presumably means that this page was created by MODx and not by a user. Anyway, changing this value did not help. In fact, none of the DB manipulations helped.

But: I opened the resource (the white start page), changed something (inserted a blank and removed it again) and hit "save". Since "cacheable" and "empty cache" were set to "on", the cache was refreshed and bam, the site was back to normal!

So, I dare to make a statement here: The problem is the cache!

Some workarounds:

  1. disable the cache for that page (and see if it helps)
  2. disable the system cache (big thing for an error only on one page)
  3. change the "Expiration time for default cache" to get a clean cache after some time, e.g. 12 hours or so.

I went for 3 and will see what happens. On the other website I will do 1.

Man, this is a really annoying bug! Since there does not seem to be any rule, I cannot work out the circumstances under which the error occurs. This is driving me nuts (and the massive snow outside ;-)

I would really like to supply you guys with all the information there is. But "MODX System Info" kinda sucks, because I cannot copy/paste the text because of that massive div "flood". It would be a nice "make bug tracking easier" feature for 2.0.6 just to use text or - now comes the bad word - a table. The law: Only use tables where they make sense! Like in this case...

OK, how can I get you the information?

Browser: Does not matter. Blank page on all browsers, all OSes, all lifeforms.

Rest: I attached a file with the "system infos" (at last ;-). Maybe this helps.

THANKS A LOT FOR MODx AND YOUR SUPER GREAT WORK! Oliver

modxbot commented 13 years ago

danny_kay1710 submitted:

The server my site is running in is an 8 Core Xeon@2.00Ghz with 16GB RAM. It is more than capable of the load it is running.

Again the big question surrounding all of this is why just the start page. The error pages are all directed elsewhere and have been tested to ensure the setting is actually applying.

Surely if it was a bug in that version of PHP then it would occur on all pages, not just one? A bug in PHP that at an unforeseen time completely prevents only a certain page in a single application framework from working seems just a little far-fetched to me.

I am happy to provide any more information that you require. Please just ask.

cyclissmo commented 13 years ago

cyclissmo submitted:

@Ryan: Did you get my forum PM? I posted a technique to trigger the white screen. I didn't think it would be prudent to have a script in the open that could take some Revo sites offline. Let me know how I can help.

mindeffects commented 13 years ago

mindeffects submitted:

Mike Zeballos wrote:

@Ryan: Did you get my forum PM? I posted a technique to trigger the white screen. I didn't think it would be prudent to have a script in the open that could take some Revo sites offline. Let me know how I can help.

@Mike: I would also love to have your script, since my own script did not manage to trigger the white screen :-( I just have to make sure that this one client of mine will not get blanked again and that I did everything to prevent it! You can find my e-mail contact at www.mindeffects.de. Thanks in advance! Oliver

Greex commented 13 years ago

greex submitted:

Hello from another German user,

first: After working with Wordpress, Drupal, Typo3, Contao ... ModX is the BEST CMS I ever worked with and I already infected some other website workers with the ModX virus. Even my customers are very happy with the manager ... but this error is able to ruin all expectations.

I think I have the same thing here. I have to say that I already changed the whole server because of this error. The new server has a totally different setup than the first one, but the same error occurs:

Without any visible reason, the start page went from >6 KB to 462 bytes and threw a 500 server error. All other pages and the manager are fine. After deleting the cache, the start page is back again. A few days ago, I disabled caching for the start page. But the next day my customer called to tell me that another page was gone. He was able to delete the cache himself, but he was not happy :/ So I enabled the cache again and set up an "Is-Alive" script for the start page.
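An "is alive" check of the kind mentioned above could look roughly like the following Python sketch. It is not the script used on this site. The function names and the 1000-byte threshold are assumptions, chosen only because the healthy start page in this report is around 6 KB and the broken one is a few hundred bytes.

```python
import urllib.error
import urllib.request

# Assumption: a healthy start page is far larger than the ~400-byte error page.
MIN_BODY_BYTES = 1000


def looks_healthy(status: int, body_length: int, min_bytes: int = MIN_BODY_BYTES) -> bool:
    """Return True when the response looks like a real page, not a blank or error page."""
    return status == 200 and body_length >= min_bytes


def check_start_page(url: str, timeout: float = 10.0) -> bool:
    """Fetch the start page and judge it by status code and body size."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            body = response.read()
            return looks_healthy(response.status, len(body))
    except urllib.error.HTTPError as error:
        # urlopen raises HTTPError for 4xx/5xx responses such as the 500 seen here.
        return looks_healthy(error.code, 0)
    except (urllib.error.URLError, OSError):
        # DNS failure, refused connection, timeout and similar problems.
        return False
```

Run from cron every minute or so, a False result could trigger a notification or a cache clear. Checking the body size as well as the status code also catches the variant where the blank page is delivered with a 200 status.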

To say something more about the setup:

This website is the only one on this "new" server. It's a dedicated root server with power for a lot more sites than this small one. Something more: The site moved from a very old CMS after 6 years and there was a decision not to move all files, URLs and images to the new site. So I have a lot of 404/301 responses, especially from bot visits.

So the access_log looks like this:

38.99.96.89 - - [13/Feb/2011:00:35:53 +0100] "GET /index.php?ref=nf&pgID_Newsticker=1&menuid=366 HTTP/1.1" 200 6114 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
38.99.96.89 - - [13/Feb/2011:00:36:01 +0100] "GET /index.php?ref=nf&pgID_Newsticker=2&menuid=366 HTTP/1.1" 200 6111 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"
66.249.72.105 - - [13/Feb/2011:00:36:01 +0100] "GET /?newsid=331&rss=1&menuid=366&page=97 HTTP/1.1" 200 5845 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
38.99.96.89 - - [13/Feb/2011:00:36:09 +0100] "GET /index.php?ref=nf&pgID_Newsticker=3&menuid=366 HTTP/1.1" 500 462 "-" "Mozilla/5.0 (compatible; ScoutJet; +http://www.scoutjet.com/)"

The requests to /index.php?xyz are a legacy of the old CMS (Super-SEO work ... :/ )

First request: Fine
Second request: Fine
Third request: Fine
Fourth request, from the same bot as the first and second: Boom!
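To look for such a pattern across a whole access_log rather than by eye, something like the following Python sketch might help. It assumes the Apache "combined" log format shown in the excerpt above. The function names are made up for illustration. It returns the first 5xx response together with the few requests that came right before it.

```python
import re
from collections import deque

# Apache "combined" log format, matching the excerpt above.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) (?P<protocol>[^"]*)" '
    r'(?P<status>\d{3}) (?P<size>\d+|-) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)


def parse_line(line):
    """Parse one log line into a dict, or return None if it does not match."""
    match = LOG_LINE.match(line)
    if match is None:
        return None
    entry = match.groupdict()
    entry["status"] = int(entry["status"])
    return entry


def first_failure_with_context(lines, context=3):
    """Return (previous_entries, failing_entry) for the first 5xx response, or None."""
    recent = deque(maxlen=context)  # keeps only the last `context` good requests
    for line in lines:
        entry = parse_line(line)
        if entry is None:
            continue  # skip lines that are not in the expected format
        if entry["status"] >= 500:
            return list(recent), entry
        recent.append(entry)
    return None
```

Comparing the user agents and paths of the preceding requests over many failures would show whether bot traffic really precedes the error, or whether bots are simply the most frequent visitors.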

Here another example from two days earlier:

217.231.92.66 - - [11/Feb/2011:11:30:46 +0100] "GET / HTTP/1.1" 200 6185 "http://www.cycling-cup.de/index.php?id=24" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 6.1; WOW64; Trident/4.0; SLCC2; .NET CLR 2.0.50727; .NET CLR 3.5.30729; .NET CLR 3.0.30729; Media Center PC 6.0; .NET4.0C)"
66.249.68.169 - - [11/Feb/2011:11:30:46 +0100] "GET /?home=www.dee...d3.txt%3F&pgID_Newsticker=2&page=49 HTTP/1.1" 200 6099 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"
[... a lot of images out of a newsletter ...]
94.216.238.117 - - [11/Feb/2011:11:32:46 +0100] "GET / HTTP/1.0" 500 411 "-" "-" <-- a "is alive" call from a small script that i wrote to react as quickly as possible
80.153.229.243 - - [11/Feb/2011:11:33:51 +0100] "GET / HTTP/1.1" 500 365 "-" "Mozilla/4.0 (compatible; MSIE 8.0; Windows NT 5.1; Trident/4.0; GTB6.5; .NET CLR 1.1.4322)"

I am not sure, but in my logs the problem only seems to occur after a bot visit. Not sure if that is because the site does not have much traffic at the moment and bots are coming more often than real users, but perhaps it's an idea.

I would be so glad if you could look at this. I have a good relationship with my customer, and his happiness with the manager is still high enough that he is giving me some time to fix this problem. But I'm not sure how long this will be OK, and I have not been getting much sleep in the last few days.

Sorry for my poor English.

Greetings from Germany,

Sebastian

opengeek commented 13 years ago

opengeek submitted:

There are some significant cache refactorings now in the develop branch for the 2.1.0 release, and more on the way, that should help avoid any potential conflicts involving file locking and reduce the chances that these blank pages are being caused by the caching system itself. I'll create a build in the next few hours with these latest changes for testing if anyone wants to see if it resolves their issues, and will be working on documentation for best practices in developing caching strategies/configurations based on various deployment profiles.

esnyder commented 13 years ago

esnyder submitted:

I have a better workaround than making the homepage uncacheable, but it only works if your homepage can be served equivalently as a static HTML file.

Simply copy the HTML source code for the homepage to a text file, and call it index-static.html.

Then add this line to your .htaccess, right before the friendly URL redirect:

RewriteRule ^$ index-static.html

This rewrites (note that it's a rewrite not a redirect, that's important) requests for root to the static file, while still allowing MODx to serve requests for all other pages. Requests for root will load fast, and the rest of the site can be served fast from the cache.

Keep in mind that you'll need to update index-static.html whenever you make changes that affect the content of the homepage.

By the way, I'm not actually seeing this bug. I implemented the above workaround because I can't afford to have my homepage go down even for a few minutes, and its content very rarely changes anyway.

opengeek commented 13 years ago

opengeek submitted:

Alright, I believe this mystery has now been solved. The problem turns out to be that the process of caching Resources is not properly checking to make sure the Resource wasn't loaded from the cache before caching it. This means the cache file was being re-written unnecessarily even when it was successfully read from the cache. This easily triggers race conditions since the file is being written by almost every request for a specific Resource, especially the site_start, and especially if used as the error_page as well.
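The idea behind the fix can be illustrated with a small Python sketch. This is not the MODX code and not a translation of the commit. The class and function names are hypothetical, and a JSON file stands in for the real cache file. The point it shows is the check described above: a Resource that was read from the cache is not written back. The temp-file-plus-rename write is an additional, commonly used safeguard against half-written files and is an assumption here, not something this ticket says MODX does.

```python
import json
import os
import tempfile


class CachedResource:
    """Hypothetical stand-in for a resource that remembers where it was loaded from."""

    def __init__(self, content, from_cache=False):
        self.content = content
        self.from_cache = from_cache


def load_from_cache(path):
    """Return the cached resource, or None on a miss or an unreadable file."""
    try:
        with open(path, "r", encoding="utf-8") as handle:
            return CachedResource(json.load(handle)["content"], from_cache=True)
    except (OSError, ValueError, KeyError, TypeError):
        # Missing, truncated, or malformed file: treat it as a cache miss.
        return None


def write_cache(path, resource):
    """Write to a temp file, then rename, so readers never see a partial file."""
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "w", encoding="utf-8") as handle:
            json.dump({"content": resource.content}, handle)
        os.replace(tmp_path, path)  # atomic when both paths are on one filesystem
    except Exception:
        os.unlink(tmp_path)
        raise


def get_resource(path, render):
    """Serve from cache when possible; write the cache only after a real render."""
    resource = load_from_cache(path)
    if resource is None:
        resource = CachedResource(render(), from_cache=False)
    # The missing check: do not rewrite the file if it was just read from the cache.
    if not resource.from_cache:
        write_cache(path, resource)
    return resource
```

Without the from_cache check, every request for the start page would rewrite the same file, so a concurrent reader has many more chances to catch it half-written. With the check, writes happen only on a cache miss.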

You can see commit details for the fix applied to 2.0.7-pl at https://github.com/modxcms/revolution/commit/82ad456dcc67a11255b1c539f570362525f3f992 and a 2.0.8-pl will be released shortly to address this critical production bug.

If you are experiencing this bug and can apply this fix manually to confirm it does resolve the problem, please do and report back in this ticket.

opengeek commented 13 years ago

opengeek submitted:

Marking this resolved, as this is addressed in 2.0.8-pl and I have not gotten any feedback on the problem still occurring. Will not close until 2.1.0-rc-1 is released, however.

meezyart commented 12 years ago

meezyart submitted:

I would just like to report that as of late I have been experiencing this issue as well. I'm running the newest version of MODX and the client is at his wits' end. Is there a firm solution for this before we lose a client?

MODX Revolution 2.2.0-pl2 (traditional) php version: 5.2.17

White screen on the start page and occasionally on the other pages too.

kenquad commented 12 years ago

kenquad submitted:

Have you tried disabling cache sitewide as a stopgap fix?

opengeek commented 12 years ago

opengeek submitted:

Please, DO NOT update the target version on CLOSED tickets. This bug was resolved in MODX 2.1. If you have a bug that you think exhibits similar behavior to this ticket, enter a new ticket and reference the closed ticket.