lcache / wp-lcache

A WordPress implementation of LCache
https://wordpress.org/plugins/wp-lcache/
GNU General Public License v2.0
88 stars 5 forks source link

`Error: One or more database tables are unavailable. The database may need to be repaired.` with `object-cache.php` enabled, no indication why #108

Closed lkraav closed 7 years ago

lkraav commented 7 years ago

Renaming object-cache.php resolves the issue.

Nothing is visible in error logs or mysql.err. How would you recommend I debug this?

I've had one issue with wp-lcache before, where had to fine tune innodb_data_file_path = ibdata1:10M:autoextend:max:512M where max:128M had to be increased to max:512M. Not sure if related.

Other subsites load fine.

EDIT restarting php-fpm resolved the deadlock. But why would this happen? Before restart:

$ [-] wp cache flush
Error: One or more database tables are unavailable. The database may need to be repaired.

EDIT I also now updated pecl-apcu-5.1.5 -> 5.1.7. Maybe some bug involved there. Will monitor for future re-occurrence.

danielbachhuber commented 7 years ago

Ah. WP LCache does use the database, so a crashed table may cause it to fail.

Glad you figured it out! If it becomes a repeat problem, let us know and we can look into how we can make it more fail safe.

lkraav commented 7 years ago

Ah. WP LCache does use the database, so a crashed table may cause it to fail.

But it wasn't a crashed table. All tables were fine. I simply restarted php-fpm which I believe clears apcu.

But of course, to pollute the data collection here... I may have emptied wp_lcache_* tables BEFORE the restart to see if that helps. It didn't. Then I restarted php-fpm. Which means that we can't really tell right now who got their state towards corruption, since wp-lcache didn't have anything to re-populate apcu with.

Anyway, awareness has been raised, maybe someone else will see something similar, or this will happen again. I have a feeling we'll be back here.

lkraav commented 7 years ago

Something is still going wrong here. WP-CLI crashes immediately regardless of restarts etc. mysqlcheck passes with flying colors for the whole database. As soon as object-cache.php is renamed, everything works.

Here's mysql query log of where we crash. There's a bunch of similar SELECTs on top, so I'm cutting the list to bottom five, and then there's that suspicious DELETE:

...
7 Query     SELECT "event_id", "pool", "address", "value", "created", "expiration" FROM wp_lcache_events WHERE "address" = '29:cr_1:lcache_alloptions_values:mainwp_db_version' AND ("expiration" >= 1481222716 OR "expiration" IS NULL) ORDER BY "event_id" DESC LIMIT 1
                    7 Query     SELECT "event_id", "pool", "address", "value", "created", "expiration" FROM wp_lcache_events WHERE "address" = '29:cr_1:lcache_alloptions_values:mainwp_extensions' AND ("expiration" >= 1481222716 OR "expiration" IS NULL) ORDER BY "event_id" DESC LIMIT 1
                    7 Query     SELECT "event_id", "pool", "address", "value", "created", "expiration" FROM wp_lcache_events WHERE "address" = '29:cr_1:lcache_alloptions_values:mainwp_manager_extensions' AND ("expiration" >= 1481222716 OR "expiration" IS NULL) ORDER BY "event_id" DESC LIMIT 1
                    7 Query     SELECT "event_id", "pool", "address", "value", "created", "expiration" FROM wp_lcache_events WHERE "address" = '29:cr_1:lcache_alloptions_values:mainwp_plugin_version' AND ("expiration" >= 1481222716 OR "expiration" IS NULL) ORDER BY "event_id" DESC LIMIT 1
                    7 Query     SELECT "event_id", "pool", "address", "value", "created", "expiration" FROM wp_lcache_events WHERE "address" = '29:cr_1:lcache_alloptions_values:mainwp_upgradeVersionInfo' AND ("expiration" >= 1481222716 OR "expiration" IS NULL) ORDER BY "event_id" DESC LIMIT 1
                    7 Query     INSERT INTO wp_lcache_events ("pool", "address", "value", "created", "expiration") VALUES ('5849aa3cec41f5.92269565:1790831999', '12:cr_1:default:is_blog_installed', 's:4:\"b:0;\";', 1481222716, NULL)
                    6 Query     DESCRIBE wp_users
                    7 Query     DELETE FROM wp_lcache_events WHERE "event_id" < '15826' AND "address" IN ('15:cr_site-options:1:ms_files_rewriting','12:cr_1:default:is_blog_installed')
                    7 Quit
                    6 Quit
161208 20:45:31     4 Query     SET global general_log = 0

EDIT actually, that INSERT looks malformed? ... OK, copy-pasting it from the log output, have to replace column name quotes with backticks.

lkraav commented 7 years ago

Solution may be wp core update-db --network here.

I was on beta releases for a while. strace actually points in the right direction by revealing open wp-includes/version.php syscall, but it took a while to decipher.

...
open("/home/warmpress/wp-includes/version.php", O_RDONLY) = 5
fstat(5, {st_mode=S_IFREG|0644, st_size=617, ...}) = 0
fstat(5, {st_mode=S_IFREG|0644, st_size=617, ...}) = 0
fstat(5, {st_mode=S_IFREG|0644, st_size=617, ...}) = 0
mmap(NULL, 617, PROT_READ, MAP_SHARED, 5, 0) = 0x7f2ec45d5000
munmap(0x7f2ec45d5000, 617)             = 0
close(5)                                = 0
lstat("/home/warmpress/wp-includes/l10n.php", {st_mode=S_IFREG|0644, st_size=43130, ...}) = 0
lstat("/home/warmpress/wp-includes/class-wp-locale.php", {st_mode=S_IFREG|0644, st_size=14739, ...}) = 0
lstat("/home/warmpress/wp-includes/class-wp-locale-switcher.php", {st_mode=S_IFREG|0644, st_size=5139, ...}) = 0
access("/home/warmpress/wp-content/db-error.php", F_OK) = -1 ENOENT (No such file or directory)
write(2, "\33[31;1mError:\33[0m One or more database tables are unavailable. The database may need to be repaired.\n", 101Error: One or more database tables are unavailable. The database may need to be repaired.
) = 101
...

It looks like everything sprang back to life. Exact underlying cause is still unknown to me, but maybe some database revision number was synced up somewhere using wp core update-db.

EDIT not even that simple, loading /wp-admin/ immediately regressed back to some cached invalid value. Had to remove wp-content/object-cache.php, re-run wp core update-db --network and NOW everything seems to be alive, since the right values maybe finally reached the db.

EDIT interestingly, several caching related problems reported with 4.7 https://make.wordpress.org/core/2016/12/08/dev-chat-summary-december-7th-4-7-launch-week/ I wonder if this belongs to the set.

danielbachhuber commented 7 years ago

Interesting.

Can you configure and test against a stock WordPress install on this machine? I'm curious as to whether there's a confluence of factors from other code, or whether this is an isolated issue with WP LCache.

Other subsites load fine.

Just to confirm, this issue is present across all sites on your network, right?