Check Redis configuration

reefdog commented 4 years ago

We get a lot of the following error while the scrapers are running:

[0] BRPOPLPUSH ReplyError: MISCONF Redis is configured to save RDB snapshots, but it is currently not able to persist on disk. Commands that may modify the data set are disabled, because this instance is configured to report errors during writes if RDB snapshotting fails (stop-writes-on-bgsave-error option). Please check the Redis logs for details about the RDB error.
[0]     at parseError (/data/repos/tech-and-check-alerts/node_modules/ioredis/node_modules/redis-parser/lib/parser.js:179:12)
[0]     at parseType (/data/repos/tech-and-check-alerts/node_modules/ioredis/node_modules/redis-parser/lib/parser.js:302:14) {
[0]   command: {
[0]     name: 'brpoplpush',
[0]     args: [
[0]       'bull:helloWorldCrawler:wait',
[0]       'bull:helloWorldCrawler:active',
[0]       '5'
[0]     ]
[0]   }
[0] }

Unsure if this is a permissions error, some sort of race condition, or… something else, but we should look into it!

reefdog commented 4 years ago

Handling this now because Redis is the primary culprit for our logs filling up crazyfast these days. We do want to keep dumping Redis if possible, to preserve and resume state upon system reboot.

First I verified that Redis is able to write to /var/lib/redis/dump.rdb at all:

$ ls -la /var/lib/redis/
total 260388
drwxr-x---  2 redis redis      4096 Sep 12 14:43 .
drwxr-xr-x 38 root  root       4096 Aug 27 19:35 ..
-rw-rw----  1 redis redis 266628725 Sep 12 14:43 dump.rdb

Then, I checked the Redis server logfile (/var/log/redis/redis-server.log), which showed this at one point:

31564:M 11 Sep 09:56:12.045 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 09:56:12.062 * Background saving started by pid 1297
1297:C 11 Sep 09:56:16.369 * DB saved on disk
1297:C 11 Sep 09:56:16.384 * RDB: 1 MB of memory used by copy-on-write
31564:M 11 Sep 09:56:16.474 * Background saving terminated with success
31564:M 11 Sep 10:01:17.018 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:17.018 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:23.024 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:23.024 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:29.024 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:29.024 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:35.024 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:35.024 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:41.024 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:41.024 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:47.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:47.025 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:53.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:53.025 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:01:59.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:01:59.025 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:02:05.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:02:05.025 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:02:11.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:02:11.025 # Can't save in background: fork: Cannot allocate memory
31564:M 11 Sep 10:02:17.025 * 10 changes in 300 seconds. Saving...
31564:M 11 Sep 10:02:17.047 * Background saving started by pid 1327
1327:C 11 Sep 10:02:23.437 * DB saved on disk
1327:C 11 Sep 10:02:23.465 * RDB: 1 MB of memory used by copy-on-write
31564:M 11 Sep 10:02:23.548 * Background saving terminated with success

Our current configuration for dumping is:

save 900 1 # 1 change in 900 seconds
save 300 10 # 10 changes in 300 seconds
save 60 10000 # 10,000 changes in 60 seconds

I also found this very relevant SO discussion.

So here's what's happening: during scrapes, Redis's memory usage balloons. We try to dump it to disk because it passes one of those save thresholds, but by default the system tries to allocate the same amount of memory Redis is already using yet again. Since this footprint is well more than we've usually got available, the write fails.

Evidently the most correct answer is to modify /etc/sysctl.conf with vm.overcommit_memory=1, which basically will let Redis start dumping without pre-allocating all the memory.

It's additionally possible that during scrapes, the dumpfile gets very large, and running out of disk space on the root drive (especially as we were filling up the root directory with noisy logs from Redis complaints). I haven't seen logs to that effect — and fixing the complaints will basically keep the logs from ballooning in the first place — but it doesn't hurt to have Redis dump to the big attached drive with more headroom.

Lastly, by default Redis is configured with stop-writes-on-bgsave-error yes, which causes it to stop writing entirely if the dumps stop working. The documented reason is because they want to escalate "we can't back things up to disk!" to the point of causing a very noticeable error to the user, but that only makes sense in environments where Redis's persistence is mission-critical. We can recover, so it makes more sense to let Redis continue writing to memory even if the dumping to disk fails.

So, the changes I'm going to make:

[x] Configure Redis to dump to the attached storage so we don't hit root drive limits
[x] Configure Redis to keep writing (to its in-memory store) even if the save to disk fails
[x] Configure sysctl to overcommit memory to relax the memory allocation so we don't run out of memory

reefdog commented 4 years ago

By the way! We should at least have a 5m talk about our persistence method. Right now we're using RDB, which saves to disk every n seconds and writes. This means smaller dumpfiles and less disk thrash, but failures in between saves mean we lose all writes since the last save.

The alternative is AOF, which appends to disk on every write. This ensures no more than one lost write (before failing out and stopping writes, I guess) and so far more data resilience, but the disk size of these files can get out of control without careful monitoring and cleaning up or whatever.

We're in a weird in-between state where we don't require the resilience of AOF, and yet our writes come in such a short window (generally one or two minutes of scraping each hour, then silence for 58 minutes) that the lower latency of AOF may be desirable.

Perhaps we just fine-tune the RDB settings? (For now, we're just going to go with default RDBs and have this convo eventually.)

reefdog commented 4 years ago

Okay, the above changes have been made. Redis is dumping to the attached storage and will continue running even if doing so fails, and the OS now won't pre-allocate "the size of Redis's current store" in RAM every time we try to dump. (The latter meant that when trying to dump a 1GB Redis store to disk, the system tried to allocate another 1GB on top of that used by Redis already, which is evidently an unnecessary precaution.)

We'll watch the logs and see if they quiet down after all this!

reefdog commented 4 years ago

Hrm. Seeing a whole lot of this in /var/log/redis/redis-server.log since yesterday's changes:

23372:M 13 Sep 16:11:29.046 * 1 changes in 900 seconds. Saving...
23372:M 13 Sep 16:11:29.065 * Background saving started by pid 11917
11917:C 13 Sep 16:11:29.065 # Failed opening the RDB file dump.rdb (in server root dir /data/redis) for saving: Read-only file system
23372:M 13 Sep 16:11:29.166 # Background saving error

That's… not true. It's not a read-only file system, and both /data/redis and /data/redis/dump.rdb are owned by redis:redis. The directory is drwxrwxr-x (775, I think) and the file is -rw-rw---- (660, I think), matching their original permissions. I'll look into it…

reefdog commented 4 years ago

Figured it out, thanks to SO. Since I changed the default Redis output directory, I needed to give the Redis service system-controlled permissions to write to this directory. What a strange stricture.

reefdog commented 4 years ago

An interesting warning from the redis startup:

12438:M 13 Sep 16:23:42.836 # WARNING you have Transparent Huge Pages (THP) support enabled in your kernel. This will create latency and memory usage issues with Redis. To fix this issue run the command 'echo never > /sys/kernel/mm/transparent_hugepage/enabled' as root, and add it to your /etc/rc.local in order to retain the setting after a reboot. Redis must be restarted after THP is disabled.

reefdog commented 4 years ago

@slifty @cguess Do y'all think we can close this? It was basically a tracking issue for resolving our configuration-related Redis problems, and things seem to be trucking along. If we surface more problems, they should likely be new issues, yeah?

slifty commented 4 years ago

Yep sounds reasonable to me

TechAndCheck / tech-and-check-alerts

Check Redis configuration #232