benbjohnson / litestream

Streaming replication for SQLite.
https://litestream.io
Apache License 2.0

Crash on writing (deleting?) snapshot #558

Open liffiton opened 10 months ago

liffiton commented 10 months ago

I've been running Litestream for months with no configuration changes and no issues, but a few days ago it crashed during a normal snapshot operation. It had just written a new snapshot, and something appears to have gone wrong while deleting a previous one. I'm using v0.3.13 and writing to Backblaze B2 via the S3 API.

litestream[411]: time=2024-01-04T19:25:23.637-06:00 level=INFO msg="write snapshot" db=<my DB file> replica=s3 position=5ef78de18fe20421/00004a76:4152
litestream[411]: time=2024-01-04T19:25:25.204-06:00 level=INFO msg="snapshot written" db=<my DB file> replica=s3 position=5ef78de18fe20421/00004a76:4152 elapsed=1.560554481s sz=23182400
litestream[411]: panic: runtime error: invalid memory address or nil pointer dereference
litestream[411]: [signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0xea55a3]
litestream[411]: goroutine 36 [running]:
litestream[411]: github.com/benbjohnson/litestream/s3.deleteOutputError(0xc00042c940)
litestream[411]:         /home/runner/work/litestream/litestream/s3/replica_client.go:766 +0x63
litestream[411]: github.com/benbjohnson/litestream/s3.(*ReplicaClient).DeleteSnapshot(0xc00062b100, {0x17de4f8, 0xc00013e190}, {0xc00054b5af, 0x10}, 0xc000763fc0?)
litestream[411]:         /home/runner/work/litestream/litestream/s3/replica_client.go:318 +0x29a
litestream[411]: github.com/benbjohnson/litestream.(*Replica).deleteSnapshotsBeforeIndex(0xc0005de0f0, {0x17de4f8, 0xc00013e190}, {0xc00054b5af, 0x10}, 0x4a76)
litestream[411]:         /home/runner/work/litestream/litestream/replica.go:627 +0x225
litestream[411]: github.com/benbjohnson/litestream.(*Replica).EnforceRetention(0xc0005de0f0, {0x17de4f8, 0xc00013e190})
litestream[411]:         /home/runner/work/litestream/litestream/replica.go:604 +0x38b
litestream[411]: github.com/benbjohnson/litestream.(*Replica).retainer(0xc0005de0f0, {0x17de4f8, 0xc00013e190})
litestream[411]:         /home/runner/work/litestream/litestream/replica.go:726 +0x138
litestream[411]: github.com/benbjohnson/litestream.(*Replica).Start.func2()
litestream[411]:         /home/runner/work/litestream/litestream/replica.go:128 +0x57
litestream[411]: created by github.com/benbjohnson/litestream.(*Replica).Start in goroutine 30
litestream[411]:         /home/runner/work/litestream/litestream/replica.go:128 +0x165

hifi commented 10 months ago

Hi, #557 fixes this crash. I'll merge it soon but can't promise a release date.

It seems B2 broke their API: it intermittently responds to Litestream's delete operations with an error complaining about malformed XML.

That PR doesn't fix the issue B2 is having, but it does prevent the crash. Litestream should eventually succeed and only show an error in the logs.
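
For anyone curious what "prevent the crash" can look like in practice, here is a minimal sketch of a nil-safe error helper. It is not the actual change in #557. The types `deleteError` and `deleteOutput`, and the helpers `deref` and `deleteOutputErr`, are hypothetical stand-ins, and the sketch assumes the delete response carries error entries whose fields are pointers that may be nil when the server returns an unexpected reply.

```go
package main

import (
	"errors"
	"fmt"
)

// deleteError is a hypothetical stand-in for one per-object error entry
// in a bulk delete response. Any field may be nil (assumption).
type deleteError struct {
	Key     *string
	Code    *string
	Message *string
}

// deleteOutput is a hypothetical stand-in for the bulk delete response.
type deleteOutput struct {
	Errors []*deleteError
}

// deref returns the pointed-to string, or "" if the pointer is nil.
func deref(s *string) string {
	if s == nil {
		return ""
	}
	return *s
}

// deleteOutputErr converts the first error entry of a delete response
// into a Go error without dereferencing nil pointers. It returns nil
// when the response is nil or reports no errors.
func deleteOutputErr(out *deleteOutput) error {
	if out == nil || len(out.Errors) == 0 {
		return nil
	}
	first := out.Errors[0]
	if first == nil {
		return errors.New("delete failed: empty error entry")
	}
	return fmt.Errorf("delete failed: key=%q code=%q message=%q",
		deref(first.Key), deref(first.Code), deref(first.Message))
}
```

With a guard like this, a partially filled error entry comes back as an ordinary error value that the caller can log, and the delete can be attempted again on a later retention pass instead of the process panicking.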