Freaky / Compactor

A user interface for Windows 10 filesystem compression
MIT License

Corruption of SQLite Databases in Use #40

Closed by A-H-M 3 years ago

A-H-M commented 3 years ago

Compacting SQLite databases while they are in use causes them to become corrupted. I've not checked if this is limited to SQLite, but after compacting my system drive, all the programs I faced problems with had SQLite databases. These include Thunderbird and Sticky Notes (from Microsoft), among others. Simply excluding .sqlite or .db files and their corresponding -shm and -wal files isn't enough, as programs such as Chrome use SQLite DB files without any extensions.
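
Editor's sketch, not part of Compactor: since extensions are unreliable, one way to spot SQLite files is by content. Every non-empty SQLite 3 database starts with the 16-byte string "SQLite format 3" followed by a NUL byte. The function name `looks_like_sqlite` is hypothetical, and this check does not cover the -wal and -shm sidecar files, which have different headers.

```python
# First 16 bytes of a non-empty SQLite 3 database file.
SQLITE_MAGIC = b"SQLite format 3\x00"


def looks_like_sqlite(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    try:
        with open(path, "rb") as f:
            return f.read(len(SQLITE_MAGIC)) == SQLITE_MAGIC
    except OSError:
        # Missing, locked, or unreadable: we cannot tell, so report False.
        return False
```

A scanner could call this on each candidate file and skip anything that returns True, regardless of the file name.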

Freaky commented 3 years ago

Well, that sucks. Sorry about that, hope you didn't lose anything important.

I can't seem to trigger it synthetically: constant compaction and uncompaction of a database that is being continuously written to and verified seems fine, so I'm not sure what the mechanism here is.

I can however reliably exclude them by acquiring an exclusive lock before compaction - hopefully this will be sufficient. I'll push out a 0.10 release with this soon.

Freaky commented 3 years ago

v0.10 pushed.

I still can't reproduce it either way, not with a Firefox profile, nor a quick test app that's constantly inserting and verifying, but locking is probably the right thing to do anyway.
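
Editor's sketch of what an "insert and verify" test app might look like. This is not Freaky's actual test program; the table name `log` and the function `insert_and_verify` are made up for illustration. The idea is to run this loop while a compaction tool repeatedly compresses and decompresses the same file from another process.

```python
import sqlite3


def insert_and_verify(db_path, rounds=100):
    """Insert rows one at a time, checking row count and integrity after each.

    Returns True if every round passed, False on the first mismatch.
    """
    con = sqlite3.connect(db_path)
    try:
        con.execute(
            "CREATE TABLE IF NOT EXISTS log (id INTEGER PRIMARY KEY, payload TEXT)"
        )
        start = con.execute("SELECT COUNT(*) FROM log").fetchone()[0]
        for i in range(rounds):
            con.execute("INSERT INTO log (payload) VALUES (?)", (f"row {i}",))
            con.commit()
            # The row count must grow by exactly one per committed insert.
            count = con.execute("SELECT COUNT(*) FROM log").fetchone()[0]
            if count != start + i + 1:
                return False
            # integrity_check yields the single row "ok" for a sound database.
            if con.execute("PRAGMA integrity_check").fetchone()[0] != "ok":
                return False
        return True
    finally:
        con.close()
```

Given that the corruption reported in this thread took half a day or more to show up, a short run like this passing would not prove much on its own.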

A-H-M commented 3 years ago

Oh, no worries. Nothing critical, thankfully.

I'm unable to recreate the issue for some reason. However, all website login sessions being lost, my Chrome download manager extension losing all download history, Sticky Notes failing to open, Thunderbird redownloading all emails, and Networx (an internet data usage tracker) throwing malformed database errors, all just half a day after compacting the system drive seemed too far-fetched to be a coincidence, especially considering how the last two have been running without any issues for the past 3 and 6 years, respectively.

I've yet to find the extension's data directory, Sticky Notes' DB was empty, and Thunderbird has probably already replaced the files in question, but looking through Networx's database file showed a single unique constraint violation.
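
Editor's sketch: one way to inspect a suspect database file is SQLite's built-in `PRAGMA integrity_check`, run here over a read-only connection so the check itself cannot modify the file. The helper name `check_database` is hypothetical, and this is not necessarily how the Networx file above was examined.

```python
import sqlite3
from pathlib import Path


def check_database(path):
    """Return a list of problems SQLite reports; an empty list means none found."""
    # mode=ro opens the file read-only via SQLite's URI filename syntax.
    uri = Path(path).resolve().as_uri() + "?mode=ro"
    try:
        con = sqlite3.connect(uri, uri=True)
    except sqlite3.Error as e:
        return [str(e)]
    try:
        rows = con.execute("PRAGMA integrity_check").fetchall()
    except sqlite3.DatabaseError as e:
        # Raised, for example, when the file is not a database at all.
        return [str(e)]
    finally:
        con.close()
    problems = [row[0] for row in rows]
    return [] if problems == ["ok"] else problems
```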

Trying to copy Sticky Notes' DB files also threw the following error related to the Windows Overlay Filesystem driver:

[image: screenshot of the error message]

Checking the drive for errors came up with nothing, so I'm not sure of the cause of all of this.

I've been compacting files for about a year now, albeit with CompactGUI until less than a month ago, without any problems. I've noticed that both compact.exe and CompactGUI will not compress files that are being used by another process. As you mentioned, having Compactor do the same is probably for the best.

Its feature set and performance are leaps and bounds beyond CompactGUI, though, so thank you for the wonderful software and the incredibly quick response!

Freaky commented 3 years ago

Errors like that from the WOF driver are probably opaque to filesystem checking tools - it looks to be an issue with the metadata in the alternate data stream it uses for compressed storage, so it would be a bit like expecting chkdsk to notice a zip file you downloaded was corrupt. The filesystem's fine, the contents are just wrong for one reason or another.

I can guess as to why this sort of thing might happen with a file that's being modified while it's compressed, but would have expected it to be easier to reproduce, particularly since you encountered it across so many programs.

I'll continue to investigate, though I'm pretty sure the locking will prevent it in future.

A-H-M commented 3 years ago

The disk and SMART checking was mostly to rule out the possibility of the drive failing.

I'm also unsure of how to recreate the issue, but I'll try it with 0.10 and I'll mention if I come across any issues, though I doubt I will with the locking in place.

A-H-M commented 3 years ago

I tried compacting Thunderbird and Networx user data with 0.9 once more as a test, and did indeed come across database file corruption again. It didn't happen instantly, but took about half a day to a day before the problems manifested. Thunderbird redownloaded all emails and Networx threw SQLite disk I/O errors.

Trying the same with 0.10 correctly skips over the files in use and everything works correctly.

On a side note, files in use that are skipped over are being included in the amount compacted (i.e. the "Compacted x MiB in y files..." text). This means that in a folder where all files are in use, the program will report having compacted a certain number of files, but with zero bytes saved. I'm assuming this is unintended behavior, as excluded files are not being counted the same way.

That is a separate matter though, and this issue has been effectively resolved. As such, I'll be closing it. Thank you for the help.

Freaky commented 3 years ago

Thanks for double-checking my fix, greatly appreciated!

I see what you mean with the Compacted report - it's more counting the number of bytes that were in the files it tried to compress, so it's kind of right if you squint a bit.

This ties in with general file error handling, and with #22: flashing up a message for a few milliseconds and dumping the file into the excluded list isn't terribly informative. It should be possible to see why a file was excluded: was it locked, was permission denied, did it fail the compressibility check, was there an actual IO error, etc.

Something for the GUI rework.
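
Editor's sketch of the reporting idea discussed above: count compacted files separately from skipped ones, and keep a reason for each skip. None of these names (`SkipReason`, `Report`) come from Compactor, which is a separate codebase; this only illustrates the bookkeeping, using the four reasons listed in the comment.

```python
from collections import Counter
from enum import Enum


class SkipReason(Enum):
    LOCKED = "locked by another process"
    PERMISSION_DENIED = "permission denied"
    INCOMPRESSIBLE = "failed compressibility check"
    IO_ERROR = "I/O error"


class Report:
    """Tally compacted files and skipped files separately."""

    def __init__(self):
        self.compacted_files = 0
        self.bytes_before = 0
        self.bytes_after = 0
        self.skipped = Counter()

    def record_compacted(self, size_before, size_after):
        self.compacted_files += 1
        self.bytes_before += size_before
        self.bytes_after += size_after

    def record_skipped(self, reason):
        # Skipped files never touch the compacted totals.
        self.skipped[reason] += 1

    def summary(self):
        saved = self.bytes_before - self.bytes_after
        lines = [f"Compacted {self.compacted_files} files, saved {saved} bytes"]
        for reason, n in self.skipped.items():
            lines.append(f"Skipped {n}: {reason.value}")
        return "\n".join(lines)
```

With this split, a folder where every file is in use would report zero compacted files and a count of locked ones, rather than a non-zero file count with zero bytes saved.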