VirtCode / serene-aur

replace your aur helper with a self-hosted build server
MIT License

error: error returned from database: (code: 5) database is locked: (code: 5) database is locked #3

Closed mariuszste closed 4 months ago

mariuszste commented 4 months ago

I ran pacman -Qmq | xargs -i serene add {} and started getting a bunch of database is locked errors. The build server has relatively slow hard drives, and when it starts building packages the disk responsiveness drops drastically, so multiple threads end up trying to access the database at the same time.

In my opinion, a locked database should not result in an error; the program should instead wait for the pending queries to complete. Are you perhaps opening and closing the database on every connection? If so, consider opening the database once and then passing around the existing connection to avoid such errors.

mariuszste commented 4 months ago

and builds seem to be failing as well because of it:

[2024-05-13T12:37:45Z ERROR serene::build] build run for package failed extremely fatally: error returned from database: (code: 5) database is locked: (code: 5) database is locked

VirtCode commented 4 months ago

I thought that the server was using the database correctly, with a database pool provided by the library I'm using (sqlx). So I did some research and came across what seems to be the same issue on the repo of the library: https://github.com/launchbadge/sqlx/issues/451

So, as far as I can tell, the problem seems to be caused by the disk being too slow, or something related to that. There are a couple of different fixes we could try. Sadly, I can't reproduce the issue on my end, as I currently don't have any slow-enough hard drives lying around, so you'll have to test them.

The first thing we can try is whether setting the SQLite journaling mode to WAL is enough. I've pushed a fix onto main which enables that; it should be enabled for all connections anyway, as it offers better performance. So it would be great if you could pull the new image and test whether that already fixes the issue on your setup.

If that doesn't work (which may very well be the case, as adding and building packages are mostly write operations, which still cannot run in parallel even with WAL), we'll probably have to resort to something more drastic: limiting the number of connections to 1. That would probably do the trick, but it is not really desirable, as it may impact performance quite a bit.
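For readers unfamiliar with WAL, the following is a small sketch of what switching the journal mode does, written with Python's standard sqlite3 module and not taken from the fix pushed to main. The table name pkg is made up. It shows that the mode is stored in the database file, and that a reader is not blocked by an open write transaction.

```python
import os
import sqlite3
import tempfile

# Hypothetical throwaway database for the demo.
path = os.path.join(tempfile.mkdtemp(), "wal_demo.db")

writer = sqlite3.connect(path, isolation_level=None)
# The pragma returns the journal mode that is now in effect.
mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
writer.execute("CREATE TABLE pkg (name TEXT)")
writer.execute("INSERT INTO pkg VALUES ('a')")

# Open a write transaction and leave it uncommitted for now.
writer.execute("BEGIN IMMEDIATE")
writer.execute("INSERT INTO pkg VALUES ('b')")

# A second connection with no busy timeout at all (timeout=0).
reader = sqlite3.connect(path, timeout=0, isolation_level=None)
# WAL is a property of the database file, so new connections see it too.
reader_mode = reader.execute("PRAGMA journal_mode").fetchone()[0]
# The read succeeds and sees only the committed row.
rows_during_write = reader.execute("SELECT count(*) FROM pkg").fetchone()[0]

writer.execute("COMMIT")
rows_after_commit = reader.execute("SELECT count(*) FROM pkg").fetchone()[0]

print(mode, reader_mode, rows_during_write, rows_after_commit)
reader.close()
writer.close()
```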
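The limitation mentioned here can be sketched as follows, again with Python's standard sqlite3 module as a stand-in for the sqlx pool. The names pkg, shared, lock and add are hypothetical. The first part shows that a second writer is still rejected under WAL. The second part approximates a pool limited to one connection by funnelling all writes through a single shared connection guarded by a lock.

```python
import os
import sqlite3
import tempfile
import threading

# Hypothetical throwaway database for the demo.
path = os.path.join(tempfile.mkdtemp(), "single_conn.db")

# Part 1: two connections, WAL enabled, no busy timeout.
a = sqlite3.connect(path, timeout=0, isolation_level=None)
a.execute("PRAGMA journal_mode=WAL")
a.execute("CREATE TABLE pkg (name TEXT)")
b = sqlite3.connect(path, timeout=0, isolation_level=None)

a.execute("BEGIN IMMEDIATE")  # connection a holds the only write lock
try:
    b.execute("INSERT INTO pkg VALUES ('b')")
    second_writer = "ok"
except sqlite3.OperationalError as e:
    second_writer = str(e)
a.execute("COMMIT")
a.close()
b.close()

# Part 2: one shared connection; writes are serialized by a lock, so they
# queue up in the application instead of colliding inside SQLite.
shared = sqlite3.connect(path, timeout=0, isolation_level=None,
                         check_same_thread=False)
lock = threading.Lock()
errors = []

def add(name):
    try:
        with lock:
            shared.execute("INSERT INTO pkg VALUES (?)", (name,))
    except sqlite3.OperationalError as e:
        errors.append(str(e))

threads = [threading.Thread(target=add, args=(f"pkg{i}",)) for i in range(8)]
for t in threads:
    t.start()
for t in threads:
    t.join()

count = shared.execute("SELECT count(*) FROM pkg").fetchone()[0]
print(second_writer, errors, count)
shared.close()
```

The cost is the one already named: reads also have to wait their turn behind writes when everything shares one connection.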

dumbasPL commented 4 months ago

I threw a bunch of packages at it again and it seems to be fixed or at least not as bad (0 failures so far and 0 jobs stuck on working). Adding WAL was a good idea either way.

In theory https://github.com/launchbadge/sqlx/issues/459 should fix this completely without affecting read performance but there doesn't seem to be any interest in implementing it any time soon.

VirtCode commented 4 months ago

Okay, that's great. I'll close this for the time being. If the error appears again, please reopen this issue, and we can consider using two pools or simply limiting the connections to 1.