ncruces / go-sqlite3

Go bindings to SQLite using wazero
https://pkg.go.dev/github.com/ncruces/go-sqlite3
MIT License
402 stars 12 forks source link

Implement shared memory BSD locks #85

Closed ncruces closed 3 months ago

ncruces commented 4 months ago

Both the BSDs and illumos support shared memory, but not OFD locks.

Some form of in-memory locks would allow them to (safely) support WAL mode, as long we lock the database file exclusively to prevent other processes from accessing it.

ncruces commented 3 months ago

~Alternatively, POSIX advisory locks could be used for these.~

daenney commented 3 months ago

So far switching to this library has allowed users on Illumos to now run GoToSocial and since in general this library has a much easier time supporting all platforms the Go compiler supports we'd like to complete that transition.

Since this issue affects Illumos too I'm actually a bit worried that user might now have a problem as our default is db-sqlite-journal-mode: "WAL" in settings. Do we need to tell them to switch to TRUNCATE for the time being?

However, as noted in our previous discussion, we would like to keep the FreeBSD and OpenBSD ports working and we generally prefer to use WAL mode as it tends to be more performant in our case.

Could you elaborate a bit on what it would take to implement this? I wouldn't mind taking a stab at it, but I'm not entirely sure where to begin. An implementation built on Go's synchronisation primitives is probably the easiest on me personally, but I can look at other things too if you've got some pointers for me.

ncruces commented 3 months ago

There are a few sides to this.

It should be noted that, even if this is fixed, the performance, for most OSes/architechtures, won't be amazing. The wazero compiler is amd64/arm64 only, and Linux/macOS/FreeBSD/Windows only. Improving on that is out-of-scope here.

Switching to TRUNCATE is definitely the easiest solution today. Performance under load won't be great, though, as BSD locks have worse concurrency than OFD locks.


As for the actual issue, I'd like to solve this, but it's definitely challenging.

The major issue is how badly broken POSIX locks are. BSD locks have much better semantics, but don't support multiple locks per file. OFD locks are the best of both worlds.

For rollback journal, you can get away with a single lock per file, if you accept: (1) reduced concurrency, (2) writer starvation. For WAL mode, you really need multiple locks per file. And on most unixes, that's POSIX locks, period.

So, to synchronize with other processes, we need some kind of file locks, and to synchronize internally we need something in-memory. That's very close to what SQLite did, and it's a PITA to implement.

I've spent some time trying to find an angle that's easier but, TBH, haven't found any. Both my suggestions above quickly break down.

ncruces commented 3 months ago

Sleeping over it, I may have a solution that I'm not sure doesn't work. Writing this down so I don't forget. I'll take a look, time permitting.

The idea is to make a shm.go for flock.

shmMap must open the file, always O_RDWR. Then it goes through a list (a global variable), and checks if os.SameFile as an already opened file. If one matches, it uses that instead (increasing a ref count). If none matches, it locks the file exclusively, and does the DMS truncate branch. If it can't lock, returns BUSY; otherwise it carries on. The exclusive file lock is never released.

shmLock must "lock" on an object stored in the global list. Locks that may be shared must be ints, locks that are only locked exclusively can be bools; something like that. A mutex around access to these fields is simplest, and not a performance issue.

shmUnmap must release any held locks, decrease the ref count, and close/delete the file.

And vfsShm must also now have an explicit Close that also releases any held locks, and decreases the ref count (this accounts for a connection that's force closed).

This should be possible to test under Linux and macOS with the tag sqlite3_flock. All tests (including mptest) should reliably pass.

On macOS we must also test if the sqlite3 CLI can open a database that's opened like this: it shouldn't (on Linux we can't run this test). This is important to prevent corruption.

daenney commented 3 months ago

It should be noted that, even if this is fixed, the performance, for most OSes/architechtures, won't be amazing. The wazero compiler is amd64/arm64 only, and Linux/macOS/FreeBSD/Windows only. Improving on that is out-of-scope here.

I think that's fine for GoToSocial at least. We don't really expect anyone to be running this on something other than amd|arm64. Maybe risc64 at some point in the future. We only target Linux and the BSDs, anything beyond that is great if it works but not something we make promises about.