ncruces / go-sqlite3

Go bindings to SQLite using wazero
https://pkg.go.dev/github.com/ncruces/go-sqlite3
MIT License
402 stars 12 forks source link

Examples and tests using `:memory:` instead of MemDB VFS #130

Closed daenney closed 1 month ago

daenney commented 1 month ago

I remember from one of our previous discussions that opening :memory: is recommended against by SQLite developers and people should instead use the MemDB VFS. However, there's still one example in this repo that uses this approach, as well as most tests. Since people sometimes refer to test suites as ways of understanding how certain functions should be used, I'm wondering if it's useful to convert those to use the MemDB VFS too? I'm happy to do the PR, just want to make sure you're on board with a change like that before I do anything.

The other thing I stumbled upon in https://www.sqlite.org/inmemorydb.html is the whole opening a database without a filename, in which case it becomes a temporary database in itself. Paired with PRAGMA temp_store = 3 it seems to be another option for a fully in-memory and temporary database. I tried this out with this library and it does seem to work. I can't deduce from the docs if this ends up being the same as calling open with :memory: instead, or if this is a third option. I imagine this isn't something people should use either?

ncruces commented 1 month ago

Yeah, this is hairy.

But the gist of it is: :memory: databases do not share any data across connections. So when you use :memory: you need to think about that.

These are fine:

This is not fine:

  1. using database/sql (sql.Open or driver.Open)
  2. writing to a :memory: database
  3. expecting writes to persist

It may work, but is at least flaky, because database/sql is a connection pool: each connection won't see the same data.

[!NOTE] Using an empty string to open a temporary file will run into the same problems as above, because a new file will be created for each connection in the pool.

That said :memory: databases are not recommended against by SQLite; shared cache mode is.

So what's recommended against is using file::memory:?cache=shared to "fix" :memory: databases. And one of the reasons I would personally recommended against it, is that transactions work differently in shared cache mode, so you may be testing the wrong behaviour.

ncruces commented 1 month ago

Going over this, again, I actually found a test where I'm making this mistake, which is a red flag.

But I'm still not so sure I want to change every usage of :memory:, since it works exactly as it should.

I should probably strongly consider changing every usage of it with database/sql, though.

ncruces commented 1 month ago

After basically doing https://github.com/ncruces/go-sqlite3/issues/130#issuecomment-2265848821, I realized that this raises other issues.

If you do file:/test.db?vfs=memdb your database becomes shared (obviously, that's the goal).

But this means your tests cannot run in parallel.

So, yeah, this is complicated. We need to figure out for each test if we want a shared database, or an isolated database, or a database that's local to the test case, but shared within that test case.

Reference counting the shared databases also induces another (super rare, probably never happens) flake: database/sql decides to close all connections; your db dies.

Finally I'll add that, often, within this package, I probably want :memory: because I want to test SQLite and the wrapper, not my VFS.

This probably merits its own util/helper function. I'll have think this through.

daenney commented 1 month ago

Aha, interesting. The way I interpreted some discussions on the SQLite forum seemed to suggest :memory: was an old thing for low-power devices at the time that nowadays people should probably not use irrespective of the shared cache.

If we're talking about test helpers to use MemDB but keep tests still able to run in parallel, how about something like:

func InMemoryTestDB(t *testing.T) string {
    t.Helper()
    r := rand.New(rand.NewPCG(0, 0))
    name := fmt.Sprintf("%s_%d", t.Name(), r.Int32())
    values := make(url.Values, 2)
    values.Add("vfs", "memdb")  // and anything else
    return "file:/" + name + "?" + values.Encode()
}

That should be random enough without needing to pull something like a uuid library in, and controlling the seed makes the source of randomness behave the same between test runs (though that might not be necessary).

ncruces commented 1 month ago

I wrote that same function yesterday. Thinking about where to put it, in a testing package or memdb.

daenney commented 1 month ago

It's not uncommon to have a testutil package (https://pkg.go.dev/search?q=testutil) or the like with helper functions that people can use when writing tests using a library. Sometimes it's just internal stuff, like AssertXYZ. But many also contain helpers that are meant to aid in writing tests for consumers of the libary (like creating an in-memory version of something).

ncruces commented 1 month ago

@daenney #131 is shaping up.

My TestDB is maybe weirdly implemented, which why it should be in a library (done once).

It's modeled after t.TempDir, same guarantees: same value returned for the same test, different for subtests, created and shared by all invocations, deleted when the test ends, concurrent safe, etc.