asg017 / sqlite-vss

A SQLite extension for efficient vector search, based on Faiss!
MIT License
1.59k stars 59 forks source link

Hanging statements #67

Open polterguy opened 1 year ago

polterguy commented 1 year ago

In dotnet there's this thing called "Connection Pooling" that will reuse the same connection for multiple threads. It's an optimisation thing for ASP.NET and web apps. If I turn it on, I'm back to the "Cannot modify/delete function due to open statement" bug. I suspect there are still some statements that aren't correctly disposed in the lib. I can have a look at it unless you beat me to it ...

asg017 commented 1 year ago

Do you get the "due to open statement" error when loading the extension for the first time?

polterguy commented 1 year ago

I don't think so, but I am not sure. Here's the error message; "error during initialization: vector0: unable to delete/modify user-function due to active statements"

I suspect it's due to the table scan logic and the way it keeps an open statement during traversal. See my (new) pull request for details about my thesis here ...

polterguy commented 1 year ago

Actually, I just tested this, and you can simulate this thing by loading the extension twice.

This might not be possible for all I know in SQLite, and not something that's supposed to work - However, with DotNet there's this thing called "Connection Pooling" that allows for reusing the same connection, and then return it back to the connection pool.

Later when you need a new connection, it doesn't actually create a new connection, but returns an existing connection from the connection pool. Although I've seen comments about this being a bad idea for SQLite, this feature is turned on by default in DotNet, and implies everybody using this plugin needs to stop using connection pooling, experiencing weird hard to track down errors, not understanding what goes wrong, before they've turned OFF connection pooling.

Notice, there's no known method of determining if you're getting a new connection (at which point you have to load the plugin), or if you're getting a pooled connection (at which point loading the plugin again will fail).

I will spend some more time on investigating this, and see if I can figure it out - But I am not 100% sure it's an actual bug with your code ...

If there is a bug, it's probably due to something not being released, possibly an sqlite3_stmt or something similar - Which was one of my reasons for creating the SqlStatement struct ...

polterguy commented 1 year ago

My findings;

  1. There is no way to determine what extensions are loaded in SQLite
  2. There is no way to determine if you're getting a new connection or a pooled connection in DotNet
  3. Hence, it's an insolvable problem, and the only solution is to turn off connection pooling if using this this plugin with DotNet

It's an insolvable problem and can only be solved by turning off connection pooling in DotNet, which is done as follows;

Data Source=db_file.db; Pooling=False;

The above ensures you're getting a new connection every single time you create an SqliteConnection in DotNet.

Notice, using connection pooling with SQLite doesn't really make much sense either, and should probably be turned off for a whole range of other reasons too, so it's not really such a big loss ...

Psst, the code I've recently committed to my PR should work perfectly. But you might want to run it through your unit tests and sanity check things.