Closed f2bo closed 1 month ago
@f2bo Thanks for reporting this issue! At the moment, I'm working on documentation for Microsoft Learn portal with more information about SQLite connector and its setup. Meanwhile, I will share this information here:
SQLite doesn't support vector search natively, that's why SQLite extension should be loaded first. As for now, we are using the following vector search extension: https://github.com/asg017/sqlite-vec. In the future, if SQLite will support vector search natively, we will transition to that, otherwise we will continue to use this extension and maybe we could add support for other vector search extensions if there will be such requests.
In order to add it to your application, you can use following link and download install.sh
file: https://github.com/asg017/sqlite-vec/releases/tag/v0.1.3. If you run this script, it will compile the extension and produce vec0.dll
file. You need to take this file and put it in the same folder where your project that's using this connector is located (e.g. MyProject\bin\Debug\net8.0
). After that, the extension should be loaded successfully and the functionality for vector search should work.
Thanks again!
@dmytrostruk Thank you for your reply.
Yes. I did look at the code and saw that it was using this extension. I understand that it will be necessary until SQLite supports vector search natively, but will the manual installation be required until then or is this just a temporary workaround? It's really not the best experience. Wouldn't it be possible to include the binary in the SQLite connector's package? Although, I suppose that it would mean having variants for each supported platform.
Other than the missing extension, there seems to be a second problem in that the connector never opens the SQLite connection. This can be seen with the following code that fails with InvalidOperationException: ExecuteNonQuery can only be called when the connection is open.
HostApplicationBuilder builder = Host.CreateApplicationBuilder();
builder.Services.AddSqliteVectorStore($"Data Source=test.db");
var host = builder.Build();
var store = host.Services.GetRequiredService<IVectorStore>();
var collection = store.GetCollection<string, TextChunk>("xxxx");
await collection.CreateCollectionIfNotExistsAsync();
Lastly, I noticed that the AddSqliteVectorStore
overload that obtains its connection from the DI container does not load the sqlite-vec extension. Is that intentional and it is assumed that it will be already configured properly?
Thanks!
It's really not the best experience. Wouldn't it be possible to include the binary in the SQLite connector's package? Although, I suppose that it would mean having variants for each supported platform.
@f2bo Yes, the installation process is not the best one, but we will need to think how to support it better in terms of installation, since the vector extension is written in C. We will also need to handle versioning and so on. As for now, it's recommended to install it on your side in a way that works best for you.
there seems to be a second problem in that the connector never opens the SQLite connection
Thank you for reporting this, I've created a PR with a fix: https://github.com/microsoft/semantic-kernel/pull/9407
Lastly, I noticed that the AddSqliteVectorStore overload that obtains its connection from the DI container does not load the sqlite-vec extension. Is that intentional and it is assumed that it will be already configured properly?
Correct, this approach gives you more flexibility when exactly in your application runtime you want to open/close connection, load an extension etc. So, you manage it on your side, and we will just use it. Otherwise, if it's not provided, we will initialize an instance of SqliteConnection
on our side, open a connection and load an extension for you. In PR I shared above, I also updated XML documentation with more details about this.
Thank you!
@dmytrostruk Thanks!
I was wondering if the requirement that the connection be already open could be removed and instead have the connector check its state and open it when necessary. Not a big deal but just one less thing that can go wrong when setting up the app.
I was wondering if the requirement that the connection be already open could be removed and instead have the connector check its state and open it when necessary. Not a big deal but just one less thing that can go wrong when setting up the app.
@f2bo In this case you could use a method overload that accepts a connection string, so the connection initialization and opening should be managed for you. I think we want to keep methods that pull SqliteConnection
from DI container and just use it as is without any modifications on connection instance. This will allow users to configure their SqliteConnection
in the way they want but still use DI to register VectorStore/Collection
classes.
@dmytrostruk My point was that if the connection retrieved from the DI container is closed, and you use it as is, this is guaranteed to fail, which is clearly not what was intended. It seems unnecessary given that the connector can look at the state of the connection and if it's closed, open it. I don't think that would break any expectations. Anyway, just wanted to make a suggestion. Feel free to close when the PR is ready.
My point was that if the connection retrieved from the DI container is closed, and you use it as is, this is guaranteed to fail, which is clearly not what was intended. It seems unnecessary given that the connector can look at the state of the connection and if it's closed, open it. I don't think that would break any expectations.
@f2bo I've added necessary logic to address this. Thanks for your feedback!
@dmytrostruk Good to hear! Thank you again.
Describe the bug I'm having trouble using the new SqliteVectorStore included in the latest release.
I first attempted using the
AddSqliteVectorStore(connectionString, options, serviceId)
overload, but it creates a newSqliteConnection
that it never opens and any subsequent operations on the vector store fail. I don't see any obvious way to open it after the fact.https://github.com/microsoft/semantic-kernel/blob/01c288a4be3c5a2f9c99454aee1035294bc6cb17/dotnet/src/Connectors/Connectors.Memory.Sqlite/SqliteServiceCollectionExtensions.cs#L49-L69
I then registered an already open
SqliteConnection
that it could use and added the vector store usingAddSqliteVectorStore(options, serviceId)
overload instead. However, any operation on the vector store fails with aVectorStoreOperationException - SqliteException: SQLite Error 1: 'no such module: vec0'
. It seems that it's using a SQLite module that is not included in theMicrosoft.Data.Sqlite
package.To Reproduce Steps to reproduce the behavior:
Expected behavior No exceptions.
Platform