Open davidfowl opened 5 months ago
It's a little more complex than that, as you need to add the vector
extension to the database that you're going to be storing the vectors in.
I believe that eShop does it here: https://github.com/dotnet/eShop/blob/72445fd19c85e46420a4ae4c33eb46245c970c9b/src/Catalog.API/Infrastructure/Migrations/20231009153249_Initial.cs#L15
But if you're using a non-EF database connector, say the memory pipeline in Semantic Kernel, you need to add the extension another way. My approach is to bind mount a SQL script into the init path of the Postgres container (see this code).
From a deployment perspective, we then need to ensure that we surface the metadata required by the IaC provider: with Azure and AWS, we need to enable the right feature on the DB resource being deployed rather than using the container image.
Isn't that automagic when you use https://github.com/dotnet/eShop/blob/72445fd19c85e46420a4ae4c33eb46245c970c9b/src/Catalog.API/Extensions/Extensions.cs#L11?
> From a deployment perspective, we then need to ensure that we surface the metadata required by the IaC provider: with Azure and AWS, we need to enable the right feature on the DB resource being deployed rather than using the container image.
That should be as simple as representing it in the model in some first-class way. The Azure/AWS Postgres extension method should be able to look at the model and glean that it needs the pgvector extension (I'm not sure we need to make this super generic until we have more concrete use cases).
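To make the idea concrete, here is a rough, language-agnostic sketch in Python of what "representing it in the model" could look like. Everything in it is hypothetical: `PostgresResource`, `with_extension`, `plan_local` and `plan_cloud` are illustrative names, not Aspire APIs. The only point is that the resource records which extensions it needs, and each target reads that record and decides how to satisfy it.

```python
from dataclasses import dataclass, field


@dataclass
class PostgresResource:
    """Hypothetical stand-in for a Postgres resource in the app model."""
    name: str
    extensions: set[str] = field(default_factory=set)

    def with_extension(self, extension: str) -> "PostgresResource":
        # Record the requirement on the model; nothing is provisioned here.
        self.extensions.add(extension)
        return self


def plan_local(resource: PostgresResource) -> dict:
    """Local run: pick a container image that ships the extension."""
    image = "pgvector/pgvector" if "vector" in resource.extensions else "postgres"
    return {"image": image, "create_extensions": sorted(resource.extensions)}


def plan_cloud(resource: PostgresResource) -> dict:
    """Cloud deployment: keep the managed server and ask it to allow the extensions."""
    wanted = sorted(resource.extensions)
    return {"allow_extensions": wanted, "create_extensions": wanted}
```

With this shape, `PostgresResource("pg").with_extension("vector")` is the only thing the app author writes, and the local and cloud paths diverge behind it.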
No, that extension method is used to enable the Vector extension in the Npgsql driver for .NET - https://github.com/pgvector/pgvector-dotnet/blob/master/src/Pgvector/Npgsql/VectorExtensions.cs#L10
You still need to install the extension on the Postgres server. Locally we have to use the pgvector Docker image, whereas databases like Azure PostgreSQL already have the extension installed. Either way, we then have to enable it on the database using CREATE EXTENSION vector; (since you have to opt in each database on the server that should have the extension).
I do this in my apps using a SQL script (https://github.com/aaronpowell/HanselminutesBot/blob/main/HanselminutesBot.AppHost/database/init.sql) that is added to the DB container image (https://github.com/aaronpowell/HanselminutesBot/blob/main/HanselminutesBot.AppHost/Program.cs#L45) as part of its startup pipeline.
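For anyone who wants the gist without opening the links, the init-script approach above can be sketched as follows. This is a minimal illustration in Python, not the AppHost code linked above; the helper names and the `pg16` tag are assumptions. It relies on the official Postgres image, which the pgvector/pgvector image builds on, running any *.sql files found in /docker-entrypoint-initdb.d the first time the data directory is initialised.

```python
from pathlib import Path

# The statement the init script needs to run in the target database.
INIT_SQL = "CREATE EXTENSION IF NOT EXISTS vector;\n"


def write_init_script(directory: Path) -> Path:
    """Write init.sql into `directory` and return its path."""
    directory.mkdir(parents=True, exist_ok=True)
    script = directory / "init.sql"
    script.write_text(INIT_SQL, encoding="utf-8")
    return script


def docker_run_args(init_dir: Path, password: str, tag: str = "pg16") -> list[str]:
    """Build a `docker run` command that bind mounts init_dir into the init path."""
    return [
        "docker", "run", "--rm",
        "-e", f"POSTGRES_PASSWORD={password}",
        # Read-only bind mount: the entrypoint only needs to read the scripts.
        "-v", f"{init_dir.resolve()}:/docker-entrypoint-initdb.d:ro",
        f"pgvector/pgvector:{tag}",
    ]
```

Note that the init scripts are skipped when the container starts with an existing data volume, so an already-initialised database still needs the statement run by hand.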
This was the closest issue I could find, but it looks like we also need the ability to set Microsoft.DBforPostgreSQL/flexibleServers/configurations as part of the Postgres deployment for many extensions (e.g. citext); otherwise you can't use CREATE EXTENSION at all.
Is there any way to set this property currently?
EF Core can already run CREATE EXTENSION as part of a migration with Npgsql, but you still have to allow the extension on the DB server in Azure.
More info on the azure postgres docs here: https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-extensions#how-to-use-postgresql-extensions
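Until the provisioning story covers this, the allow-list step from those docs can be scripted. The sketch below only builds the Azure CLI invocation that sets the azure.extensions server parameter, which is what a Microsoft.DBforPostgreSQL/flexibleServers/configurations resource would set declaratively. The helper name is hypothetical and the resource group and server names are placeholders.

```python
def allow_extensions_command(
    resource_group: str, server: str, extensions: list[str]
) -> list[str]:
    """Build the `az` command that allow-lists extensions on a flexible server."""
    if not extensions:
        raise ValueError("at least one extension is required")
    return [
        "az", "postgres", "flexible-server", "parameter", "set",
        "--resource-group", resource_group,
        "--server-name", server,
        "--name", "azure.extensions",
        # The parameter holds the whole list, so pass every extension you need.
        "--value", ",".join(extensions),
    ]
```

For example, `allow_extensions_command("my-rg", "my-server", ["vector", "citext"])` yields a command ending in `--value vector,citext`. Allow-listing only permits the extension; CREATE EXTENSION still has to run in each database that uses it.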
Yep, you need to have some additional Bicep run to enable the extension. I've got a sample of it; I just need to clean it up a touch before contributing it.
Not sure we want additional Bicep; I'd rather we expand the Azure.Provisioning libraries to support setting this.
> Not sure we want additional Bicep; I'd rather we expand the Azure.Provisioning libraries to support setting this.
Yes, that might be a better option, but presently (or at least, 4 weeks ago when I looked before going on leave 🤣) there isn't support for that in the SDK to do it.
PR open on CDK to add support - https://github.com/Azure/azure-sdk-for-net/pull/44315
PGVector is a popular extension for exposing vector search capabilities in a Postgres database. Today eShop hacks around this by using code like this
We should build a more first-class API for adding/enabling PGVector on the Postgres resource.