dotnet / aspire

An opinionated, cloud ready stack for building observable, production ready, distributed applications in .NET
https://learn.microsoft.com/dotnet/aspire
MIT License

Add support for PgVector extension to postgres resource #3052

Open davidfowl opened 5 months ago

davidfowl commented 5 months ago

PGVector is a popular extension for exposing vector search capabilities in a Postgres database. Today eShop hacks around this by using code like this

We should build a more first-class API for adding/enabling PGVector on the Postgres resource.

aaronpowell commented 5 months ago

It's a little more complex than that, as you need to add the vector extension to the database that you're going to be storing the vectors in.

I believe that eShop does it here: https://github.com/dotnet/eShop/blob/72445fd19c85e46420a4ae4c33eb46245c970c9b/src/Catalog.API/Infrastructure/Migrations/20231009153249_Initial.cs#L15

But if you're using a non-EF database connector, say the memory pipeline in Semantic Kernel, you need to add the extension another way. My approach is to bind mount a SQL script into the init path of the Postgres container (see this code).

From a deployment perspective, we then need to ensure that we're surfacing the metadata required for the IaC provider. With Azure and AWS, we need to enable the right feature on the DB resource being deployed rather than using the container image.

davidfowl commented 5 months ago

Isn't that automagic when you use https://github.com/dotnet/eShop/blob/72445fd19c85e46420a4ae4c33eb46245c970c9b/src/Catalog.API/Extensions/Extensions.cs#L11?

> From a deployment perspective, we then need to ensure that we're surfacing the metadata required for the IaC provider. With Azure and AWS, we need to enable the right feature on the DB resource being deployed rather than using the container image.

That should be as simple as representing it in the model in some first-class way. The Azure/AWS Postgres extension method should be able to look at the model and glean that it needs the pgvector extension (I'm not sure we need to make this super generic until we have more concrete use cases).

aaronpowell commented 5 months ago

No, that extension method is used to enable the Vector extension in the npgsql driver for .NET - https://github.com/pgvector/pgvector-dotnet/blob/master/src/Pgvector/Npgsql/VectorExtensions.cs#L10

You still need to install the extension in the Postgres server. Locally we have to use the pgvector Docker image, whereas databases like Azure PostgreSQL already have the extension installed. Either way, we then have to enable it on the database using CREATE EXTENSION vector; (since you have to opt in each database on the server that should support the extension).

I do this in my apps using a SQL script (https://github.com/aaronpowell/HanselminutesBot/blob/main/HanselminutesBot.AppHost/database/init.sql) that is added to the DB container image (https://github.com/aaronpowell/HanselminutesBot/blob/main/HanselminutesBot.AppHost/Program.cs#L45) as part of its startup pipeline.
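A minimal sketch of what such an init script might contain. The actual contents of the linked init.sql are not reproduced here, so treat this as an assumption about the approach rather than a copy of that file. It relies on the standard behaviour of the official Postgres container image, which runs *.sql files found in /docker-entrypoint-initdb.d once, when the data directory is first initialised, and on the pgvector image already shipping the extension binaries.

```sql
-- init.sql (illustrative file name): run by the container entrypoint on
-- first start, against the default database of the container.

-- Enable pgvector in this database. CREATE EXTENSION is per database, so
-- each database that stores vectors needs its own statement.
-- IF NOT EXISTS makes the script safe to run more than once.
CREATE EXTENSION IF NOT EXISTS vector;

-- Optional sanity check: list the extension and its installed version.
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';
```

Because init scripts only run against an empty data directory, a container that reuses an existing volume will not pick up later changes to the script.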

AndrewBabbitt97 commented 4 months ago

So this was the closest issue I could find to it, but it looks like we also need the ability to set Microsoft.DBforPostgreSQL/flexibleServers/configurations as a part of the Postgres deployment for many extensions (e.g. citext); otherwise you can't use CREATE EXTENSION at all.

Is there any way to set this property currently?

EF Core can already do CREATE EXTENSION as a part of a migration with Npgsql, but you still have to enable the extensions on the DB server in Azure.

AndrewBabbitt97 commented 4 months ago

More info in the Azure Postgres docs here: https://learn.microsoft.com/en-us/azure/postgresql/flexible-server/concepts-extensions#how-to-use-postgresql-extensions
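As a rough illustration of the two-step requirement described above, the following sketch checks the server-level allowlist before trying to create the extension. It assumes the allowlist is exposed as the azure.extensions server parameter, as the linked docs describe; on a non-Azure server that parameter does not exist, which is why the lookup passes missing_ok = true.

```sql
-- Step 1: inspect the server-level allowlist.
-- current_setting(name, missing_ok) returns NULL instead of raising an
-- error when the parameter is not defined (e.g. on a local container).
SELECT current_setting('azure.extensions', true) AS allowlisted_extensions;

-- Step 2: enable the extension in the current database. On Azure this is
-- expected to fail unless citext appears in the allowlist from step 1.
CREATE EXTENSION IF NOT EXISTS citext;
```

Step 1 only reads the setting. Changing it is a deployment concern (the flexibleServers/configurations resource mentioned above), not something this script does.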

aaronpowell commented 4 months ago

Yep, you need to have some additional Bicep run to enable the extension. I've got a sample of it, just need to clean it up a touch to contribute in.

davidfowl commented 4 months ago

Not sure we want additional Bicep, I'd rather us expand the Azure.Provisioning libraries to support setting this.

aaronpowell commented 4 months ago

> Not sure we want additional Bicep, I'd rather us expand the Azure.Provisioning libraries to support setting this.

Yes, that might be a better option, but presently (or at least, 4 weeks ago when I looked before going on leave 🤣) there isn't support for that in the SDK to do it.

aaronpowell commented 1 month ago

PR open on CDK to add support - https://github.com/Azure/azure-sdk-for-net/pull/44315