theseus-rs / postgresql-embedded

Embed PostgreSQL database
Apache License 2.0
43 stars 5 forks

Add support for pgvecto.rs and other extensions. #47

Open ShelbyJenkins opened 4 months ago

ShelbyJenkins commented 4 months ago

First of all, very nice project. I got it working when the other option I tried did not work.

I wouldn't normally make a feature request like this, but after looking at the code for the archive module, I think it would be fairly easy to download and install PG extensions.

This could be a solid use case for this project because it would allow you to easily embed a vectorDB in a Rust app. Right now the Rust options for vectorDBs are all client/server, and there are no embedded/in-process vectorDBs for Rust. I know PG is not in-process, but what could be accomplished is a vectorDB that can be installed with cargo, which would be close enough!

brianheineman commented 4 months ago

@ShelbyJenkins Thank you for suggesting this use case. I have been thinking about what the implications of adding extension support might look like for this project. The release archives for pgvecto.rs look like they would lend themselves to automated downloading and installing. This is an area I would like to look into, but I cannot commit to if or when it may happen.

bobmcwhirter commented 3 months ago

+1, we'd also be interested in easy embedding with Trusted Language Extensions (AWS, for rustpl etc.) and Apache AGE (for graphy stuff).

dukeeagle commented 1 month ago

Would love to see this as well! @brianheineman - is this project accepting sponsorship? I would love to more formally support it, especially as you consider extensions like this.

brianheineman commented 1 month ago

Thanks for the suggestion @dukeeagle; I went ahead and enabled sponsorship on my personal profile. I would like to support downloading/installing (bundling?) PostgreSQL plugins so they can easily be used in another project I've been working on, rsql. The PostgreSQL plugin/FDW ecosystem is pretty diverse, with different build and distribution systems, and I want to avoid taking on the effort of building and releasing every plugin. From what I have seen of other projects, they just build/bundle what they need and keep adding as they go. For this project, I would instead like to provide a solution where the plugin artifacts can be pulled from another location, similar to how the PostgreSQL binaries are handled.

I'm interested in any feedback/comments/concerns folks have.

brianheineman commented 1 month ago

Some extensions require that PostgreSQL is compiled with specific compiler options (e.g. --with-gssapi) enabled. I just released 0.11.0 of the crate to support custom PostgreSQL archives. This will allow users to create their own PostgreSQL binaries and include any required plugins in the archive. The release archive should be hosted on GitHub with the same name pattern and format as those provided by postgresql-binaries. The custom URL can be specified in Settings: https://github.com/theseus-rs/postgresql-embedded/blob/main/postgresql_embedded/src/settings.rs#L21-L22.

I am planning on continuing to look into support for downloading/installing custom extensions, but this new feature can be used in the interim (and may still be required for some extensions in the future).
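For anyone preparing a custom archive as described above, here is a minimal sketch of how the asset name and download URL for a self-hosted GitHub release might be assembled. Everything in it is an assumption for illustration: the `CustomRelease` type and its fields are hypothetical and not part of the postgresql_embedded crate, the asset pattern `postgresql-{version}-{target}.tar.gz` is modeled on how the postgresql-binaries assets appear to be named, and the release tag is assumed to equal the version. Check the actual postgresql-binaries releases and the linked settings.rs before relying on any of it.

```rust
/// Hypothetical description of a custom release; none of these names come
/// from the postgresql_embedded crate.
struct CustomRelease<'a> {
    owner: &'a str,   // GitHub user or organization hosting the release
    repo: &'a str,    // repository that holds the release assets
    version: &'a str, // PostgreSQL version, ASSUMED to also be the release tag
    target: &'a str,  // Rust target triple the binaries were built for
}

impl CustomRelease<'_> {
    /// Asset file name, ASSUMING the pattern
    /// `postgresql-{version}-{target}.tar.gz`.
    fn asset_name(&self) -> String {
        format!("postgresql-{}-{}.tar.gz", self.version, self.target)
    }

    /// Repository URL, i.e. the kind of custom location the comment above
    /// says can be specified in Settings.
    fn releases_url(&self) -> String {
        format!("https://github.com/{}/{}", self.owner, self.repo)
    }

    /// Full download URL, using GitHub's standard
    /// `/releases/download/{tag}/{asset}` layout.
    fn download_url(&self) -> String {
        format!(
            "{}/releases/download/{}/{}",
            self.releases_url(),
            self.version,
            self.asset_name()
        )
    }
}
```

Building a `CustomRelease` for your own repository and comparing `asset_name()` against the file you actually uploaded is a quick way to catch a naming mismatch before the crate tries to download the archive.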

spikecodes commented 2 days ago

I made a toy repo showing an implementation of a postgresql-embedded db with pgvecto.rs installed that passes various vector operation test cases: https://github.com/portalcorp/pgevdb

Hope this helps @ShelbyJenkins @bobmcwhirter @dukeeagle

brianheineman commented 1 day ago

@spikecodes thanks for providing an example for folks! In order to provide more transparency for this effort, I have created a draft PR #110 that I will update with changes as I go so folks can monitor progress and/or provide feedback.