marijnh / Postmodern

A Common Lisp PostgreSQL programming interface
http://marijnhaverbeke.nl/postmodern
Other
392 stars 90 forks source link

Ideas / feature requests #327

Closed ankane closed 6 months ago

ankane commented 7 months ago

Hi, big thanks for this project! I just started learning Lisp and it was super easy to use.

I put together an example for pgvector (an extension that adds vector similarity search to Postgres) and had a few ideas for features:

  1. A load-extension function - (:load-extension 'vector), similar to load-uuid-extension, which runs CREATE EXTENSION IF NOT EXISTS "vector"
  2. create-table support for custom column types with a typmod, like vector(3) (you can currently use (varchar 3) to generate VARCHAR(3), but I couldn't find a way to make it work with a custom type)
  3. create-index support for opclasses - CREATE INDEX ON items USING gin (metadata jsonb_path_ops)
  4. create-index support for storage parameters - CREATE INDEX ON items (name) WITH (fillfactor=50)
  5. Include distance operators by default, like <->, <#>, and <=> (PostGIS, cube) - this is currently possible with (register-sql-operators :2+-ary :<-> :<#> :<=>)
sabracrolleton commented 7 months ago
  1. Agree we should have a generic load-extension function. It will go into the next update.
  2. With respect to custom column types in create-table, Are you are talking about the create-table function that looks at a dao-class for definitions or the :create-table s-sql operator? I will see if I have time to look at it next weekend. 3,4 added create-index support should be relatively straight forward.
  3. Re distance operators. What do you think should happen if PostGIS or cube are not loaded?
ankane commented 7 months ago

Hi @sabracrolleton, thanks for the response!

For 2, the s-sql operator. I'm essentially trying to do something like:

(:create-table 'items
    ((id :type bigserial :primary-key t)
    (embedding :type (vector 3))))

but it currently generates VECTOR instead of VECTOR(3).

For the distance operators, I think it's fine to generate the SQL whether or not they're installed (which is currently what happens now for PostGIS and hstore operators).

sabracrolleton commented 7 months ago

For 2, what you are looking for is adding additional methods to the generic method sql-type-name in the s-sql.lisp file.

I am reluctant to call the type VECTOR (would prefer PGVECTOR) because of the potential for confusing this with a CL vector (a one dimensional array).

ankane commented 7 months ago

I don't think there should be anything specific to pgvector, but it'd be nice to support custom Postgres types (like those from extensions - for instance FOO(3)).

sabracrolleton commented 7 months ago

That is the advantage of having s-sql:sql-type-name as a generic function (and the reason it is exported). In your own code, you can add an appropriate method for your custom type. In this example I will just define a method for pgvector with a fallback default:

(defmethod s-sql:sql-type-name ((lisp-type (eql 'pgvector)) &rest args)
                 (cond (args (format nil "VECTOR(~A)" (car args)))
                       (t "VECTOR(1)")))

This will then allow you to call :create-table with the appropriate result. The following example uses pomo:sql to show what the sql statement will look like:

(pomo:sql (:create-table 'items ((id :type bigserial :primary-key t)
                                 (embedding :type (pgvector 3)))))
"CREATE TABLE items (id BIGSERIAL NOT NULL PRIMARY KEY , embedding VECTOR(3) NOT NULL)"

Just to show the fallback situation, if for some reason you did not provide the numeric parameter:

(pomo:sql (:create-table 'items ((id :type bigserial :primary-key t)
                                 (embedding :type (pgvector)))))
"CREATE TABLE items (id BIGSERIAL NOT NULL PRIMARY KEY , embedding VECTOR(1) NOT NULL)"

You can create new methods for sql-type-name for as complicated custom types as you desire, you just need to decide what the format string should look like. Does that get you what you want?

ankane commented 7 months ago

Thanks @sabracrolleton, that seems to work. I think it'd nice for the generic function to do it by default (treating additional arguments as a typmod rather than dropping them).

(colname :type foo)       -> FOO
(colname :type (foo 3))   -> FOO(3)
(colname :type (foo 2 3)) -> FOO(2,3)
sabracrolleton commented 7 months ago

That seems reasonable. I am trying to remember why it was not done the current way in the first place.

sabracrolleton commented 7 months ago

Looking back at the code, sql-type-name is indirectly used in a couple of different contexts, one of which can conflict with your request on the default method. I will need to take more time testing to ensure I do not create more problems than I solve. The rest of your enhancement requests I should be able to get done this weekend.

ankane commented 7 months ago

Awesome, thanks @sabracrolleton!

sabracrolleton commented 6 months ago

Progress report. Edited: Will be done tomorrow.

sabracrolleton commented 6 months ago

Committed

ankane commented 6 months ago

Awesome, big thanks @sabracrolleton!

For 1, it looks like load-extension was added to the docs, but not the code. Submitted a PR in #330.

sabracrolleton commented 6 months ago

Arrgh. That was embarrassing.