Tracking issue for schema

usamoi commented 9 months ago

See https://docs.pgvecto.rs/reference/schema.html for the schema provided by pgvecto.rs.

Functions

pgvecto.rs lacks helper functions, and we need to provide more of them.

util functions
- [ ] function vector_dims(vector) RETURNS int, get dimensions of a vector #463 #485
- [ ] function normalize(vector) RETURNS vector, normalize a vector #485
- [ ] function vector_norm(vector) RETURNS real, $L^2$ norm of a vector #463 #485
- [ ] aggregate function avg(vector) and sum(vector) #463
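
As a rough illustration of what vector_dims, vector_norm and normalize above would compute, here is a minimal sketch on plain `f32` slices. It is not the pgvecto.rs implementation: the slice-based signatures, and the choice to return a zero vector unchanged in `normalize`, are assumptions made for this example only.

```rust
/// Number of components in the vector (stand-in for `vector_dims`).
fn vector_dims(v: &[f32]) -> usize {
    v.len()
}

/// L2 (Euclidean) norm: square root of the sum of squared components.
fn vector_norm(v: &[f32]) -> f32 {
    v.iter().map(|x| x * x).sum::<f32>().sqrt()
}

/// Scale the vector to unit L2 norm.
/// Assumption: a zero vector is returned unchanged to avoid dividing by zero.
fn normalize(v: &[f32]) -> Vec<f32> {
    let norm = vector_norm(v);
    if norm == 0.0 {
        return v.to_vec();
    }
    v.iter().map(|x| x / norm).collect()
}
```

For the vector `[3.0, 4.0]` this gives 2 dimensions and a norm of 5.0, and the normalized result has norm 1 up to rounding.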
casts
- [x] casts between vector -> vecf16: #262 #266
- [x] casts between vecf16 -> vector, vector -> svector, svector -> vector: #299
- ~casts between vecf16 and half precision[]: PostgreSQL does not support half precision[], so we may need to make some decisions here~ not supported
embedding functions
- [x] openai: #350 tensorchord/pgvecto.rs-docs#56

Types

- [ ] text representation of vector types, not documented
- [ ] memory representation of vector types, not documented
- [ ] binary representation of vector types, not documented
- [ ] subscript behaviors of vector types, not documented

Schema

- [x] install extension with schema #270

CI

- [ ] check for a CREATE FUNCTION export_name statement that uses the name _vectors_*_wrapper but whose SQL is inconsistent with the signature defined in the code
- [ ] check that a correct install script is located in ./src/install for a release

Contributing

If you want to export a user function to PostgreSQL:

use pgrx::pg_extern and name the function _vectors_*
write CREATE FUNCTION export_name in finalize.sql; you can use the name _vectors_*_wrapper here.

If you want to export a cast, operator or something else to PostgreSQL:

use pgrx::pg_extern and name the function _vectors_*
write the rest of the SQL in finalize.sql

VoVAllen commented 9 months ago

Test logical replication between different extension versions

JoePassanante commented 9 months ago

Can you elaborate on

casts between vecf16 and half precision; PostgreSQL do not support half precision, we may make some decisions here

Is this an issue with postgres itself, or is it something that hasn't been implemented in pgvecto.rs yet?

usamoi commented 9 months ago

Is this an issue with postgres itself?

Yes. PostgreSQL supports the floating-point types real (IEEE 754 binary32) and double precision (IEEE 754 binary64), but it has no type named half precision (IEEE 754 binary16), so we probably cannot provide a cast between vecf16 and an array of half-precision values. Another option is to provide a cast between vecf16 and real[]; that is what "make some decisions here" referred to.
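
To illustrate the real[] option: every IEEE 754 binary16 value is exactly representable as binary32, so widening the elements of a vecf16 to real loses nothing (the reverse direction has to round). A minimal, dependency-free sketch of the widening step follows. The function names and the raw `u16` bit-pattern input are hypothetical and chosen for this example; they are not pgvecto.rs APIs.

```rust
/// Widen one IEEE 754 binary16 value, given as raw bits, to f32.
/// The conversion is exact for every finite input.
fn f16_bits_to_f32(bits: u16) -> f32 {
    let sign = ((bits >> 15) & 1) as u32;
    let exp = ((bits >> 10) & 0x1f) as u32;
    let frac = (bits & 0x3ff) as u32;
    let out = match (exp, frac) {
        // Signed zero.
        (0, 0) => sign << 31,
        // Subnormal: value is frac * 2^-24, which is a normal f32.
        (0, _) => {
            let v = (frac as f32) * 2.0f32.powi(-24);
            return if sign == 1 { -v } else { v };
        }
        // Infinity.
        (0x1f, 0) => (sign << 31) | 0x7f80_0000,
        // NaN: keep the payload bits and force a quiet NaN.
        (0x1f, _) => (sign << 31) | 0x7fc0_0000 | (frac << 13),
        // Normal number: rebias the exponent (15 -> 127), widen the fraction.
        _ => (sign << 31) | ((exp + 127 - 15) << 23) | (frac << 13),
    };
    f32::from_bits(out)
}

/// Hypothetical element-wise widening, as a vecf16 -> real[] cast might do.
fn f16_slice_to_f32(bits: &[u16]) -> Vec<f32> {
    bits.iter().map(|&b| f16_bits_to_f32(b)).collect()
}
```

For example, the bit patterns `0x3c00` and `0xc000` widen to 1.0 and -2.0.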

JoePassanante commented 9 months ago

I messaged on Discord for some help getting the environment set up. But isn't this as simple as:

#[pgrx::pg_extern(immutable, parallel_safe, strict)]
fn _vectors_cast_vecf32_to_vecf16(
    vector: Vecf32Input<'_>,
    _typmod: i32,
    _explicit: bool,
) -> Vecf16Output {
    let data = vector
        .data()
        .iter()
        .map(|x| x.to_f32())
        .map(|x| f16::from_f32(x))
        .collect();

    Vecf16::new_in_postgres(data)
}

CREATE CAST (vector AS vecf16)
    WITH FUNCTION _vectors_cast_vecf32_to_vecf16(vector, integer, boolean) AS IMPLICIT;

usamoi commented 9 months ago

Yes.

casts between vector and vecf16

casts between vecf16 and half precision[]

These are two separate issues. We can discuss only the first one here, since that is the one you need.

VoVAllen commented 9 months ago

I think float(2) is the same as fp16?

usamoi commented 9 months ago

I think float(2) is the same as fp16?

https://www.postgresql.org/docs/current/datatype-numeric.html#DATATYPE-FLOAT

I don't think so.
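
For context, the linked page states that the p in float(p) is a minimum precision in binary digits, not a size in bytes: float(1) through float(24) select real, and float(25) through float(53) select double precision. A small sketch of that mapping, where the function name is hypothetical and used only for illustration:

```rust
/// Map the precision argument of PostgreSQL's float(p) to the type it selects.
/// Returns None for values outside the allowed range, which PostgreSQL rejects.
fn resolve_float_precision(p: u32) -> Option<&'static str> {
    match p {
        1..=24 => Some("real"),
        25..=53 => Some("double precision"),
        _ => None,
    }
}
```

Under this rule float(2) resolves to real, a 32-bit type, so it does not give a 16-bit float.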

my-vegetable-has-exploded commented 6 months ago

Is there a more specific definition of absolute-value norm?

usamoi commented 6 months ago

Is there a more specific definition of absolute-value norm?

https://en.wikipedia.org/wiki/Norm_(mathematics)

my-vegetable-has-exploded commented 6 months ago

Is there a more specific definition of absolute-value norm?

https://en.wikipedia.org/wiki/Norm_(mathematics)

I did not grasp the definition. How should we calculate it for a one-dimensional vector?
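
For reference, a worked equation under the $L^2$ definition used for vector_norm earlier in this issue (a sketch of the standard definition, not a statement from the maintainers): with a single component, the norm reduces to the absolute value of that component.

```latex
\|x\|_2 = \sqrt{\sum_{i=1}^{n} x_i^2},
\qquad
n = 1 \;\Longrightarrow\; \|x\|_2 = \sqrt{x_1^2} = |x_1|
```

So a one-dimensional vector $[-7]$ would have norm $7$.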
