**Describe the bug**
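The pgvector `VectorStore` implementation renders metadata filters incorrectly, so filtered similarity searches return no matches. The current implementation wraps JSON string values in double quotes inside the single-quoted SQL literal instead of emitting a plain single-quoted string.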
**To Reproduce**
Steps to reproduce the behavior:
1. Enable query logging in PostgreSQL:
ALTER SYSTEM SET log_statement = 'all';
ALTER SYSTEM SET log_duration = on;
ALTER SYSTEM SET log_min_duration_statement = 0;
Then reload the configuration:
SELECT pg_reload_conf();
2. Run a similarity search with a metadata filter through the pgvector `VectorStore`.
3. Observe in the PostgreSQL log that the filter is rendered as `WHERE (data.cmetadata ->> 'doc_type') = '"earnings_transcript"'` instead of `WHERE (data.cmetadata ->> 'doc_type') = 'earnings_transcript'`.
4. The filter matches nothing, because `->>` returns the value as text without the surrounding double quotes:
db=# WITH filtered_embedding_dims AS MATERIALIZED (
SELECT *
FROM
vs_embeddings
WHERE
vector_dims(embedding) = '1536'
)
SELECT COUNT(*)
FROM
filtered_embedding_dims
JOIN vs_collections ON filtered_embedding_dims.collection_id = vs_collections.uuid
WHERE
vs_collections.name = 'langchain'
AND (filtered_embedding_dims.cmetadata ->> 'doc_type') = '"earnings_transcript"' ;
count
0
(1 row)
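**Expected behavior**
The filter value should be rendered as a plain single-quoted string, so that the same query matches the stored rows: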
db=# WITH filtered_embedding_dims AS MATERIALIZED (
SELECT *
FROM
vs_embeddings
WHERE
vector_dims(embedding) = '1536'
)
SELECT COUNT(*)
FROM
filtered_embedding_dims
JOIN vs_collections ON filtered_embedding_dims.collection_id = vs_collections.uuid
WHERE
vs_collections.name = 'langchain'
AND (filtered_embedding_dims.cmetadata ->> 'doc_type') = 'earnings_transcript' ;
count
26
(1 row)
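A minimal sketch of what I suspect is happening, and of one possible fix. This is not taken from the langchain-rust source: `MetaValue`, `render_literal_buggy`, and `render_literal_fixed` are hypothetical names used only for illustration. The assumption is that the filter value is serialized as JSON (which keeps the double quotes around strings) and then wrapped in single quotes, whereas `->>` returns the bare text.

```rust
use std::fmt;

/// Hypothetical stand-in for a JSON metadata value (not the crate's real type).
#[derive(Debug)]
enum MetaValue {
    Str(String),
    Num(f64),
}

impl fmt::Display for MetaValue {
    /// Simplified JSON-style serialization: strings keep their double quotes.
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            MetaValue::Str(s) => write!(f, "\"{}\"", s),
            MetaValue::Num(n) => write!(f, "{}", n),
        }
    }
}

/// Assumed buggy rendering: wraps the JSON serialization in single quotes,
/// so a string ends up as '"value"'.
fn render_literal_buggy(v: &MetaValue) -> String {
    format!("'{}'", v)
}

/// Possible fix: use the raw string contents (what `->>` returns as text)
/// and escape embedded single quotes by doubling them.
fn render_literal_fixed(v: &MetaValue) -> String {
    let text = match v {
        MetaValue::Str(s) => s.clone(),
        other => other.to_string(),
    };
    format!("'{}'", text.replace('\'', "''"))
}

fn main() {
    let doc_type = MetaValue::Str("earnings_transcript".to_string());
    let dims = MetaValue::Num(1536.0);

    println!("buggy: (cmetadata ->> 'doc_type') = {}", render_literal_buggy(&doc_type));
    println!("fixed: (cmetadata ->> 'doc_type') = {}", render_literal_fixed(&doc_type));
    println!("fixed: (cmetadata ->> 'dims') = {}", render_literal_fixed(&dims));
}
```

Passing the value as a bound query parameter instead of interpolating it into the SQL string would likely avoid the quoting problem altogether, but I have not checked how the filter builder is structured.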
**Desktop**
- OS: macOS Ventura 13.6.9
- Version: langchain-rust = { version = "4.6.0", features = ["postgres"] }