Trivadis / plsql-cop-cli

db* CODECOP Command Line

Parser support for shorthand operators (vector functionality) #34

Open rolandstirnimann opened 1 month ago

rolandstirnimann commented 1 month ago

The parser should support the shorthand distance operators introduced with Oracle 23.4 (vector functionality). Currently, the code below leads to parse errors. As a workaround, the classical function syntax can be used.

SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <-> :query_vector -- shorthand for EUCLIDEAN
 FETCH FIRST 4 ROWS ONLY;

SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <=> :query_vector -- shorthand for COSINE
 FETCH FIRST 4 ROWS ONLY;

SELECT doc_id, chunk_id, chunk_data
  FROM doc_chunks
 ORDER BY chunk_embedding <#> :query_vector -- shorthand for DOT
 FETCH FIRST 4 ROWS ONLY;

PhilippSalvisberg commented 1 month ago

Yes, these shorthand operators for distances were introduced in 23.4 and documented in May 2024. Therefore, I consider this an enhancement request, not a bug.

Here are the examples using the "classical" syntax (functions instead of operators) that do not cause parse errors in version 5.0.1:

-- alternative for <->
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, euclidean)
fetch first 4 rows only;

-- alternative for <=>
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, cosine)
fetch first 4 rows only;

-- alternative for <#>
select doc_id, chunk_id, chunk_data
  from doc_chunks
 order by vector_distance(chunk_embedding, :query_vector, dot)
fetch first 4 rows only;
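
To compare the three metrics side by side, the same function syntax can also be used in the select list. This is only a minimal sketch, assuming the same doc_chunks table and :query_vector bind variable as in the examples above. The column aliases (dist_euclidean, dist_cosine, dist_dot) are made up for illustration, and the query has not been run through the parser here.

```sql
-- sketch: one column per distance metric, function syntax only
-- (aliases are hypothetical; the trailing comments name the shorthand operator each call stands in for)
select doc_id,
       chunk_id,
       vector_distance(chunk_embedding, :query_vector, euclidean) as dist_euclidean, -- <->
       vector_distance(chunk_embedding, :query_vector, cosine)    as dist_cosine,    -- <=>
       vector_distance(chunk_embedding, :query_vector, dot)       as dist_dot        -- <#>
  from doc_chunks
 order by dist_cosine
 fetch first 4 rows only;
```

Since it contains no shorthand operators, it should fall under the same workaround as the three queries above.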