partiql / partiql-lang

The PartiQL language specification
https://partiql.org/partiql-lang
Other
10 stars 1 forks source link

Section 9.2, scalar subquery coercion is incorrect for row value expressions. #5

Open RCHowell opened 1 year ago

RCHowell commented 1 year ago

Page 34 states,

SELECT VALUE {'foo': v.foo}
FROM anotherDataSet v
WHERE (v.a, v.b) = (SELECT VALUE [w.c, w,d]    -- call this `subq`
                    FROM someDataSet w
                    WHERE w.sth = v.sthelse)

The SELECT VALUE constructor is an array, which means the type of subq is <bag<list<w.c,w.d>>. This is incorrect and not SQL compatible. What this section should read is something along the lines of

> A subquery with the SQL-SELECT coerces into an array when it is the rhs (respectively, lhs) of 
a comparison operator whose other argument is an array.

The value is coerced to an array via COLL_TO_ROW which asserts the bag has one tuple and each binding tuple 
value becomes an array element in order via `{ a_0: v_0, ...., a:_n: v_n } -> [ v_0, ..., v_n ]`.  

We must also include static assertion of row value comparison compatibility.

Update

After implementing as a planning step, I have missed the importance of " (as opposed to a subquery starting with SELECT VALUE or PIVOT)" from section 9 which essentially says only coerce SELECT subqueries. For compatibility with SQL and row-value queries, I suggest the rewrite of COLL_SCALAR is generalized to

-- example 1, lhs is scalar

1 = (SELECT a FROM T)

1 = ONE(SELECT VALUE a FROM T)           -- this is different than normal SELECT to SELECT-VALUE rewrite

-- example 2, lhs is array

(1,2) = (SELECT a, b FROM T)             -- we can check degree before applying the rewrite

(1,2) = ONE(SELECT VALUE [a, b] FROM T)  -- 

(1,2) = ONE(SELECT * FROM T)             -- we are able to check degree here with the resolved plan

-- `ONE` returns the one and only element of the collection, throwing a runtime exception if more than one