terrajobst / nquery-vnext

A Roslyn inspired rewrite of NQuery
MIT License
72 stars 16 forks source link

Fix NOT IN( ... ) #25

Closed StevenThuriot closed 5 years ago

StevenThuriot commented 6 years ago

[NOT] IN (...) syntax always translates to an Equality expression, while NOT IN(...) should actually translate to a NotEqual one.

terrajobst commented 5 years ago

@StevenThuriot

Nice! I've rebased the branch to get rid of the little detour; I'll add a test and then it's ready to be merged.

dallmair commented 5 years ago

Wait a sec, IIRC I played a bit with this and found it's not entirely correct. Didn't fully finish my analysis though, therefore I didn't comment here yet. Let me dig up my notes in the next days, hopefully I'll be able to spend a few minutes tomorrow.

dallmair commented 5 years ago

Nevertheless very well spotted @StevenThuriot, Thanks! Although I ran plenty of queries with vNext I didn't find this bug...

terrajobst commented 5 years ago

@dallmair

Wait a sec, IIRC I played a bit with this and found it's not entirely correct.

You mean this PR or yours? I'e spent some time yesterday and I think there are issues with it. I'll comment on it.

dallmair commented 5 years ago

You mean this PR or yours? I'e spent some time yesterday and I think there are issues with it. I'll comment on it.

Oh, mine (#27) is not perfect for sure, as it tends towards less optimization in favor of correctness, so I'm sure there's a better solution as you already wrote over there. But this one is about NOT IN.

Here are my notes per test cases [1]. Reasoning: If there is one place where NOT is implemented incorrectly, there might be several :) Therefore I've tested these cases:

So the good news is that this PR is fine and can (and should!) be merged. It does not fix all broken cases of NOT IN (…), though.

[1]

SELECT  *
FROM    Employees e
WHERE   e.ReportsTo /*NOT*/ BETWEEN 1 AND 3

SELECT  *
FROM    Employees e
WHERE   e.ReportsTo IS /*NOT*/ NULL

SELECT  *
FROM    Customers c
WHERE   c.Region /*NOT*/ LIKE 'S%'

SELECT  *
FROM    Customers c
WHERE   c.Region /*NOT*/ SIMILAR TO 'S'

SELECT  *
FROM    Customers c
WHERE   c.Region /*NOT*/ SOUNDS LIKE 'SP'

SELECT  *
FROM    Customers c
WHERE   c.Region /*NOT*/ IN ('BC', 'SP')

SELECT  *
FROM    Employees e
WHERE   e.ReportsTo /*NOT*/ IN (SELECT RegionID FROM Region)
dallmair commented 5 years ago

@terrajobst: The test you wrote focuses on testing the pure negation, which is exactly what was simply missing in the implementation here. The queries I used also ensure the negation is evaluated correctly when there are NULL values in the filter column, which is the problem in the NOT IN (SELECT …) case.

terrajobst commented 5 years ago
  • (NOT) IN query expression -> still broken, not addressed by this PR

Can you point out the test case that is broken? I've added tests to validate that [NOT] IN <query> also works. I've also (manually) validated your example:

SELECT  e.EmployeeId
FROM    Employees e
WHERE   e.ReportsTo /*NOT*/ IN (SELECT RegionID FROM Region)

It returns { 1, 3, 4, 5, 8 }. Adding a NOT it returns { 2, 6, 7, 9 } which seems correct. It also looks like the binder is adding the negation when it lowers IN <query> to ALL|ANY <query>.

Am I missing something?

terrajobst commented 5 years ago

I wrote:

Am I missing something?

Yes, as @dallmair said above:

The queries I used also ensure the negation is evaluated correctly when there are NULL values in the filter column, which is the problem in the NOT IN (SELECT …) case.

I've pushed 8464154 that will fix this.

dallmair commented 5 years ago

Perfect, Thanks!