prestodb / presto

The official home of the Presto distributed SQL query engine for big data
http://prestodb.io
Apache License 2.0
16.02k stars 5.36k forks source link

Fix EffectivePredicateExtractor to return actual predicates #8797

Closed sopel39 closed 5 years ago

sopel39 commented 7 years ago

Currently EPE returns predicates for projections that don't account for null, e.g: for Project a := b, EPE returns a = b (instead of a=b or a is null). This causes various problems (e.g: see https://github.com/prestodb/presto/commit/c18dd0b) and makes the logic tricky and delicate.

EPE is used along with EqualityInference which produces predicates based on EPE results. Since EPE doesn't really produce predicates (but rather equivalence classes for projections) EqualityInference conclusions are based on wrong assumptions.

The proposal is to fix EPE and append or is null for predicates that are derived from projections (and that can return null). This would remove need to https://github.com/prestodb/presto/commit/c18dd0b, which seems like a workaround for EPE issue.

What do you think @martint ?

FYI: @findepi @rschlussel

Related to https://github.com/prestodb/presto/issues/8796 since it touches same code.

rschlussel-zz commented 7 years ago
  1. the predicate should bea=b or a is null and b is null
  2. I tried doing this once, and it caused some problem because of EPE being used both for deriving equivalences and for producing predicates. Unfortunately I don't remember what didn't work about it, but my conclusion was to do it right we'd need to separate the two features.
sopel39 commented 7 years ago

Interesting, did it cause test failures?

sopel39 commented 7 years ago

Another approach would be to make EqualityInference produce actual predicates based on equivalences (e.g: a = b or (a is null and b is null) and just call EPE an equivalence extractor

rschlussel-zz commented 7 years ago

I believe there were failures, but i don't remember with certainty. I think the problem maybe related to visitProject(). I should have made a note somewhere with my findings.

sopel39 commented 7 years ago

Also, I wonder if https://github.com/prestodb/presto/commit/c18dd0b doesn't support more cases since it only checks if function can return null on non null input? Therefore it can correctly support predicates that are derived from projections.

stale[bot] commented 5 years ago

This issue has been automatically marked as stale because it has not had any activity in the last 2 years. If you feel that this issue is important, just comment and the stale tag will be removed; otherwise it will be closed in 7 days. This is an attempt to ensure that our open issues remain valuable and relevant so that we can keep track of what needs to be done and prioritize the right things.