uwescience / raco

Compilation and rule-based optimization framework for relational algebra. Raco is the language, optimization, and query translation layer for the Myria project.

bug in Datalog expressions #177

Open stechu opened 10 years ago

stechu commented 10 years ago
A(x) :- R(x,3), x>3+1.

will compile to a wrong RA plan and a wrong Myria plan. The expression 3+1 will be ignored. Is this by design or is this a bug?

dhalperi commented 10 years ago

In the future, you can go ahead and tag this type of issue a bug. You wrote a valid Datalog program by all reasonable assumptions.

  1. The x>3+1 clause is NOT ignored; it is parsed incorrectly. The logical plan is: A = Project($0)[Select((($1 = 3) and ($0 > 3)))[Scan(public:adhoc:R)]]. The comparison becomes $0 > 3, so the +1 is dropped.
  2. If we add parentheses to the clause (which ought to be valid):

    A(x) :- R(x,3), x > (3+1)

    then the clause is ignored: A = Project($0)[Select(($1 = 3))[Scan(public:adhoc:R)]]

Definitely a bug.
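For illustration only, here is a minimal sketch of one way both symptoms can arise. It does not use pyparsing and is not the code in datalog/grammar.py; `tokenize`, `parse_flat`, and `Parser` are hypothetical names made up for this example. The assumption is that the right-hand side of a comparison is matched as a single term rather than as a recursive arithmetic expression.

```python
import re

# Hypothetical toy tokenizer: integers, identifiers, and single-char operators.
TOKEN = re.compile(r"\s*(\d+|[A-Za-z_]\w*|[-+*/()<>=])")
ATOM = re.compile(r"\d+|[A-Za-z_]\w*")


def tokenize(text):
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        m = TOKEN.match(text, pos)
        if m is None:
            raise SyntaxError(f"unexpected input: {text[pos:]!r}")
        tokens.append(m.group(1))
        pos = m.end()
    return tokens


def parse_flat(tokens):
    """Non-recursive rule (assumed failure mode): term op term.

    Trailing tokens are silently dropped, and a right-hand side that is
    not a bare term makes the whole condition fail to match.
    """
    if len(tokens) < 3 or not ATOM.fullmatch(tokens[2]):
        return None
    return (tokens[1], tokens[0], tokens[2])


class Parser:
    """Recursive-descent version: both sides are full expressions."""

    def __init__(self, tokens):
        self.tokens, self.pos = tokens, 0

    def peek(self):
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def take(self):
        tok = self.peek()
        if tok is None:
            raise SyntaxError("unexpected end of input")
        self.pos += 1
        return tok

    def condition(self):
        left = self.expr()
        op = self.take()
        if op not in ("<", ">", "="):
            raise SyntaxError(f"expected comparison, got {op!r}")
        right = self.expr()
        if self.peek() is not None:
            raise SyntaxError(f"trailing token {self.peek()!r}")
        return (op, left, right)

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.take()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.atom()
        while self.peek() in ("*", "/"):
            op = self.take()
            node = (op, node, self.atom())
        return node

    def atom(self):
        tok = self.take()
        if tok == "(":
            node = self.expr()  # recursion: parentheses hold a full expression
            if self.take() != ")":
                raise SyntaxError("expected ')'")
            return node
        if not ATOM.fullmatch(tok):
            raise SyntaxError(f"unexpected token {tok!r}")
        return tok
```

With the flat rule, `parse_flat(tokenize("x>3+1"))` keeps only `x > 3`, and `parse_flat(tokenize("x > (3+1)"))` matches nothing, which mirrors the two plans above. With the recursive rule, `Parser(tokenize(...)).condition()` keeps the `3 + 1` subtree in both cases and raises on leftover tokens instead of discarding them.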

dhalperi commented 10 years ago

(Changed the name accordingly.)

dhalperi commented 10 years ago

I started branch fix-177 for this bug, and added a test-driven development commit that triggers it.

dhalperi commented 10 years ago

... and with a half-hour of poking at pyparsing in datalog/grammar.py, I was unable to figure out what the deal was. Looks like there are issues with groundcondition and condition related to recursive definitions (they don't recurse).

@billhowe wrote this code, maybe he has insight?

dhalperi commented 10 years ago

also fyi @7andrew7

dhalperi commented 10 years ago

FWIW, I think the Datalog language component of Raco is essentially abandoned. It has lots of these types of lurking bugs and I don't think anyone is working on them.

dansuciu commented 10 years ago

What’s Raco?

On Apr 28, 2014, at 9:06 AM, Daniel Halperin notifications@github.com wrote:

FWIW, I think the Datalog language component of Raco is essentially abandoned. It has lots of these types of lurking bugs and I don't think anyone is working on them.

dhalperi commented 10 years ago

Raco is the name of the Python compiler for Myria.

domoritz commented 10 years ago

Raco is the Relational Algebra COmpiler. It's another name for the datalog/myrial/SQL compiler. We plan to rename this repository from datalogcompiler to raco to avoid confusion soon.

dhalperi commented 10 years ago

(Relational Algebra COmpiler.)

On Mon, Apr 28, 2014 at 9:36 AM, Dominik Moritz notifications@github.com wrote:

Raco is the relational compiler. It's another name for the datalog/myrial/SQL compiler. We plan to rename this repository from datalogcompiler to raco to avoid confusion soon.

billhowe commented 10 years ago

I think Brandon may have done a bit of noodling here.

But certainly the grammar (and model.py) needs an overhaul.

If we have datalog test cases that fail, I can try to take a crack at refreshing/refactoring to make them work.

dhalperi commented 10 years ago

Yep- we need:

  1. A comprehensive suite of test cases aimed at identifying as many bugs as possible.
  2. Someone committed to taking charge of the datalog part of the code.

7andrew7 commented 10 years ago

A while back, I created some test infrastructure for datalog; see /raco/datalog/query_tests.py. Obviously, the test suite is incomplete. It would be valuable to add failing test cases even without finding the fix (appropriately labeled so as not to cause the test suite to fail). This is the "test-driven development" strategy Dan alluded to in his branch.
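As a rough sketch of what "appropriately labeled" could look like with the standard `unittest` module: the test below is marked as an expected failure, so it documents the bug without turning the suite red. `compile_rule_stub` is a hypothetical stand-in that just returns the plan reported earlier in this issue; it is not the real compiler entry point, and the actual tests in query_tests.py may be structured differently.

```python
import unittest


def compile_rule_stub(rule):
    """Hypothetical stand-in for the Datalog compiler.

    It returns the (incorrect) logical plan reported in this issue so the
    example is self-contained; a real test would call the compiler instead.
    """
    return ("A = Project($0)[Select((($1 = 3) and ($0 > 3)))"
            "[Scan(public:adhoc:R)]]")


class TestArithmeticInCondition(unittest.TestCase):

    @unittest.expectedFailure  # known bug: issue #177
    def test_arithmetic_rhs_is_kept(self):
        plan = compile_rule_stub("A(x) :- R(x,3), x>3+1.")
        # The buggy plan compares $0 against 3 and drops the +1.
        self.assertNotIn("($0 > 3)", plan)


def run():
    loader = unittest.defaultTestLoader
    suite = loader.loadTestsFromTestCase(TestArithmeticInCondition)
    return unittest.TextTestRunner(verbosity=0).run(suite)
```

Calling `run()` reports the test as an expected failure and the run still counts as successful. Once the bug is fixed, the test shows up as an unexpected success, which is the cue to remove the decorator.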

dhalperi commented 10 years ago

I added another test for #107 to the referenced branch. I guess that branch is now poorly-named, so I'll delete it and switch to fix-datalog-bugs.

dhalperi commented 10 years ago

Link: https://github.com/uwescience/datalogcompiler/tree/fix-datalog-bugs