cmu-db / peloton

The Self-Driving Database Management System
http://pelotondb.io
Apache License 2.0
2.03k stars 623 forks source link

Peloton crashes on invalid INSERT statements instead of throwing an error #829

Open tcm-marcel opened 7 years ago

tcm-marcel commented 7 years ago

The function planner::InsertPlan::InsertPlan checks the validity of INSERT statements only with assertions instead of throwing an exception that can be handled.

Example 1

CREATE TABLE A (val1 INT, val2 INT,  val3 INT,  val4 INT);
INSERT INTO A (val1, val2, val3, val4, val5) VALUES (1, 1, 1, 1, 1)
peloton: peloton/src/planner/insert_plan.cpp:109: peloton::planner::InsertPlan::InsertPlan(peloton::storage::DataTable*, std::vector<char*>*, std::vector<std::vector<peloton::expression::AbstractExpression*>*>*): Assertion `(columns->size() <= table_schema->GetColumnCount())' failed.

Example 2

CREATE TABLE A (val1 INT, val2 INT,  val3 INT,  val4 INT);
INSERT INTO A (id, val) VALUES (1, 1, 1, 1);
peloton: peloton/src/planner/insert_plan.cpp:122: peloton::planner::InsertPlan::InsertPlan(peloton::storage::DataTable*, std::vector<char*>*, std::vector<std::vector<peloton::expression::AbstractExpression*>*>*): Assertion `(col_cntr != INVALID_OID)' failed.

Ubuntu 16.04 x86_64 gcc

apavlo commented 7 years ago

@hzxa21 @chenboy @luochenUmich -- Can you help @tcm-marcel with this problem?

tcm-marcel commented 7 years ago

How does peloton in general react to invalid inputs at the moment? I just saw that adding an existing table also leads to a crash. Is this a general issue that invalid inputs aren't handled correctly?

apavlo commented 7 years ago

This is likely. @haojin2 is working on a related fix in #808

haojin2 commented 7 years ago

I think if the query is having invalid syntax then it's the parser's responsibility to handle it. If it has correct syntax but the underlying operation is illegal, then other parts should be responsible for handling that. I'm now working on throwing exceptions for bad syntax, not all invalid queries.

tcm-marcel commented 7 years ago

I may was inaccurate, it's not a syntax problem. So your PR hopefully will solve these ones. UPDATE: sorry, did not read correctly. If you are working on syntax problems, then this one here is another issue.

apavlo commented 7 years ago

Actually, I think that @tcm-marcel's example should be caught in the planner. The parser/binder doesn't "know" the number of columns that the table has.

Please confirm with @hzxa21 @chenboy @luochenUmich

hzxa21 commented 7 years ago

The binder can know the number of columns in the table by requesting the catalog. I think there are no major differences on doing the validity checks in binder or optimizer. I feel like we should do it in the binder because it is where the query first contacts the catalog.