Closed Samyak2 closed 2 years ago
For the scope of this project I think we only need a basic evaluator, where the most important operators might be those that produce boolean values in the end, i.e. A > B, A = B
etc such that we can effectively test the executor.
I know the evaluation of boolean expression is still on the way.
A questions comes into my mind.
TRUE
andFALSE
are essentially1
and0
, or I should say any non-zero and zero.Hence, the following queries are equivalent. Will we even cast integer / float as boolean value implicitly? Or, we only accept boolean value / expression only, else throwing error / panic?
select * from tbl where TRUE and FALSE; select * from tbl where 1 and 0; select * from tbl where -1 and 0; select * from tbl where 1.01 and 0.0;
I was thinking of supporting booleans (or expressions that evaluate to booleans) only. Having "truthy" and "falsy" values has been a common footgun for me in JavaScript and Python, so I wouldn't want to emulate that. But of course, we aren't making a new language here, we should emulate what other DBs do.
The MySQL documentation doesn't specify whether it only supports booleans - I will have to test it out. The PostgreSQL docs says any expression that evaluates to a result of type boolean
- so it is consistent with my idea.
The comparison and equality operators are currently all derived, but I think we should impl it ourselves and return explicit errors in case of non-matching types.
Note: the logical AND
and OR
will be implement in the Expr
executor as short circuiting needs to be implemented for it.
Actually, I think the derived implementations are good enough for now.
Cool
Do you think it's good to merge. Or do you still have something to add?
Not yet. I'm working on the tests, but two things are yet to be implemented - ColumnRef
and Function
.
The problem with ColumnRef
is that it needs to be in the context of a Row
to return a Value
. I think it would be best to handle both of these in the overall VM executor instead of this - since it needs a Row
as well the table schema to know which index in the Row
it needs to return. What do you think?
Not yet. I'm working on the tests, but two things are yet to be implemented -
ColumnRef
andFunction
.The problem with
ColumnRef
is that it needs to be in the context of aRow
to return aValue
. I think it would be best to handle both of these in the overall VM executor instead of this - since it needs aRow
as well the table schema to know which index in theRow
it needs to return. What do you think?
I'll put Function
aside and focus on ColumnRef
first.
I think Expr::execute()
method should be responsible to evaluate a expression and return a single Value
. (this is the current design) The "context" (db table & row) comes into play when we're evaluating ColumnRef
.
ColumnRef
should contains the RegisterIndex
where the data of a table / projection located.Expr::execute()
method should also takes a row_id
indicating the current row. This allows ColumnRef
expression to be evaluated as a single Value
(consider a table with rows and columns; the column is given in the ColumnRef
and we got the row_id
as well).This way, VM iterate and evaluate expressions row by row then append result to the destination register.
Does this make sense? @Samyak2 @tyt2y3
Yeah unless I missed some details, we should pass the row and all the row's data to the evaluator, along with the column metadata.
Such that when we say we want 'id = 1' it will just get the value from the row, and evaluate the boolean expression.
Yes we can forget about function for now.
I think the Function
is not part of Expr
. It should be a seperate struct / enum where it contains two pieces of data: the function operator (e.g. MAX
, MIN
, COUNT
, IS_NULL
) and the expression (where it can be evaluated as Value
). However, I'm not sure to represent this on the intermediate code level. A special kind of projection?
I don't think function belongs to the instruction set. All the IC care is execute a lambda and get some value.
The expression evaluation doesn't care about function either, it's just calling a lambda and get some Value back.
It could be a rabbit hole if we dig down too far. Let's just do something passable for now and revisit expression evaluation later . The IC and VM is what matters more for now because I basically need to see a POC of the execution pipeline, from SQL parsing to query results. And iterate from there filling in the details.
I have implemented a basic execution of ColumnRef
for now. It needs more tests which I will wrap up tomorrow.
I'm working on the overall executor in a different branch. @tyt2y3 can we merge this?
Nice
PR Info
Adds
Fixes
Breaking Changes
Changes