Closed asdine closed 4 years ago
[1] http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/ra.pdf
Slide 4 says that for union, intersection and difference both operands should have the same schema. What is your thought on this? How could we handle this since we could have schema-less relations.
This is a good point, PR #87 won't add union, intersection, and difference, so we don't need to ask ourselves this question for the moment. This will be taken care of during the next iterations. I will probably write a proposal when we will reach that point
Currently, there is one separate type for each kind of statement, containing the entire logic of that statement. The parser only needs to determine which type to instantiate and give all the necessary information to that object so that it can be run.
There are several problems with this approach:
Having a better designed base would allow us to:
Proposal
To solve that, we can build an intermediary representation of a query in the form of a tree. A tree represents a stream of documents, from an input (table, index, etc) to an output (result stream, deletion from the table, update, etc.).
Each node of the tree represents an operation that will transform the stream (filter documents, project fields, renaming fields, etc.). These operations will follow some principles of relational algebra [1] [2].
With this representation, the difference between a SELECT and an UPDATE is only about changing one node of the tree.
The tree would be built by the parser before being passed to the execution phase.
Once we have this tree, we can optimize it using Rule-based optimization: Each rule is a function that takes one tree and returns a new tree. Rule-based optimization can be summarized like this:
Each rule does only one optimization:
Once the tree is optimized, we can transform it into an actual
document.Stream
that can be executed.Some reads about relational algebra: [1] http://infolab.stanford.edu/~ullman/fcdb/aut07/slides/ra.pdf [2] https://www.geeksforgeeks.org/introduction-of-relational-algebra-in-dbms/