ohua-dev / ohuac

A standalone compiler for ohua algorithms
Eclipse Public License 1.0

Get joins working correctly #19

Closed JustusAdam closed 3 years ago

JustusAdam commented 4 years ago

I just added an extremely dirty hack to the noria integration. It has to be removed in the future, but it does seem to work for now. I am documenting it here so we can clean it up at some point.

Background

In the noria integration we insert a join before any node with more than one parent. Each node carries some context columns on which we can join. For example, node A might have the output columns c1, c2 and the context (key) column k1, so the full output of the node is k1, c1, c2. When we join outputs, we join on the key. So given another node B with, say, k1, c3, the join works on k1 and should output k1, c1, c2, c3. The k1 needs to stay in the output in case we want to join more onto it later.

However, for some reason, when I do A join B the noria join node outputs k1, c1, c2, None, c3. I think the None is actually the join trying to output k1 again. Because I suspect the None has something to do with k1, I now record the output columns as k1, c1, c2, k1, c3. This seems to run correctly, because subsequent nodes just ignore the second k1 column, and in theory doing it like this should also do the expected thing for multi-keys.
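To make the column layouts above concrete, here is a small, purely illustrative Python model. It does not use noria or ohuac. The function names are hypothetical, and the "first match wins" lookup is only an assumption about why subsequent nodes ignore the second k1.

```python
def expected_join_columns(left, right, keys):
    """Layout one would expect: left columns, then right columns minus the keys."""
    return list(left) + [c for c in right if c not in keys]


def recorded_join_columns(left, right):
    """Workaround layout: record every right-hand column, repeated key included."""
    return list(left) + list(right)


def resolve(columns, name):
    """Look up a column index by name. list.index returns the first match,
    so a repeated key column further right is never selected (assumed behaviour)."""
    return columns.index(name)


a = ["k1", "c1", "c2"]  # node A: key k1, outputs c1, c2
b = ["k1", "c3"]        # node B: key k1, output c3

expected = expected_join_columns(a, b, keys={"k1"})
recorded = recorded_join_columns(a, b)

print(expected)  # ['k1', 'c1', 'c2', 'c3']
print(recorded)  # ['k1', 'c1', 'c2', 'k1', 'c3']
```

Under this model the recorded layout has the same width as the observed k1, c1, c2, None, c3 row, and name-based lookups still land on the first k1. With a multi-column key such as k1, k2 the same function would record both key columns twice.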

Fix

We need to find out exactly which columns a noria join produces, and in what order. We may even have to wait until it can join on the same table, because currently that can only be achieved by cheating.

We may also find out that this is actually how the join is meant to work, in which case we can just make the hack more robust.