Anna-Team / AnnaDB

Developer-first database
Apache License 2.0

Collection Context Lost when joined #14

Open amaster507 opened 2 years ago

amaster507 commented 2 years ago

There are times when the collection context would be beneficial: knowing not just the data that is linked to, but also that it is linked data from another collection. How does one know they are updating data from a different collection? This assumes that when data is inserted with links, the linked data nodes are not copied but are just "linked".

I propose some kind of structure where you keep the context of the linked collection in the tree.

Let me know if you need some examples of my thoughts and reasoning here.

amaster507 commented 2 years ago

Given a query, and no change of state, a developer should be able to predict how the data will change when updated.

image

If I have the current data above and I want to run the following mutation, how do you suspect the data will change?

image

Without the context of where the items are linked, you will guess wrong 100% of the time. Really stop and think about what the first query will return next before scrolling down...

There were no changes to the state of the data between the query, the update, and the following query:

image

Did you expect it to look like this? There was no way to tell from the first query that this item was linked in 2 other places. And with the exact same raw data, you have no other way to identify this.

amaster507 commented 2 years ago

This isn't perfect, but it is my idea for a way to solve this. The last query could return:

result:ok[
    response{
        s|data|:objects{
            issue14|83f22bc2-f133-4615-8164-feee16643e04|:m{
                s|issues|:issue14|eef25b1e-5e02-45b3-bff3-706238a66b7f|v[
                    n|1|,
                    n|2|,
                    n|3|,
                ]|,
            },
            issue14|163fdcbd-f5ca-4f83-9518-691affb82a72|:issue14|eef25b1e-5e02-45b3-bff3-706238a66b7f|v[
                n|1|,
                n|2|,
                n|3|,
            ]|,
            issue14|ac5d6908-fb8f-4f2b-bcb7-469eb1faacb1|:v[
                n|1|,
                n|2|,
            ],
            issue14|eef25b1e-5e02-45b3-bff3-706238a66b7f|:v[
                n|1|,
                n|2|,
                n|3|,
            ],
            issue14|c9424a1e-140d-44da-b68b-362b6f4d95e9|:m{
                s|issues|:v[
                    n|1|,
                    n|2|,
                ],
            },
        },
        s|meta|:find_meta{
            s|count|:n|5|,
        },
    },
];

Notice that where this item is linked, it appears as %collection%|%id%|%primitive%|

This breaks the normal pattern of only 2 pipes, but I think the value it brings will help remedy this situation and give context to the originating collection as well. Having the context of the collection and id will be required to implement a more performant caching mechanism. Once a cache reaches an item reference, it references a new item in the cache. This item only has to be cached once instead of multiple times. This will be important for working with wide and deep graph patterns of data.
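To make the caching idea concrete, here is a minimal sketch, assuming a client-side cache keyed by (collection, id). The names `Ref`, `NormalizedCache`, `put`, and `get` are hypothetical and are not part of AnnaDB or its drivers; the sketch also assumes links never form a cycle.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ref:
    """Hypothetical pointer to an item that lives elsewhere in the cache."""
    collection: str
    item_id: str


class NormalizedCache:
    """Stores each item exactly once, keyed by (collection, item_id)."""

    def __init__(self):
        self._items = {}

    def put(self, collection, item_id, value):
        self._items[(collection, item_id)] = value

    def get(self, collection, item_id):
        """Return the item with every Ref replaced by the value it points to."""
        return self._resolve(self._items[(collection, item_id)])

    def _resolve(self, value):
        # Assumption: links are acyclic, otherwise this recursion never ends.
        if isinstance(value, Ref):
            return self._resolve(self._items[(value.collection, value.item_id)])
        if isinstance(value, dict):
            return {key: self._resolve(item) for key, item in value.items()}
        if isinstance(value, list):
            return [self._resolve(item) for item in value]
        return value


cache = NormalizedCache()
shared = Ref("issue14", "eef25b1e-5e02-45b3-bff3-706238a66b7f")
cache.put("issue14", shared.item_id, [1, 2, 3])
cache.put("issue14", "83f22bc2-f133-4615-8164-feee16643e04", {"issues": shared})
cache.put("issue14", "163fdcbd-f5ca-4f83-9518-691affb82a72", shared)

# One write to the shared item is visible through every reference to it.
cache.put("issue14", shared.item_id, [1, 2, 3, 4])
```

Because the linked item is stored under its own key, the update touches a single cache entry, which is only possible when the response says which collection and id a linked value came from.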

amaster507 commented 2 years ago

Been working with Tyson a bit more working through the decoder, and I think this is the best theory yet on how to keep collection context. Introduce the separator ^. And use this for collection context.

So in this theory, items would have two syntaxes:

%collection%|%id%| Used for reference only in queries.

%collection%^%id%|%value%| Used in responses that include the values of the collection.

In theory the syntax for context would behave like any other primitive. Decoding the collection context would need just one extra step if you want to parse out the actual collection and id values: splitting on the ^.

Adding or removing the context would then be a simple matter of (un)wrapping the item context around the values.
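The extra decoding step and the (un)wrapping could look roughly like the sketch below. It works on plain strings and handles a single level of wrapping only; `split_context`, `wrap`, and `unwrap` are hypothetical helper names, not functions from the AnnaDB decoder.

```python
def split_context(prefix):
    """Split 'collection^id' into (collection, id).

    A prefix without '^' (for example 'n' or 'v') carries no collection
    context, so it is returned as (prefix, None).
    """
    collection, separator, item_id = prefix.partition("^")
    if not separator:
        return prefix, None
    return collection, item_id


def wrap(collection, item_id, value):
    """Wrap an already-encoded value in its collection context."""
    return f"{collection}^{item_id}|{value}|"


def unwrap(item):
    """Split a context-wrapped item into (collection, id, value)."""
    prefix, separator, rest = item.partition("|")
    if not separator or not rest.endswith("|"):
        raise ValueError(f"not a context-wrapped item: {item!r}")
    collection, item_id = split_context(prefix)
    if item_id is None:
        raise ValueError(f"prefix has no collection context: {prefix!r}")
    # Drop the closing pipe that belongs to the context wrapper.
    return collection, item_id, rest[:-1]


wrapped = wrap("issue14", "ac5d6908-fb8f-4f2b-bcb7-469eb1faacb1", "v[n|1|,n|2|,]")
collection, item_id, value = unwrap(wrapped)
```

Since the context prefix contains no pipe of its own, the first pipe in the item always marks where the prefix ends, so the ordinary prefix/value split keeps working and the ^ split is the only addition.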

I also suggest returning objects as a vector instead of a map with this new context, so the example response above would then look like:

result:ok[
    response{
        s|data|:objects[
            issue14^83f22bc2-f133-4615-8164-feee16643e04|m{
                s|issues|:issue14^eef25b1e-5e02-45b3-bff3-706238a66b7f|v[
                    n|1|,
                    n|2|,
                    n|3|,
                ]|,
            }|,
            issue14^163fdcbd-f5ca-4f83-9518-691affb82a72|issue14^eef25b1e-5e02-45b3-bff3-706238a66b7f|v[
                n|1|,
                n|2|,
                n|3|,
            ]|,
            issue14^ac5d6908-fb8f-4f2b-bcb7-469eb1faacb1|v[
                n|1|,
                n|2|,
            ]|,
            issue14^eef25b1e-5e02-45b3-bff3-706238a66b7f|v[
                n|1|,
                n|2|,
                n|3|,
            ]|,
            issue14^c9424a1e-140d-44da-b68b-362b6f4d95e9|m{
                s|issues|:v[
                    n|1|,
                    n|2|,
                ],
            }|,
        ],
        s|meta|:find_meta{
            s|count|:n|5|,
        },
    },
];

Items with context then look the same at every level in the response.

With this you will probably need to introduce some reserved collection names, such as n, b, s, uts, v, m, null.
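A check along these lines could reject clashing names when a collection is created. This is only a sketch: `RESERVED_PREFIXES` holds just the names listed above and may be incomplete, `validate_collection_name` is a hypothetical helper, and rejecting the ^ and | delimiters inside a name is an added assumption.

```python
# Prefixes taken from the list above; the real set may need more entries.
RESERVED_PREFIXES = frozenset({"n", "b", "s", "uts", "v", "m", "null"})


def validate_collection_name(name):
    """Return the name unchanged, or raise ValueError if it cannot be used."""
    if name in RESERVED_PREFIXES:
        raise ValueError(f"{name!r} is reserved for a primitive or container prefix")
    # Assumption: the delimiters themselves must not appear in a collection name.
    if "^" in name or "|" in name:
        raise ValueError(f"{name!r} contains a delimiter character")
    return name
```

With such a check in place, a prefix like v can never be mistaken for a collection called v when a response is decoded.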

roman-right commented 2 years ago

Yes, a separate delimiter in the prefix for sub-typing sounds nice!