gritzko / ron

(dated, see the site) Replicated Object Notation, a distributed live data format, golang/ragel lib
http://replicated.cc
Apache License 2.0

Op routing #34

Open cblp opened 5 years ago

cblp commented 5 years ago

My application has several nested data structures.

struct Contact : LWW {
    RGA<Char> name;
}

struct Note : LWW {
    RGA<Char> text;
}

Types Contact and Note form their own collections, i.e. objects of type Note may belong only to the Note collection and not to any other.

We don't have such a thing as RGA-per-field, so we have to store RGA fields as separate objects inside parent frames (chunks?) like this:

*lww #1 @1 :0  'Contact' 'name' >2 !

*rga #2 @3 :0  !
        @4     'c' ,
*lww #5 @5 :0  'Note' 'text' >6 !

*rga #6 @7 :0  !
        @8     'd' ,

Consider an op adding a character to the field Note.text of the note #5.

*rga #6 @9 :8  'e' ;

How do we route this op to the collection Note and the object #5?

Bad solution: keep an index from sub-object ids to their parent objects and collections. This is wrong because it requires extra work to compute information that has already been computed.
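For illustration only, the index that this rejected approach would need can be sketched roughly as below. The names `Route`, `LWW_OBJECTS`, `build_index` and `route` are hypothetical and not part of any RON library, and the reading of the LWW payloads above as (collection, field, reference) is an assumption.

```python
from typing import NamedTuple


class Route(NamedTuple):
    collection: str  # e.g. 'Note'
    parent_id: str   # id of the enclosing LWW object, e.g. '5'
    field: str       # field that holds the reference, e.g. 'text'


# Hypothetical in-memory view of the two LWW objects from the frames above:
# object id -> (collection, {field name: id of the referenced sub-object})
LWW_OBJECTS = {
    "1": ("Contact", {"name": "2"}),
    "5": ("Note", {"text": "6"}),
}


def build_index(objects):
    """Invert the parent -> child references into child id -> Route."""
    index = {}
    for parent_id, (collection, fields) in objects.items():
        for field, child_id in fields.items():
            index[child_id] = Route(collection, parent_id, field)
    return index


INDEX = build_index(LWW_OBJECTS)


def route(op_object_id):
    """Look up where an op addressed to a sub-object has to be delivered."""
    return INDEX[op_object_id]
```

Here `route("6")` finds the Note collection and parent object 5 for the op on RGA #6, but the index has to be kept in sync with every reference change, which duplicates information the sender already has.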

We need a way to supply raw ops (and reduced chunks, maybe, too) with routing information.

cblp commented 5 years ago

Possible solution: wrap raw ops in special raw chunks with paths in the payload.

*raw #0 @0 :0  'Note' >5 !
*rga #6 @9 :8  'e' ;

Drawback: what if reduced chunks need routing too?
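A minimal sketch of this wrapping, treating frames as plain text rather than using a real RON parser. The functions `wrap_raw_op` and `unwrap_raw_op` are hypothetical helpers, and the header layout is copied from the example above, not from a specification.

```python
import re

# Matches the routing header used in the example above, e.g.
#   *raw #0 @0 :0  'Note' >5 !
HEADER_RE = re.compile(r"^\*raw #0 @0 :0\s+'([^']*)'\s+>(\S+)\s+!$")


def wrap_raw_op(op_line, collection, object_id):
    """Prepend a *raw header carrying the routing path to a raw op."""
    header = f"*raw #0 @0 :0  '{collection}' >{object_id} !"
    return f"{header}\n{op_line}"


def unwrap_raw_op(frame):
    """Split a wrapped frame back into (collection, object id, raw op)."""
    header, op_line = frame.split("\n", 1)
    match = HEADER_RE.match(header)
    if match is None:
        raise ValueError("not a routing-wrapped raw op")
    collection, object_id = match.groups()
    return collection, object_id, op_line


frame = wrap_raw_op("*rga #6 @9 :8  'e' ;", "Note", "5")
```

The receiver can call `unwrap_raw_op(frame)` to learn the target collection and object without consulting any local index; the wrapped op itself is passed through unchanged.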

cblp commented 5 years ago

Possible solution with both raw ops and reduced chunks

Raw op: wrap in raw chunks with paths in the payload

*raw #0 @0 :0  'Note' >5 !
*rga #6 @9 :8  'e' ,

Reduced chunk: wrap in chunk

The header op becomes the first op in the body.

*chunk #0 @0 :0  'Note' >5 !
*rga   #6 @9 :8  'e' ,

Query chunk: wrap in query chunk

The header op becomes the first op in the body.

*chunk #0 @0 :0  'Note' >5 ?
*rga   #6 @9 :8  'e' ,

cblp commented 5 years ago

Solution with *meta ops

Raw op: wrap in a patch chunk first

Every op is a patch, hence any raw op may be converted into a reduced patch chunk.

*rga #6 @9 :8  'e' ;

⬇

*rga #6 @9 :9  !
           :8  'e'

Reduced chunk: add *meta sub-ops

*rga #6 @9 :9  !
           :8  'e'

⬇

*rga  #6 @9 :9           !
            :8           'e'
*meta    @0 :collection  'Note'
            :database    'staging'

Description

Synopsis:

Routing metadata may be dropped when reducing to a state chunk.

Any other metadata may be added as well, including application-specific or language-specific annotations.
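As a rough sketch of how a receiver might use and then drop the `*meta` sub-ops, the chunk above can be modelled as a list of already parsed ops. The `Op` tuple, the `CHUNK` value and both functions are hypothetical; in particular, the assumption that the abbreviated `*meta` ops inherit object #6 is mine.

```python
from typing import NamedTuple


class Op(NamedTuple):
    kind: str     # RON type, e.g. 'rga' or 'meta'
    obj: str      # object id
    event: str    # event id
    ref: str      # reference / location; for *meta ops, the metadata key
    payload: str  # values and terminator, kept as raw text


# The patch chunk with *meta sub-ops from the example above.
CHUNK = [
    Op("rga", "6", "9", "9", "!"),
    Op("rga", "6", "9", "8", "'e'"),
    Op("meta", "6", "0", "collection", "'Note'"),
    Op("meta", "6", "0", "database", "'staging'"),
]


def routing_info(chunk):
    """Collect the *meta sub-ops into a key -> value dictionary."""
    return {op.ref: op.payload.strip("'") for op in chunk if op.kind == "meta"}


def drop_meta(chunk):
    """Remove *meta sub-ops, e.g. before reducing to a state chunk."""
    return [op for op in chunk if op.kind != "meta"]
```

A router would read `routing_info(CHUNK)` to pick the collection and database, then hand `drop_meta(CHUNK)` to the reducer, so the stored state carries no routing data.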

lambdafu commented 5 years ago

Sorry if I am completely out of line, but isn't it a fact of RON that every object has a unique identifier, that the LWW field contains a UUID reference to the RGA, and that the RGA is uniquely identified by its UUID? Why do you need routing information in such a system, where all objects are uniquely identified?

cblp commented 5 years ago

@lambdafu Routing, not identifying.

Consider an example. One node has many collections with many objects with many sub-objects. Another node has them too. Now you need to send an op from one sub-object to another. How will you find a place to put this op?

lambdafu commented 5 years ago

I feel out of my depth, but isn't a simple way to just have a per-node dictionary that maps identifiers to objects? I admit that my picture of how large data models and application design are supposed to work in swarm is very sketchy at best.

cblp commented 5 years ago

@lambdafu Ok, we divide the database into many nodes, a node per object. Then do we need to establish a frame stream to each object?

Hm, sounds plausible if we use some upper-level routing, like an HTTP stream with each query having its own URL.

cblp commented 5 years ago

I think @gritzko meant something like this in https://gritzko.gitbooks.io/swarm-the-protocol/content/spec.html

lambdafu commented 5 years ago

I didn't even realize you were talking about network routing, I thought this was all in-process. I have not given any thought to sharding, clustering, etc. My model is a simple one with peers and clients, where peers replicate everything and are fully connected, and clients subscribe to a subset of objects in one peer. Everything else is way over my head :smile_cat: