ABresting opened this issue 7 months ago
@jm-clius @waku-org/waku
Also, I would like to say that we should aim for a solution geared towards specific apps.
I believe that apps using TWN will naturally form sync groups among themselves, meaning an app would have a couple of TWN nodes but only sync the messages it cares about.
Supporting that should be our first priority IMO.
Only then should we think about a general store provider that stores all messages, since that is the more general use case.
> Also, I would like to say that we should aim for a solution geared towards specific apps. I believe that apps using TWN will naturally form sync groups among themselves, meaning an app would have a couple of TWN nodes but only sync the messages it cares about.
Oh yes, 100%! That's also what I have gathered from the way Status functions, the XMTP implementation, the Tribes requirements, and a nice brainstorming session with @chaitanyaprem!
> Supporting that should be our first priority IMO. Only then should we think about a general store provider that stores all messages, since that is the more general use case.
I am wondering if we should let clients provide a configuration parameter that builds a Prolly tree (or some other sync mechanism) per content topic, since most client nodes will only be interested in the content topics that serve their apps.
> I am wondering if we should let clients provide a configuration parameter that builds a Prolly tree (or some other sync mechanism) per content topic, since most client nodes will only be interested in the content topics that serve their apps.
If the sync mechanism is Prolly tree based, a sync request becomes a set diff. The diff of the two local trees becomes the list of message hashes to send to the other node; it's beautifully symmetric!
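A minimal sketch of that symmetry, with the trees stood in for by plain Python sets of message hashes (the function name and types are illustrative, not part of any Waku API):

```python
# Sketch: once both nodes can compare their tree contents, deciding what to
# exchange is a symmetric set difference over message hashes.

def sync_diff(local_hashes: set, remote_hashes: set) -> tuple:
    """Return (hashes to send to the peer, hashes to request from the peer)."""
    to_send = local_hashes - remote_hashes      # the peer is missing these
    to_request = remote_hashes - local_hashes   # we are missing these
    return to_send, to_request

# Symmetry: node B computes the mirror image of node A's result.
a = {"h1", "h2", "h3"}
b = {"h2", "h4"}
send_a, req_a = sync_diff(a, b)
send_b, req_b = sync_diff(b, a)
assert send_a == req_b and req_a == send_b
```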
Thanks for opening up this issue, @ABresting!
A couple of comments:
> Sync request can be triggered
At some point we may want to periodically sync while the node is online too, ensuring less fragmented histories due to unnoticed down periods or other short lapses in good connectivity.
> Sync request is passive
This seems fine for now as a simple evolution of Store requests and responses. If we build a sync mechanism that syncs periodically, though, we may want to take inspiration from GossipSub's IHAVE and IWANT mechanisms, where nodes periodically advertise which messages they HAVE and others request what they WANT (fewer round trips).
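A toy round of that exchange might look as follows; the `Node` class and its method names are hypothetical, purely to illustrate the HAVE/WANT flow, not the actual GossipSub or Waku API:

```python
# Hypothetical sketch of a GossipSub-style IHAVE/IWANT round between two
# store nodes. One advertisement plus one request replaces repeated
# query/response round trips.

class Node:
    def __init__(self, store: dict):
        self.store = store  # messageHash -> full message bytes

    def ihave(self) -> set:
        # Periodically advertise the hashes we hold.
        return set(self.store)

    def iwant(self, advertised: set) -> set:
        # Request only the hashes we are missing.
        return advertised - set(self.store)

    def serve(self, wanted: set) -> dict:
        # Return the full contents for the requested hashes.
        return {h: self.store[h] for h in wanted if h in self.store}

a = Node({"h1": b"m1", "h2": b"m2"})
b = Node({"h2": b"m2", "h3": b"m3"})

# One round trip: A advertises, B requests what it lacks, A serves, B ingests.
wanted = b.iwant(a.ihave())
b.store.update(a.serve(wanted))
assert set(b.store) == {"h1", "h2", "h3"}
```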
> outdated client...when receives a Sync request
In the simplest version of this protocol, I envision it could simply be a better Store protocol, with HistoryQuery asking either for a list of message hashes or for the full contents belonging to those hashes. In this case, if the other node doesn't support this version of the Store protocol, libp2p would fail to establish a protocol stream (dial failure). This happens before the service side can respond with an error code within the protocol.
One thing that is important for the baseline understanding is to consider the layered architecture here and where the synchronisation mechanism lives:
Option 1: naive synchronisation within the Store protocol. The Store protocol itself can evolve to exchange information about keys (message hashes) and full message contents. However, the store node would still need to be able to determine which hashes it is missing and request the full contents for these from other store nodes. In the simplest, but most inefficient, version of such an architecture, the Store node would have to query its own archive backend (the key-value store, which is likely a DB such as Postgres) for a full list of keys and compare this with the full lists of keys it receives from other nodes (which are doing the same inefficient DB queries).
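The inefficiency is easy to see in a small sketch, using an in-memory SQLite table as a stand-in for the real archive backend (the table and column names here are assumed): every sync round forces a full scan of the key space on both sides.

```python
# Sketch of the naive approach: each sync, both nodes dump the full key set
# from their archive backend and compare. Cost grows linearly with history
# size on every single sync round.
import sqlite3  # stand-in for the real archive backend (e.g. Postgres)

def all_hashes(db: sqlite3.Connection) -> set:
    # O(n) full scan of the key space, repeated on every sync.
    return {row[0] for row in db.execute("SELECT messageHash FROM messages")}

def missing_from(db: sqlite3.Connection, remote_hashes: set) -> set:
    # Hashes the peer holds that our archive lacks.
    return remote_hashes - all_hashes(db)

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE messages (messageHash TEXT PRIMARY KEY)")
db.executemany("INSERT INTO messages VALUES (?)", [("h1",), ("h2",)])
assert missing_from(db, {"h1", "h2", "h3"}) == {"h3"}
```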
Option 2: the Store protocol with an efficient middle layer. If we introduce an efficient "middle layer" between the DB/archive backend and the Store protocol, we could vastly improve the efficiency of computing a "diff" between the indexes/message hashes known to both nodes. The Store protocol would still be responsible for communicating which message hashes it knows about, comparing them to those known by other nodes and finding what's missing, but now with an efficient way to compare its own history with those of other nodes. One such method is building efficient search trees, such as the Prolly trees described here: https://docs.canvas.xyz/blog/2023-05-04-merklizing-the-key-value-store.html The archive would remain the persistence layer underlying all of this - any DB/storage/persistence technology that is compatible with key-value storage.
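A toy illustration of the core trick, assuming content-defined chunk boundaries as in Prolly trees (this is a single-level sketch, not the real multi-level tree): because a boundary depends only on the key itself, the same keys always chunk the same way on both nodes, and identical chunks hash identically, so whole key ranges are skipped during the diff.

```python
# Single-level sketch of Prolly-tree-style chunking and diffing.
import hashlib

def is_boundary(key: str, p: int = 4) -> bool:
    # Content-defined boundary: on average one key in p ends a chunk,
    # independent of its neighbours, so chunking is order-insensitive.
    return hashlib.sha256(key.encode()).digest()[0] < 256 // p

def chunk(keys: list) -> list:
    chunks, cur = [], []
    for k in sorted(keys):
        cur.append(k)
        if is_boundary(k):
            chunks.append(tuple(cur))
            cur = []
    if cur:
        chunks.append(tuple(cur))
    return chunks

def chunk_hash(c: tuple) -> bytes:
    return hashlib.sha256("".join(c).encode()).digest()

def diff_keys(a_keys: list, b_keys: list) -> set:
    """Keys in a but not in b; identical chunks are skipped by hash."""
    b_hashes = {chunk_hash(c) for c in chunk(b_keys)}
    b_all = set(b_keys)
    missing = set()
    for c in chunk(a_keys):
        if chunk_hash(c) in b_hashes:
            continue  # whole range identical: no per-key comparison needed
        missing |= set(c) - b_all
    return missing
```

In a full Prolly tree the same boundary rule is applied recursively to build upper levels, so two large, mostly identical histories diff in time proportional to the size of the difference rather than the size of the history.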
Option 3: a separate synchronisation protocol between Store nodes. With this option, we would not change the Store protocol - it would remain a way for clients to query the history stored in Store service nodes according to a set of filter criteria. However, the Store nodes themselves would build on a synchronisation mechanism with its own protocol for synchronising between nodes (e.g. GossipLog, based on Prolly trees). The archive would remain the persistence layer where the synchronised messages are inserted and from which they are retrieved when queried.
Option 4: a synchronised persistence layer. In this option the Store protocol would not have to be modified, and we would not need to introduce any "middleware" to effect synchronisation, messageHash exchange, etc. Instead, the Store protocol would assume that it builds on top of a persistence layer that handles synchronisation between instances. For example, all Store nodes could persist and query messages from a Codex-backed distributed storage for global history with reliability and redundancy guarantees. A simpler analogy would be if all Store nodes somehow had access to the same PostgreSQL instance and simply wrote to and queried from there.
> If the sync mechanism is Prolly tree based, a sync request becomes a set diff. The diff of the two local trees becomes the list of message hashes to send to the other node; it's beautifully symmetric!
I like this!
Weekly Update
achieved: clarity on the Store sync protocol; nearly finalized the research document (creating visual images/diagrams) explaining the architecture and the issues with potential approaches.
next: prepare the workshop for Store sync and publish the research document on Prolly trees with the Waku use case.
Weekly Update
achieved: baseline clarity on how the Sync protocol will work and what it will do; a supplementary document on how the Waku node's parts interact.
next: a workshop with the team to reach agreement on what the Sync store will look like.
Weekly Update
achieved: PoC of the Prolly tree (fixed a bug); insertion and deletion of data into it.
next: a technical write-up about the Prolly tree PoC in the issue, further testing, and generating operational data such as memory consumption using RLN specs.
Weekly Update
achieved: Prolly tree PoC feature complete; Postgres retention policy PR; groundwork started on the diff protocol.
next: the pending technical write-up about the Prolly tree PoC in the issue, the diff protocol, and generating operational data such as memory consumption using RLN specs.
Weekly Update
achieved: one day of work this week due to time off; Nim implementation of Prolly trees.
next: diff protocol discussion, discussion of the Sync mechanism's on-wire query protocol, and generating operational data such as memory consumption using RLN specs.
Sync store is a vital feature of the Waku protocol, whereby a node synchronises with peer nodes in the hope of retrieving messages it missed while offline or otherwise inactive. Every message in the Waku protocol can be uniquely identified by its messageHash, which is a DB attribute. Using the messageHash, it becomes easy for a node to determine whether its store already holds a given message. The following are the potential features of the Waku store sync. There are also some open questions, such as:
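As a rough sketch of how such a deterministic messageHash can be derived (the exact field set and byte encodings below are assumptions for illustration; the normative definition lives in the Waku message spec):

```python
# Sketch: hash the identifying attributes of a message so that every node
# derives the same key for the same message. Field choice and the 8-byte
# big-endian timestamp encoding are assumptions, not the normative spec.
import hashlib

def message_hash(pubsub_topic: str, payload: bytes,
                 content_topic: str, timestamp_ns: int) -> str:
    h = hashlib.sha256()
    h.update(pubsub_topic.encode())
    h.update(payload)
    h.update(content_topic.encode())
    h.update(timestamp_ns.to_bytes(8, "big"))  # assumed encoding
    return h.hexdigest()
```

Because the hash is a pure function of the message's attributes, two store nodes that both hold a message agree on its key without any coordination, which is what makes hash-based set reconciliation possible.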
Eventually, once the operating details of the Prolly tree-based synchronisation mechanism are established, integrating the synchronisation layer into the Waku protocol will require careful consideration: a deep understanding of its operational nuances and a thoughtful approach to its implementation. #73
Topics such as incentives to serve sync requests are kept out of this document's scope.