akka / akka

Build highly concurrent, distributed, and resilient message-driven applications on the JVM
https://akka.io

CRDT passivation #27714

Open jroper opened 5 years ago

jroper commented 5 years ago

Here are two ideas for increasing the number of top-level CRDTs in ddata. Both are based on the idea of an idle passivation timeout: CRDTs would be passivated if they had not been interacted with for a given time period, and reactivated when they are next interacted with. A timestamp tracking mechanism similar to the one proposed in #27683 would be used.
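To make the idea concrete, here is a minimal sketch of the passivation decision, assuming per-key last-access timestamps are tracked as a plain `Map` (the names `IdlePassivation`, `keysToPassivate`, and the map shape are hypothetical, not existing Akka API):

```scala
// Hypothetical sketch: given last-access timestamps for top-level CRDT keys,
// return the keys that have been idle longer than the timeout and are
// therefore candidates for passivation.
object IdlePassivation {
  def keysToPassivate(lastAccess: Map[String, Long], // key -> last interaction (millis)
                      nowMillis: Long,
                      idleTimeoutMillis: Long): Set[String] =
    lastAccess.collect {
      case (key, ts) if nowMillis - ts >= idleTimeoutMillis => key
    }.toSet
}
```

A real implementation would piggyback this on the timestamp tracking from #27683 rather than keep a separate map.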

The first is that idle entities should be gossiped less frequently. There can be two gossip periods, one for active CRDTs, one for idle. The first period might be 2 seconds, while the second might be, for example, 2 minutes. This could be used to save on latent bandwidth and CPU consumed by gossiping. It would also result in active CRDTs being prioritised in bringing a new node up to date. I don't imagine this would be hard to implement. This doesn't reduce the memory requirements for CRDTs, but it does give a lot more room on the network side of things when you have a very large number of CRDTs.
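The two-tier gossip period could be as simple as picking an interval per key based on its idle state. A sketch, using the example periods from above (the object and method names are hypothetical):

```scala
import scala.concurrent.duration._

// Hypothetical sketch: choose the gossip interval for a CRDT based on
// whether it has been idle longer than the configured threshold.
object GossipInterval {
  val activeInterval: FiniteDuration = 2.seconds // example period for active CRDTs
  val idleInterval: FiniteDuration   = 2.minutes // example period for idle CRDTs

  def intervalFor(lastAccessMillis: Long,
                  nowMillis: Long,
                  idleAfter: FiniteDuration): FiniteDuration =
    if (nowMillis - lastAccessMillis >= idleAfter.toMillis) idleInterval
    else activeInterval
}
```

Active CRDTs would then naturally be gossiped first when bringing a new node up to date, since they come up for gossip far more often.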

The second is that idle entities could be passivated to disk/some durable store. This would require enabling durability. Basically, the replicator would simply drop references to idle CRDTs, then fall back to a lookup from the store on every miss; misses would include local get/update requests as well as incoming gossip status messages and deltas. This would completely remove the memory and latent network overhead of idle CRDTs. By itself, it would not reduce the cost of adding a new node (it would increase it, because that node would require all CRDTs to be loaded from disk), but if combined with #27715, then new nodes could start with no active entities, and activate entities from the store as they receive gossip status messages whose timestamps show them to be active.
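The "drop the reference, fall back to the store on a miss" behaviour can be sketched as follows. This is a simplification under stated assumptions: the class name, the `store` function shape, and the plain `Map` stand in for the replicator's real internals, and as the next comment notes, making every lookup asynchronous is the expensive part:

```scala
import scala.concurrent.{ExecutionContext, Future}

// Hypothetical sketch: an in-memory view of active CRDTs that falls back to
// an asynchronous durable-store lookup on a miss, re-activating on a hit.
class PassivatingState[K, V](store: K => Future[Option[V]])
                            (implicit ec: ExecutionContext) {
  private var active = Map.empty[K, V]

  // Every access becomes asynchronous, because the entry may be on disk.
  def get(key: K): Future[Option[V]] =
    active.get(key) match {
      case hit @ Some(_) => Future.successful(hit)
      case None =>
        store(key).map { loaded =>
          loaded.foreach(v => active += key -> v) // re-activate on hit
          loaded
        }
    }

  // Passivation just drops the in-memory reference; the durable copy remains.
  def passivate(key: K): Unit = active -= key
}
```

The same fallback path would be taken for gossip status messages and deltas that reference a passivated key, not just for local get/update.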

patriknw commented 5 years ago

The latter would probably be more difficult to implement; it would require massive refactoring, because every access to the local Map would now have to be asynchronous.

Also, if a joining node triggers all entries to be loaded into memory anyway, we don't solve the upper-limit problem. That means we would probably need a different sync protocol for new nodes. That could be good for other reasons (faster bulk transfer), but it's a bigger task.

I find #27683 more appealing, to start with.

patriknw commented 5 years ago

By the way, I once started a prototype that used an off-heap Map for entries, but since it was a spare-time project I gave up when it became too much work.

jroper commented 5 years ago

Yeah, there's a lot to think about, and I agree it would be a huge change. (By the way, I'm not raising these issues as a demand for the feature, just as a place to discuss and record thoughts. These kinds of questions have been coming up in discussions about CloudState, so it's good to have a variety of inputs to help ground decisions about whether work should be invested in addressing them.)

Though I don't think a new node necessarily has to trigger loading all entries into memory; it might load only the entries it sees referenced in gossip messages from other nodes. There would be a lot of edge cases around this, though.

patriknw commented 5 years ago

Good to know the background, thanks.