swindon-rs / swindon

An HTTP edge (frontend) server with smart websockets support
Apache License 2.0
100 stars 9 forks source link

feature: a set crdt from which members can be removed #58

Open jjl opened 6 years ago

jjl commented 6 years ago

an ORSWOT, maybe?

tailhook commented 6 years ago

Well, I'm not sure about ORSWOT, because this requires enumerating all the actors. I'm not sure how we should treat backend processes.

But on the other hand, I'm pretty fine with expanding the set of CRDT's supported. Can you briefly describe your task, so we can see what fits it? Maybe classic OR-Set or something completely different.

jjl commented 6 years ago

I don't have a single use-case in mind, I'm just not particularly thrilled with a data structure that grows without bound. On this front, PN counter isn't great either without some garbage collection mechanism.

tailhook commented 6 years ago

Yes, this bothers me too. In practice, however, we delete items as needed from the database, and keep them in swindon as long as user session is active (remove after some timeout after last user connection).

This means it works if user sessions aren't indefinite, which is true for most browser-based sessions (swindon's main use case), and the potential set of items isn't indefinite.

On the other hand, If we have OR-Set for the same use-case we would need to keep exact same number of items in memory for tombstones (along with id's, so memory usage would actually be larger).

I'm not familiar with ORSWOT, all I know about it from 5-min googling. I have to investigate it more, if you see how it can be used in swindon's case your help is appreciated.


And yet another note: for many use cases (such as when actual data is backed by a DB) you can just version your data and push version via swindon. This solves about 80% of use cases and doesn't have any memory leak problem.

jjl commented 6 years ago

I'm going to go back and read some of the literature. I'm no expert in CRDTs myself, I just think this is a cool project that's got my attention :)

jjl commented 6 years ago

Okay, so I've been back and looked at ORSWOTs in more detail. There are no tombstone nodes and they prune data so it's relative to data size not operations performed.

The idea I was floating about was to have swindon maintain an ORSWOT of connected nodes for a given service, so that membership is resolved without the need of some external service (although I know what's not quite in line with your other projects).

jjl commented 6 years ago

Oh, and someone's implemented it in rust

tailhook commented 6 years ago

There are no tombstone nodes and they prune data so it's relative to data size not operations performed.

Here is how I understand it:

  1. ORSWOT relies on vectorclocks
  2. Then vectorclocks rely on a mapping from each agent to a counter
  3. Now it works fine on riak where there is a fixed number of agents

In swindon, we have three kinds of agents:

  1. swindon(s) themselves, they are easily enumerable
  2. clients (web, mobile) which are ephemeral, there might be ten of clients for a single user at one moment and then none of them at another (note: we may use one value to sync thousands of users)
  3. backends which usually are stateless workers so are kinda ephemeral too

So we risk unbounded grow of agent number which might easily be larger than the actual set.

The idea I was floating about was to have swindon maintain an ORSWOT of connected nodes for a given service, so that membership is resolved without the need of some external service

Well, another thing is that swindon doesn't have any persistency. It even actively prunes data from memory which is unused by currently connected users. So I'm not sure we're on the same page here.

although I know what's not quite in line with your other projects

I'm not sure, what do you mean?