janmonschke / diffsync

Enables real-time collaborative editing of arbitrary JSON objects
MIT License
222 stars, 23 forks

Running the diffsync server in a browser #1

Open seidtgeist opened 9 years ago

seidtgeist commented 9 years ago

Bullet points because I can't brain right now:

cc @diasdavid

seidtgeist commented 9 years ago

I really don't think it's a problem to have a server peer, and there might be ways to get around that (can operations converge, @janmonschke?). The beauty of this is that any client could be an impromptu server peer, and I can't find a reason why that would be wrong. In that way diffsync sessions are ephemeral. Let's imagine we pair diffsync collaboration with a content-addressed store:

  1. Request JSON value for hash
  2. Open diffsync server peer with value of hash
  3. Have diffsync clients connect to server peer and collaborate on hash (see gist for how simple this is)
  4. Periodically, or on demand, save new values to the content-addressed store
  5. When server dies, negotiate new server peer
  6. Session ends when nobody wants to edit anymore
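
The six steps above can be sketched roughly as follows. This is only an illustration under stated assumptions, not diffsync code: `CasStore`, `Session`, `openSession`, `join`, `snapshot` and `leave` are hypothetical names, the hash is a toy stand-in for a real content hash, and the "smallest remaining id becomes the new server peer, restarting from the last snapshot" rule is just one possible way to negotiate.

```typescript
// Hypothetical sketch of the six steps; none of these names are diffsync APIs.

// Toy hash (djb2) standing in for a real content hash such as SHA-256.
function toyHash(text: string): string {
  let h = 5381;
  for (let i = 0; i < text.length; i++) {
    h = (h * 33 + text.charCodeAt(i)) >>> 0; // stay in unsigned 32-bit range
  }
  return h.toString(16);
}

// In-memory content-addressed store: the key is the hash of the JSON text.
class CasStore {
  private values: Record<string, string> = {};

  put(value: unknown): string {
    const text = JSON.stringify(value);
    const hash = toyHash(text);
    this.values[hash] = text;
    return hash;
  }

  get(hash: string): unknown {
    const text = this.values[hash];
    if (text === undefined) throw new Error("unknown hash " + hash);
    return JSON.parse(text);
  }
}

interface Session {
  serverPeer: string; // peer currently acting as the diffsync server
  peers: string[];    // all connected peers, server peer included
  doc: unknown;       // the server peer's copy of the document
  savedHash: string;  // hash of the most recent snapshot in the store
}

// Steps 1 and 2: fetch the value for a hash and open a server peer with it.
function openSession(cas: CasStore, hash: string, serverPeer: string): Session {
  return { serverPeer: serverPeer, peers: [serverPeer], doc: cas.get(hash), savedHash: hash };
}

// Step 3: a client connects to the server peer.
function join(session: Session, peer: string): void {
  if (session.peers.indexOf(peer) === -1) session.peers.push(peer);
}

// Step 4: save the current value; the returned hash names the new version.
function snapshot(cas: CasStore, session: Session): string {
  session.savedHash = cas.put(session.doc);
  return session.savedHash;
}

// Steps 5 and 6: a peer disconnects. Returns false once the session has ended.
function leave(cas: CasStore, session: Session, peer: string): boolean {
  session.peers = session.peers.filter((p) => p !== peer);
  if (session.peers.length === 0) return false; // step 6: nobody left to edit
  if (peer === session.serverPeer) {
    // Step 5, assumed rule: the smallest remaining id takes over and restarts
    // from the last snapshot, so edits made after that snapshot are lost here.
    session.serverPeer = session.peers.slice().sort()[0];
    session.doc = cas.get(session.savedHash);
  }
  return true;
}
```

In this simplified model a server death loses whatever was edited since the last snapshot. A real implementation would more likely recover from a surviving client's copy, which this sketch does not attempt.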

janmonschke commented 9 years ago

In theory, all of that sounds doable. Neil Fraser also thought about a P2P system when he introduced the algorithm in a Google Tech Talk -> https://youtu.be/S2Hp_1jqpY8?t=2591. His version of the implementation, however, was based on pulling data periodically rather than having a dedicated signal for updated versions like diffsync has, so he concluded that the sync delay between those nodes could increase significantly. I don't think this latency problem applies here.

My knowledge of P2P systems and WebRTC is pretty limited, so I'm not sure I would be able to implement it on my own, especially when it comes to systems that negotiate a single source of truth in P2P environments. I don't know how the death of a node can be handled gracefully so that no data is lost. Also, consider that the node chosen as the server node could actually be running on a mobile phone. Handling all diffsync sessions could drain the battery a lot, and the signal might get lost at any given time.

Don't get me wrong, I'm not against the idea, I'm just unsure how P2P systems handle the described cases ;)

Another point to consider is whether P2P should be part of the core package. Personally, I would say it should be its own module, implemented on top of an enhanced transport interface -> https://github.com/janmonschke/diffsync#socketio-independence.
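
To make the "own module on top of a transport interface" idea a bit more concrete, here is a minimal sketch of what such an interface could look like. Everything in it is an assumption: the `Transport` shape (`emit`/`on`) is only loosely modelled on socket.io-style messaging, and `LoopbackEnd` and `createLoopbackPair` are hypothetical helpers that connect two ends in memory. A P2P module could put a WebRTC data channel behind the same two methods.

```typescript
// Hypothetical transport shape; diffsync's actual requirements may differ.
type Handler = (payload: unknown) => void;

interface Transport {
  emit(event: string, payload: unknown): void; // send to the other end
  on(event: string, handler: Handler): void;   // receive from the other end
}

// One end of an in-memory link, used here instead of a real network.
class LoopbackEnd implements Transport {
  private handlers: Record<string, Handler[]> = {};
  other: LoopbackEnd | null = null;

  on(event: string, handler: Handler): void {
    (this.handlers[event] = this.handlers[event] || []).push(handler);
  }

  emit(event: string, payload: unknown): void {
    if (this.other) this.other.deliver(event, payload);
  }

  deliver(event: string, payload: unknown): void {
    // Copy via JSON so the two ends never share object references,
    // which is also what happens on a real wire. Payloads must be JSON values.
    const copy: unknown = JSON.parse(JSON.stringify(payload));
    for (const handler of this.handlers[event] || []) handler(copy);
  }
}

// Create two connected ends: whatever one emits, the other receives.
function createLoopbackPair(): [Transport, Transport] {
  const a = new LoopbackEnd();
  const b = new LoopbackEnd();
  a.other = b;
  b.other = a;
  return [a, b];
}
```

Code written against `Transport` would not need to know whether the other end is a socket.io server or another browser.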

geyang commented 9 years ago

The difficult thing is not the transport layer. It is managing a distributed system on a network topology. This is why Neil said that it is a different problem from the synchronization algo itself. He did not intend to say that diffSync is a peer-to-peer synchronization algorithm, because that requires more than diffsync itself.

The problem is to manage the interconnect between clients, and figure out a way to propagate this global topology information to each client, and be able to tolerate changes.
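
One simple way to think about propagating topology information is a versioned membership view that peers exchange and merge. The sketch below is only an illustration of that general idea, not something proposed in this thread or provided by diffsync: `PeerInfo`, `View`, `pickEntry` and `mergeViews` are hypothetical names, and the tie-breaking rule is an assumption.

```typescript
// Hypothetical: each peer keeps a view of who is connected and merges
// views received from other peers.
interface PeerInfo {
  alive: boolean;
  version: number; // bumped by whoever observes a change for this peer
}

type View = Record<string, PeerInfo>;

// Choose between two reports about the same peer.
function pickEntry(x: PeerInfo, y: PeerInfo): PeerInfo {
  if (x.version !== y.version) return x.version > y.version ? x : y;
  return x.alive ? y : x; // tie (assumed rule): prefer the "not alive" report
}

// Merge two views: keep every known peer, and for peers present in both
// views keep the entry chosen by pickEntry.
function mergeViews(mine: View, theirs: View): View {
  const merged: View = {};
  for (const id of Object.keys(mine)) merged[id] = mine[id];
  for (const id of Object.keys(theirs)) {
    const current = merged[id];
    merged[id] = current === undefined ? theirs[id] : pickEntry(current, theirs[id]);
  }
  return merged;
}
```

Because the merge gives the same result regardless of which side merges first, two peers that exchange views end up with the same picture of the network, at least in this toy model.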

A simple algo is to do client hopping. I am planning on doing something fun in this direction when I have more time. It may be never.

Are you guys familiar with IRC? The graph theory part of IRC is relevant.

Cheers.