gritzko / swarm-ron-docs

https://gritzko.gitbooks.io/swarm-the-protocol/content/

overview? #1

Open dominictarr opened 8 years ago

dominictarr commented 8 years ago

This looks promising, but it's difficult to assemble an understanding of how it all works because the documentation has too much detail. For example, I don't need to know how swarm-protocol represents numbers yet, just that it does. Pseudocode or some other terse notation that captures the structure of the protocol would be most useful.

Another good aid in understanding is to describe the class of applications which could be built on top of Swarm. You mention collaborative editing; could Swarm scale to something the size of Wikipedia?

Also, you mention partial checkouts; these would certainly help for a Wikipedia-scale app, because most peers could not replicate the entire dataset. How do partial checkouts interact with fault tolerance?

This isn't enough information about signatures: https://github.com/gritzko/swarm-protocol-docs/blob/master/crypto.md#signatures. Pseudocode would be very helpful there.

dominictarr commented 8 years ago

I'm gonna take notes here on my reactions to the sections I read...

https://github.com/gritzko/swarm-protocol-docs/blob/master/matrix.md#entanglement-matrix

Hmm, does that mean each peer has to track O(P^2) state (P = number of peers)? Does that also apply to partial checkouts?

A client replica may need one additional step to use an entanglement matrix. Namely, it has to link the op of interest to the nearest home peer's noop. Practically, that is a hash chain validation. Once a client can see that the op is covered by a home peer's noop, it may track the progress of the entanglement matrix till it shows majority acceptance of the op in question. Note that the client does not need the full log or the full entanglement matrix. The quorum proof can be made with a segment of the op log, in both cases.
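If I'm reading that right, the hash-chain step is roughly this; a sketch of my understanding only, all the names are mine, not from the spec:

```ts
// Sketch only: an op carries a hash of its predecessor in the home peer's log,
// so linking an op to a later noop from that peer is a walk along the hash chain.
interface Op {
  hash: string;      // hash of this op's serialized form
  prevHash: string;  // hash of the preceding op in the home peer's log
  isNoop: boolean;   // noops mark "the home peer has seen everything up to here"
}

// True if `target` is covered by a home-peer noop somewhere in `logSegment`
// (the part of the log that follows the op), i.e. the hash chain from the op
// to a noop is unbroken.
function coveredByNoop(target: Op, logSegment: Op[]): boolean {
  let expectedPrev = target.hash;
  for (const op of logSegment) {
    if (op.prevHash !== expectedPrev) return false; // chain broken or segment out of order
    if (op.isNoop) return true;                     // reached a noop, the op is covered
    expectedPrev = op.hash;
  }
  return false;                                     // no covering noop in this segment yet
}
```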

Aha, so how does the quorum proof work?

dominictarr commented 8 years ago

https://github.com/gritzko/swarm-protocol-docs/blob/master/replica.md

The Swarm core itself assumes the role of a peer as a node that keeps the full operation log of the database. Peers can serve clients that have either a full log or (most often) an arbitrary subset of objects

so "peers" do not do partial checkouts, but clients can get a partical checkout from a peer, and still verify some properties of it.

dominictarr commented 8 years ago

In https://github.com/gritzko/swarm-protocol-docs/blob/master/peer_handshake.md#peer-handshake:

In a repeated handshake, peers mention the timestamp of the last op received from each other in the past

So, when peers handshake, they just send their current local timestamp and the timestamp of the last op they received from the other peer. That keeps the handshake small, but it means that if A syncs with B, then B with C, and then A with C, C will resend data from B that A already has, because A's handshake with C does not say what A already received via B?
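Something like this, I think; a toy sketch of my reading, all names are mine:

```ts
// Toy model: each peer remembers, per neighbour, only the timestamp of the
// last op it received from that neighbour, and pulls everything newer on sync.
type PeerId = string;

interface Op {
  ts: number;       // logical timestamp assigned by the originating peer
  origin: PeerId;   // peer that created the op
  payload: string;
}

class Peer {
  log: Op[] = [];
  private lastSeenFrom = new Map<PeerId, number>(); // neighbour -> last ts received from it

  constructor(public id: PeerId) {}

  // "Repeated handshake": mention the last ts we got from this neighbour,
  // then receive everything it has after that point.
  syncWith(other: Peer): void {
    const since = this.lastSeenFrom.get(other.id) ?? -1;
    for (const op of other.log.filter(o => o.ts > since)) {
      // ops are identified by (origin, ts), so a duplicate is simply dropped
      if (!this.log.some(o => o.ts === op.ts && o.origin === op.origin)) {
        this.log.push(op);
      }
      this.lastSeenFrom.set(other.id, op.ts);
    }
  }
}

// c.syncWith(b); a.syncWith(b); a.syncWith(c);
// C still sends A the ops it pulled from B, because A's handshake with C only
// mentions what A last received from C directly.
```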

gritzko commented 8 years ago

@dominictarr First of all, thank you for this format. I'll try to be equally systematic by expanding docs on each case instead of one-off replies.

Hmm, does that mean each peer has to track O(P^2) state (P = number of peers)? Does that also apply to partial checkouts?

added:

Even in a super-peer network, the number of peers may be large. An entanglement matrix has a size of O(N^2), i.e. quadratic. The fact that we need such a big data structure may seem depressing at first. Practically, a regular peer or client hardly needs the full matrix. A replica may need to know which of its own ops are reliably disseminated. To answer that question, it needs one column from the matrix (and a little bit more to implement recursion, but O(N) anyway).
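Roughly, that check looks like this; a sketch only, the actual field names and encoding are different:

```ts
// matrix[i][j] = the latest timestamp of peer j's ops that peer i has
// acknowledged (e.g. covered by peer i's noops).
type Timestamp = number;
type Matrix = Timestamp[][];

// Has the op `opTs`, authored by peer `author`, been acknowledged by a
// majority of peers? Only one O(N) column of the matrix is consulted.
function majorityAccepted(matrix: Matrix, author: number, opTs: Timestamp): boolean {
  const column = matrix.map(row => row[author]);        // the author's column
  const acks = column.filter(ts => ts >= opTs).length;  // peers that have covered the op
  return acks > matrix.length / 2;
}
```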

gritzko commented 8 years ago

This looks promising but it's difficult to assemble an understanding of how it all works,

Expanded the intro at https://gritzko.gitbooks.io/swarm-the-protocol/content/

gritzko commented 8 years ago

Another good aid in understanding is to describe the class of applications which could be built on top of Swarm. You mention collaborative editing; could Swarm scale to something the size of Wikipedia?

We discussed that with Erik Moeller back when he was CTO of WMF. By now, I think, all the issues are addressed and something the size of Wikipedia can be done. Not Wikipedia per se, because (1) a real-time/synced Wikipedia does not make much sense, and (2) their legacy stuff is way too interdependent.

The research behind Swarm originated in some distributed-Wikipedia proposals back in 2010. https://wikimania2010.wikimedia.org/wiki/Submissions/Federating_Wikipedia http://www.slideshare.net/gritzko/wikisym-deep-hypertext-slides

In big-O terms, Swarm is Cassandra-level scalable. No linearization, no coordination, unlimited sharding.

gritzko commented 8 years ago

if A syncs with B, then B with C, and then A with C, C will resend data from B that A already has, because A's handshake with C does not say what A already received via B

@dominictarr Exactly. It is op-based, so I believe it is simpler to send it twice than to optimize that. In theory, peers can be arranged in a spanning tree to ensure every op is received once by every peer. On the client side, the optimization is totally worth the expense, but on the server side, I believe, redundancy can be cheaper than optimization (not my idea, actually). Any optimization that requires coordination (i.e. network round trips) is definitely more expensive. I added a discussion on relay orders/guarantees: https://gritzko.gitbooks.io/swarm-the-protocol/content/order.html

I don't need to know how swarm-protocol represents numbers yet

I separate the model/behavior part from the protocol primitives part: https://gritzko.gitbooks.io/swarm-the-protocol/content/SUMMARY.html

gritzko commented 8 years ago

Also, you mention partial checkouts; these would certainly help for a Wikipedia-scale app, because most peers could not replicate the entire dataset. How do partial checkouts interact with fault tolerance?

Snapshotted Wikipedia (no full letter-precise history) is tens of gigabytes, not Big Data by any standard (https://dumps.wikimedia.org/enwiki/20160701/). Swarm is a super-peer network, so Swarm nodes are either

  1. always-on well-provisioned full-dataset peers (read: federated servers) or
  2. likely-mobile often-online limited-storage partial-dataset clients.

It is possible to shard a peer, of course; Wikipedia with the full history will need that. Fault tolerance mostly exists at the peer level. As a client

  1. has only a partial, compressed dataset (no history), and
  2. only talks to its peer,

there is not much room for maneuver at the client level. There is a way to rebase a client from one peer to another, so offline changes will not be lost if the peer dies. And that's it.