dat-ecosystem-archive / hypercloud

A hosting server for Dat. [ DEPRECATED - see github.com/beakerbrowser/hashbase for similar functionality. More info on active projects and modules at https://dat-ecosystem.org/ ]

Debugging transfer issues between Beaker and Hypercloud #56

Open · pfrazee opened this issue 7 years ago

pfrazee commented 7 years ago

In my debugging between Beaker and a localhost hypercloud, I'm finding that the first issue is that hypercloud only requests dats when the connection is first established.

What's the setup? I start Beaker with two dats being hosted from it. I open a fresh install of hypercloud, then POST dat1 to hypercloud. That replicates fine. Then I POST dat2. That one does not replicate for an inconsistent amount of time (2-5 minutes).

Why this breaks replication: Dat1 establishes a replication connection between Beaker and Hypercloud. Hypercloud only requests dats at the beginning of the connection. Therefore it's not until the first connection is dropped, and a new connection is made, that replication of dat2 occurs. See https://github.com/mafintosh/hypercore-archiver/blob/master/index.js#L92-L99
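For reference, here's a simplified sketch of that pattern (illustrative only, not the actual hypercore-archiver source; `replicateFeed` is a hypothetical helper):

```js
// Sketch of the replicate-on-connect pattern: the set of archives to
// request is captured once, when the replication stream is created.
archiver.replicate = function () {
  var stream = protocol({ live: true })

  // Only the archives known *right now* get requested on this stream.
  archiver.list(function (err, keys) {
    if (err) return stream.destroy(err)
    keys.forEach(function (key) {
      replicateFeed(key, stream) // hypothetical helper
    })
  })

  // An archive added later (e.g. dat2) is never requested on this stream;
  // its replication waits for the next fresh connection.
  return stream
}
```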

Why isn't a second connection created for the newly-POSTed dat? I could be wrong, but discovery-swarm appears to avoid opening multiple connections between two peers, using the _peersSeen hash. See https://github.com/mafintosh/discovery-swarm/blob/master/index.js#L200
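Roughly, that dedup looks like this (a paraphrase, not the actual discovery-swarm source):

```js
// Paraphrase of the dedup check: a peer already recorded in _peersSeen is
// dropped outright, even if it was rediscovered for a different archive.
Swarm.prototype._onPeer = function (peer, discoveryKey) {
  var id = peer.host + ':' + peer.port
  if (this._peersSeen[id]) return // no second connection is opened
  this._peersSeen[id] = true
  this._connectToPeer(peer)
}
```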

Should the first connection take 2-5 minutes to end? I'm unsure. @mafintosh, should it? I can understand why the connection would stay alive for more data to come through. I could also understand if the connection was closed after all work was finished.

What are our options?

  1. Make connections close as soon as all work finishes. This isn't a very good solution. If you happen to be transferring a big archive, the connection won't close quickly, so a newly-POSTed archive would still have to wait.
  2. Have hypercloud ask for newly-added archives during replication. This is what I suggest in this issue. However, that only works for an active-replication strategy. As discussed in this issue, active-replication is a bad strategy for interacting with clients. It means that, if a hypercloud has N archives, it will make N requests per connection. That becomes a problem really fast.
  3. Allow multiple connections to occur between peers, simultaneously. AKA, don't multiplex, and stop tracking peersSeen in discovery-swarm. I'll explain the advantage of this below.
  4. A variant of 3. If a peer is discovered by discovery-swarm that's in peersSeen, and it was for a different archive's swarm, emit an event so we can trigger replication. I think this is the winner (sketched below).

Why is option 4 the winner, in my opinion? Connections created by discovery are sort of like HTTP requests that have the GET /path at the head. They enable us to know exactly why the peers connected in the first place. Blindly multiplexing additional requests is pretty much a shot in the dark. It's akin to saying, "hey while we're at it, do you happen to have archive 2?"

Option 3 has the same advantage, but it involves creating more connections. Option 4 avoids that.
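As a rough sketch of option 4 (the event name and internals are hypothetical, not an existing discovery-swarm API):

```js
// Hypothetical sketch of option 4: remember which discoveryKey each seen
// peer connected for, and emit an event when the same peer is rediscovered
// for a different one.
Swarm.prototype._onPeer = function (peer, discoveryKey) {
  var id = peer.host + ':' + peer.port
  var hex = discoveryKey.toString('hex')
  var seen = this._peersSeen[id]

  if (seen) {
    if (seen !== hex) {
      // Same peer, different archive: the consumer can now trigger
      // replication of the new archive over the existing connection.
      this.emit('peer-rediscovered', peer, discoveryKey)
    }
    return
  }

  this._peersSeen[id] = hex
  this._connectToPeer(peer)
}
```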

Thoughts? @mafintosh @joehand @maxogden

max-mapper commented 7 years ago

I delegate my answer to @mafintosh

pfrazee commented 7 years ago

From discussion on IRC:

Mafintosh considers the current behavior a bug, and says option 3 is what ought to be happening. We could make option 4 an additional behavior, but we need to figure out the complexity first.
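For comparison, a minimal sketch of option 3 (again hypothetical): key the seen-set on the peer/discoveryKey pair rather than the peer alone, so the same two peers can hold one connection per archive:

```js
// Hypothetical sketch of option 3: dedup per (peer, discoveryKey) pair
// instead of per peer, so rediscovery for a new archive opens a new
// connection rather than being dropped.
Swarm.prototype._onPeer = function (peer, discoveryKey) {
  var id = discoveryKey.toString('hex') + '@' + peer.host + ':' + peer.port
  if (this._peersSeen[id]) return
  this._peersSeen[id] = true
  this._connectToPeer(peer, discoveryKey)
}
```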

In option 4, that event might be emitted a lot, because re-discovery of the same peer happens quite a bit. If we can limit it to emit once per discoveryKey/peer combo, then we'd be set.
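Whether that cap lives in discovery-swarm or in the consumer, the effect is the same; here's what it might look like on the consumer side, building on the hypothetical 'peer-rediscovered' event sketched above:

```js
// Consumer-side dedup: act at most once per discoveryKey/peer combo.
var handled = {}
swarm.on('peer-rediscovered', function (peer, discoveryKey) {
  var combo = discoveryKey.toString('hex') + '@' + peer.host + ':' + peer.port
  if (handled[combo]) return
  handled[combo] = true
  startReplication(peer, discoveryKey) // hypothetical replication hook
})
```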