earthstar-project / replica-server

An always-online Earthstar peer for your shares.
GNU Lesser General Public License v3.0

Sync more efficiently and monitor progress #10

Open basham opened 2 years ago

basham commented 2 years ago

I'm syncing a local Deno replica server (v3.0.0) with a local browser app (Earthstar v9.3.2), trying both ExtentionSyncHttp and ExtentionSyncWebsocket. The replica that is being synced contains about 9,000 documents.

ExtentionSyncHttp will sync over the course of several thousand requests. Most of the time, it eventually stops when the browser cancels one of the requests, for unknown reasons. After a number of attempts, it was finally able to continue long enough that the server and the browser now contain the same number of documents. However, when refreshing, it starts the same sync all over again: thousands of new requests spanning at least 5 minutes.

Also, maybe I'm thinking about it wrong, but I'd expect the messaging activity from syncing with ExtentionSyncWebsocket to cease when the browser and server are caught up, because I think that's what happens when syncing over HTTP. However, 30 minutes in, there is still back and forth. I've inspected the messages, and I'm not noticing any documents, unlike what I've seen when inspecting equivalent HTTP responses. So, maybe there's a different bug happening there.

Is there a better way for the browser and server to communicate their current states and sync, with fewer requests/messages? Because this can take such a long time, there needs to be some way to monitor progress. In a browser app, the user will navigate quickly to new pages. There should be an efficient diffing process, so this long multi-minute syncing doesn't have to restart itself all the time. It should be able to essentially continue where it left off. I have not studied the source code enough to provide more particular suggestions at this time.

basham commented 2 years ago

Maybe the browser is cancelling the HTTP request because of some CORS issue that isn't fully resolved by my fix in #8? It's odd that it can make thousands of successful requests, only to fail at some random point.

sgwilym commented 2 years ago

I'm pretty unhappy with the sync implementation myself, and will be reimplementing it for the next major version.

It doesn't remember, from one syncing session to the next, what two replicas have already synced with each other. It will in the next version.
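To make "remembering what has been synced" concrete, here is a minimal sketch of one common approach: each replica stamps documents with its own increasing index, and the puller stores the highest index it has seen per peer. This is not Earthstar's implementation; `SketchReplica`, `pullFrom`, the `cursors` map, and the progress callback are all hypothetical names, and conflict resolution is left out entirely.

```typescript
// Hypothetical types for illustration only; none of these names come from Earthstar.
interface SketchDoc {
  path: string;
  content: string;
  localIndex: number; // position in the order this replica stored its documents
}

class SketchReplica {
  private docs = new Map<string, SketchDoc>();
  private nextIndex = 1;

  // Store a document and stamp it with this replica's next index.
  put(path: string, content: string): void {
    this.docs.set(path, { path, content, localIndex: this.nextIndex++ });
  }

  // Return every document stored after the given index, oldest first.
  docsAfter(index: number): SketchDoc[] {
    return [...this.docs.values()]
      .filter((d) => d.localIndex > index)
      .sort((a, b) => a.localIndex - b.localIndex);
  }

  get size(): number {
    return this.docs.size;
  }
}

// Pull only what is new since the last session with this peer.
// `cursors` maps a peer id to the highest remote index already pulled;
// persisting it is what would let a sync resume after a page refresh.
function pullFrom(
  local: SketchReplica,
  remote: SketchReplica,
  remoteId: string,
  cursors: Map<string, number>,
  onProgress?: (done: number, total: number) => void,
): number {
  const since = cursors.get(remoteId) ?? 0;
  const pending = remote.docsAfter(since);
  let done = 0;
  for (const doc of pending) {
    local.put(doc.path, doc.content);
    cursors.set(remoteId, doc.localIndex); // advance the cursor per document
    done++;
    onProgress?.(done, pending.length);
  }
  return pending.length;
}
```

Because the cursor advances after every document, an interrupted pull can pick up from the last stored document instead of starting over, and the `done`/`total` pair gives an app something to show as progress.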

The current implementation also works by polling. In the next version, peers will stream new documents to each other without needing to be nagged about it. I expect that's why you're seeing all those empty messages.
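A rough illustration of the difference between the two styles, again with hypothetical names (`StreamSketch`, `poll`, `subscribe`) rather than anything from Earthstar: a polling peer keeps asking and mostly gets empty answers, while a subscribed peer hears nothing until a document is actually written.

```typescript
// Hypothetical sketch; not Earthstar's API.
interface StreamedDoc {
  path: string;
  content: string;
}

type DocListener = (doc: StreamedDoc) => void;

class StreamSketch {
  private docs: StreamedDoc[] = [];
  private listeners = new Set<DocListener>();

  // Store a document and push it to every subscriber immediately.
  write(doc: StreamedDoc): void {
    this.docs.push(doc);
    for (const listener of this.listeners) listener(doc);
  }

  // Polling style: the caller asks "anything after position N?" every time.
  poll(since: number): StreamedDoc[] {
    return this.docs.slice(since);
  }

  // Streaming style: the caller registers once and is told about new documents.
  // Returns a function that cancels the subscription.
  subscribe(listener: DocListener): () => void {
    this.listeners.add(listener);
    return () => {
      this.listeners.delete(listener);
    };
  }
}

// Count how many poll responses come back empty while nothing is changing.
function countEmptyPolls(peer: StreamSketch, since: number, attempts: number): number {
  let empty = 0;
  for (let i = 0; i < attempts; i++) {
    if (peer.poll(since).length === 0) empty++;
  }
  return empty;
}
```

With two peers that are already caught up, every poll is an empty message, which matches the idle back-and-forth described above; with `subscribe`, the same idle period produces no traffic at all.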

Thank you for pushing Earthstar with these 9000 documents. :)