tradle / why-hypercore

Exploration of Hypercore's breakthrough designs and capabilities, uncovering its gems that may be scattered across different GitHub accounts (official and community-led), and learning to think from the "first principles" of P2P, while using the best that Cloud, AI and blockchain have to offer.

S3 for backup / restore and database streaming #1

Open · urbien opened this issue 3 years ago

urbien commented 3 years ago

Purpose

We have the following reasons for using Hypercore with AWS S3 (and its many open source re-implementations, like https://min.io):

  1. Eleven nines of durability for Hypercore feeds. We need to match the reliability of the cloud, or it will be hard to sell this to end users.
  2. Streaming the database from S3. This is expanded on in the rationale below.

Rationale

For 2. above: a website without a web server, served straight from S3, is mainstream now. Why not a database? To me this is revolutionary. I can think of a couple of use cases (CDN, huge research databases), but there must be a ton.

Our push is to make Hypercore-based apps for everyday, non-technical users - we are all so spoiled by Google email, docs, etc. taking care of everything for us that a P2P solution needs to match that, and exceed it with what Google will never do. Cloud-level reliability of P2P seems to be on the Hypercore team's radar.

To me, reliability means being 100% online, 100% durable, and 100% connectable on the network.

Starting point in Hypercore and what is lacking:

hypercore-archiver together with random-access-s3 provides a basic way to pipe a feed to S3. But random-access-s3's write method is not implemented.
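
To make the gap concrete, here is a minimal sketch of a read-only random-access-storage backed by S3 ranged GETs, plugged into a Hypercore. The bucket name, key layout, placeholder feed key and the aws-sdk usage are assumptions for illustration, not the actual random-access-s3 implementation.

```js
// Sketch: read-only random-access-storage over S3 ranged GETs (illustrative only).
const RandomAccessStorage = require('random-access-storage')
const hypercore = require('hypercore')
const AWS = require('aws-sdk')

const s3 = new AWS.S3()

function s3ReadOnlyFile (bucket, key) {
  return new RandomAccessStorage({
    read (req) {
      // req.offset / req.size come from Hypercore; map them to a ranged GET
      s3.getObject({
        Bucket: bucket,
        Key: key,
        Range: `bytes=${req.offset}-${req.offset + req.size - 1}`
      }, (err, res) => {
        if (err) return req.callback(err)
        req.callback(null, res.Body)
      })
    }
    // write is intentionally missing -- this is exactly the gap described above
  })
}

// placeholder: the public key of the feed that was backed up
const feedKey = Buffer.alloc(32)

// Hypercore accepts a storage factory: one internal file name per call
// (data, tree, bitfield, ...), each mapped to its own S3 object here.
const feed = hypercore(
  name => s3ReadOnlyFile('my-backup-bucket', `feeds/${feedKey.toString('hex')}/${name}`),
  feedKey
)
```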

Context

While S3 supports random reads (i.e. reading from an offset within an S3 object), there is no support for random writes:

  1. S3 does not support updating a file at an offset (try googling "S3 PUT range request")
  2. Updating an S3 object is inefficient: you have to PUT the whole file again
  3. S3 multipart upload lets you upload multiple parts, which S3 assembles into a single object, but there is no way to update individual parts afterwards
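
As a quick illustration of this asymmetry (not part of the proposal; bucket and key names are placeholders), a ranged GET works fine, while the only way to "update" an object is to re-PUT it whole:

```js
// Sketch of S3's read/write asymmetry using the AWS SDK for Node (v2 style).
const AWS = require('aws-sdk')
const s3 = new AWS.S3()

// 1. Random reads work: a ranged GET fetches only bytes 1024..2047.
s3.getObject({
  Bucket: 'my-bucket',
  Key: 'feeds/data',
  Range: 'bytes=1024-2047'
}, (err, res) => {
  if (!err) console.log('got slice of', res.Body.length, 'bytes')
})

// 2. There is no ranged PUT: changing any byte means re-uploading the whole object.
const entireUpdatedBuffer = Buffer.alloc(1024) // placeholder for the full, updated file contents
s3.putObject({
  Bucket: 'my-bucket',
  Key: 'feeds/data',
  Body: entireUpdatedBuffer // the whole file again, however large it is
}, err => {
  if (err) console.error('upload failed', err)
})
```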

Restore

Perceived immediate availability

When restoring from S3, the feed needs to be made available before all data is downloaded. This should work like AWS EBS volume recovery from a snapshot: although the restore is still in progress, the EBS volume is already available.
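
One way to sketch this "EBS-style" lazy restore (purely illustrative, with hypothetical names) is a read-through storage wrapper: reads are served from the local copy when the bytes are already restored, fall back to ranged S3 GETs otherwise, and fetched blocks are written back locally so the file hydrates over time.

```js
// Sketch of lazy restore: local-first reads with S3 fallback (illustrative only).
const RandomAccessStorage = require('random-access-storage')
const raf = require('random-access-file')
const AWS = require('aws-sdk')

const s3 = new AWS.S3()

function lazyRestoreFile (bucket, key, localPath) {
  const local = raf(localPath)
  return new RandomAccessStorage({
    read (req) {
      local.read(req.offset, req.size, (err, buf) => {
        if (!err) return req.callback(null, buf) // already restored locally
        // not restored yet: fetch the range from S3
        s3.getObject({
          Bucket: bucket,
          Key: key,
          Range: `bytes=${req.offset}-${req.offset + req.size - 1}`
        }, (err2, res) => {
          if (err2) return req.callback(err2)
          // persist the fetched block so the next read is local
          local.write(req.offset, res.Body, () => req.callback(null, res.Body))
        })
      })
    },
    write (req) {
      local.write(req.offset, req.data, err => req.callback(err))
    }
  })
}
```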

Key management

How to assume ownership of the restored Hypercores on a new machine with a different private key? See #5

Proposed approach

Maybe we can learn from the search engine Lucene. To avoid rewriting the whole index on every document add / update (which is extremely costly), Lucene writes new data into a chunk it calls an index segment. On search it reads from all segments and merges the results. Here is how we could mimic this in our Hypercore backup to S3, which has similar performance constraints for our case:

Each type of Hypercore structure (Drive, Bee, Trie) will have its own shards. Once a feed on disk reaches about 5 MB, start a new shard and copy the whole feed directory to S3. We would need to write to both the main Hyperbee and to the last Hyperbee shard, which is not so cool. Maybe write to the last shard in memory? But then we need a marker in the main Hyperbee recording the log seq at which the last shard started. Ideally we need a checkout after seq N. A sketch of the rollover and virtual merge follows the list below.

  1. Create a separate feed for the new segment / chunk.
  2. Dump the whole feed to S3.
  3. Later, merge all feeds into one feed and dump that to S3.
  4. Until that merge, serve all peer requests for our feed from the virtually merged feeds.
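
A rough sketch of the segmentation idea above, assuming one Hyperbee per segment and a hypothetical uploadDirToS3 helper (not an existing module). Writes go to the newest segment; once it reaches ~5 MB the segment is uploaded to S3 as-is and a fresh one is started; reads consult segments newest-first (the "virtual merge").

```js
// Sketch of Lucene-style segmentation for Hyperbee backup to S3 (illustrative only).
const hypercore = require('hypercore')
const Hyperbee = require('hyperbee')

const SEGMENT_LIMIT = 5 * 1024 * 1024 // ~5 MB per segment, per the proposal

async function uploadDirToS3 (dir) {
  // hypothetical helper: PUT every file under `dir` to S3; omitted for brevity
}

class SegmentedBee {
  constructor (dir) {
    this.dir = dir
    this.segments = [] // oldest .. newest
    this._rollover()
  }

  _rollover () {
    const n = this.segments.length
    const feed = hypercore(`${this.dir}/segment-${n}`)
    this.segments.push(new Hyperbee(feed, { keyEncoding: 'utf-8', valueEncoding: 'json' }))
  }

  async put (key, value) {
    const current = this.segments[this.segments.length - 1]
    await current.put(key, value)
    // seal the segment once it grows past the limit, back it up, start a new one
    if (current.feed.byteLength >= SEGMENT_LIMIT) {
      await uploadDirToS3(`${this.dir}/segment-${this.segments.length - 1}`)
      this._rollover()
    }
  }

  async get (key) {
    // virtual merge: the newest segment that has the key wins
    for (let i = this.segments.length - 1; i >= 0; i--) {
      const node = await this.segments[i].get(key)
      if (node) return node
    }
    return null
  }
}
```

Note this sketch sidesteps the "write to both the main Hyperbee and the last shard" concern by treating the newest segment as the only writable one; deletes and collaborative updates are exactly the open questions raised in the next comment.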
pgmemk commented 3 years ago

How is segmentation going to work for collaborative editing? What I mean is:

How to decide when to merge segments?

urbien commented 3 years ago

@pgmemk good catch, I did not think of that. Now, here is an idea for how to address updates to data items that were already archived / backed up to S3. Maybe we could use union mounts, which are upcoming in Hypercore. Hyperdrive will use mounts for shared folders. A union mount creates the impression that you can update a friend's Hyperdrive, which today is mounted read-only. Union mounts were invented by Plan 9 and then, over a decade, were implemented and re-implemented many times in Linux. They have been stable in Linux for a long time and are used for booting from an immutable source, like a DVD, while still being able to customize your installation and boot again with all your changes to the immutable files saved separately from the original DVD.

urbien commented 3 years ago

The initial idea was to use mounts for the virtual merge. But Hyperbee does not have mounts, and Hypertrie mounts change the names of the keys: key 'a' becomes 'mountpoint/a'. New ideas are needed.

urbien commented 3 years ago

Perhaps this module could help? https://github.com/little-core-labs/hypercore-multipart