mafintosh / hyperdb

Distributed scalable database
MIT License

Consider hypercore dependency injection #137

Open andrewosh opened 6 years ago

andrewosh commented 6 years ago

Just tossing this one out there:

Replacing the top-level storage parameter with an asynchronous hypercore factory might be a nice abstraction for a few reasons:

First, storage management is already performed by hypercore, and is effectively duplicated in hyperdb -- it'd be nice to be able to write a single module that wraps storage (and networking) for a set of hypercores, then passes those up to hyperdb.
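A minimal sketch of what such a factory could look like — `createFactory` and its callback signature are made-up names for illustration, not an existing hyperdb or hypercore API. The idea is that one module owns feed creation (and so can own storage and networking for all feeds), and hyperdb just asks it for feeds by key:

```javascript
// Hypothetical factory sketch: hyperdb would call `factory(key, opts, cb)`
// whenever it needs a feed, instead of constructing hypercores itself.
// `createFeed` is whatever actually makes a feed, e.g. wrapping hypercore
// with the caller's own storage layout.
function createFactory (createFeed) {
  const feeds = new Map() // one shared feed per key

  return function factory (key, opts, cb) {
    const id = key ? key.toString('hex') : 'local'
    if (feeds.has(id)) return cb(null, feeds.get(id))
    const feed = createFeed(key, opts)
    feeds.set(id, feed)
    cb(null, feed)
  }
}
```

Because the factory deduplicates by key, several hyperdb instances handed the same factory would transparently share the underlying feeds.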

Second, introducing one level of indirection would allow us to inject object proxies that mirror the hypercore interface. As an example use-case, consider applying permissions to local dbs (i.e. you have N writable dbs being accessed by K local programs, where only P < K programs should be able to perform writes):

Currently one would have to replicate the writable db into a read-only copy before passing it to a restricted program (correct me if I'm wrong, but I believe other approaches could be circumvented by inspecting the hypercore object). With a proxy, direct access could be transparently replaced by RPC calls that contain an auth token, for example.
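The proxy idea could be sketched like this — the feed shape below is a stand-in with a `get`/`append` surface, not the full hypercore API, and `canWrite` stands in for whatever auth check (or RPC with an auth token) the restricted program is subject to:

```javascript
// Sketch: wrap a hypercore-like feed so reads pass through untouched
// while writes are gated by a permission check the caller can't bypass,
// since the restricted program never sees the underlying feed object.
function readOnlyProxy (feed, canWrite) {
  return {
    key: feed.key,
    get (index, cb) { feed.get(index, cb) },
    append (data, cb) {
      if (!canWrite()) return cb(new Error('writes not permitted'))
      feed.append(data, cb)
    }
  }
}
```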

andrewosh commented 6 years ago

It's worth mentioning a third use-case as well: locking writes to a single writable db across processes. In the current version you'll quickly get into trouble attempting this (without one writer per process), but RPCing write requests to a single writer process would solve it.

Edit: One writer per process is likely the way to go in most cases, but I don't want to rule out this pattern being useful in certain local-only scenarios!
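The single-writer pattern can be modelled in-process as a queue that serializes all appends through one owner — across processes, the enqueue call would instead be an RPC to the one process holding the writable feed. `singleWriter` and the callback-style `append` are illustrative, not part of hyperdb:

```javascript
// Sketch: funnel every write through one owner by chaining appends on a
// promise, so concurrent callers never interleave writes to the feed.
function singleWriter (append) {
  let chain = Promise.resolve()
  return function write (data) {
    chain = chain.then(() => new Promise((resolve, reject) => {
      append(data, err => err ? reject(err) : resolve())
    }))
    return chain
  }
}
```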

fsteff commented 6 years ago

This would also allow injecting modified hypercore versions (or wrappers), like my hypercore-encrypted. Currently I use a fork of hyperdb that requires my module instead of the normal hypercore.

The main problem is that this would be a breaking API change...

Frando commented 6 years ago

Interesting. I am currently exploring how to handle multiple writers on a single machine without having to duplicate the database on-disk.

One writer per process is likely the way to go in most cases

Is there already a way to make that work currently? The "local" feed and the corresponding secret key would differ per writer, while I don't want to duplicate the other feeds (for disk-space reasons). I thought of implementing a custom storage function that maps "local" to "peer/[current_writers_key]", but was wondering if there is a cleaner solution (or whether remapping path names in the storage function is considered clean).
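The remapping could look roughly like the sketch below, assuming the storage function receives feed-relative file names such as "local/key" or "local/data" (the exact names hyperdb passes, and the "peer/<writer key>" layout, are assumptions for illustration):

```javascript
// Sketch: rewrite storage names so each writer's "local" feed lands in a
// per-writer directory, while all other feeds keep their shared paths.
function remapLocal (writerKey) {
  return function storageName (name) {
    if (name === 'local' || name.startsWith('local/')) {
      return 'peer/' + writerKey + name.slice('local'.length)
    }
    return name // shared feeds stay where they are, no duplication
  }
}
```

One would then wrap this in the actual storage target, e.g. something like `name => raf(dir + '/' + remap(name))` with random-access-file.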