ipfs / notes

IPFS Collaborative Notebook for Research
MIT License

onename profiles #57

Open jbenet opened 9 years ago

jbenet commented 9 years ago

Let's put all onename profiles on IPFS.

Steps (afaik):

Would be useful to maintain a pre-computed index on all the data and keep a head of it at a well known place (bound to ons/onename, ipns, and dns). the index can be verified against any blockchain, but allows fast access. (i assume https://github.com/blockstack/blockstore already computes such an index, i mean making it accessible directly to any ipfs node too)

jbenet commented 9 years ago

cc @muneeb-ali @jcnelson

@jcnelson do you want me to track replication to https://github.com/jcnelson/syndicate here too? or want to track that separately? (asking first before assuming :) )

muneeb-ali commented 9 years ago

Blockstore already has a bunch of "storage drivers" (what you referred to as "backends"):

https://github.com/blockstack/blockstore-client/tree/master/blockstore_client/drivers

Vanilla Linux, DHT, S3 to be more precise. Although it'd be quick to port some "drivers" we wrote earlier for Syndicate to this. I believe this is where the IPFS driver can go. One concern I have is that since these are currently implemented in blockstore_client (and with good reason) there might be redundant work required down the road if someone comes up with a client in another language (which is already happening). Just something to keep in mind.

Replicating the data part should be pretty straightforward.

The pre-computed index of all the data (human-readable key, hash(data)) currently exists in a DB with blockstore. The merkle hash of this global state is announced in the blockchain with new operations.
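As a rough illustration of the idea above, here is a minimal sketch of an index of (human-readable key, hash(data)) pairs and a Merkle root computed over it. This is not Blockstore's actual construction: the use of SHA-256, the leaf encoding, the pairing rule, and the names `build_index` and `state_root` are all assumptions made for the example.

```python
import hashlib


def sha256(data: bytes) -> bytes:
    return hashlib.sha256(data).digest()


def build_index(records: dict) -> dict:
    """Map each human-readable key to the hex hash of its data."""
    return {name: sha256(data).hex() for name, data in records.items()}


def state_root(index: dict) -> str:
    """Merkle root over (key, hash) pairs, sorted by key for determinism.

    Hypothetical encoding: each leaf is sha256("key:hash"); odd levels
    duplicate their last node before pairing.
    """
    level = [
        sha256(f"{name}:{digest}".encode("utf-8"))
        for name, digest in sorted(index.items())
    ]
    if not level:
        return sha256(b"").hex()
    while len(level) > 1:
        if len(level) % 2 == 1:
            level.append(level[-1])
        level = [
            sha256(level[i] + level[i + 1])
            for i in range(0, len(level), 2)
        ]
    return level[0].hex()
```

Any node holding the same index can recompute the root and compare it with the value announced in the blockchain, which is what makes a published copy of the index verifiable without trusting whoever served it.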

For Syndicate, things are a little different because Syndicate follows the design of using "importers" for pulling in different types of data into Syndicate. @jcnelson can confirm, but my understanding is that for Syndicate instead of going the driver route, Jude will just implement an "importer" in Syndicate itself.

The driver model will hide use of IPFS from blockstore users. If someone wants to mount the blockchain ID namespace and the associated data directly via IPFS, what interface will IPFS provide?

jcnelson commented 9 years ago

There would only need to be a generic Syndicate driver for Blockstore. Syndicate already has Python bindings to make this possible.

Syndicate itself is designed to handle interfacing with back-end storage providers, intermediate CDNs, external datasets, data indexes, and application-defined storage logic (like deduplication, encryption, access logging, replica placement, etc.) on behalf of applications like Blockstore.

@jbenet Not sure what you're asking?

muneeb-ali commented 9 years ago

I think he is asking if it's OK to discuss Syndicate mirroring here vs. on the Syndicate github repo

jcnelson commented 9 years ago

Ah, okay. Let's keep this discussion in one place :)

jbenet commented 9 years ago

Sorry guys, ended up unable to visit and work on this last week. But some outlining of what we need to do:

Something that would help-- could you:

So to make ipfs-backed-blockstore we need to:

@judenelson: i suspect it would look similar for syndicate? o/ maybe drop an equivalent task list here?

Separately, to add onename resolution to /ipns/<name> paths in ipfs, what we'll need to do is:

jcnelson commented 9 years ago

> point to the pieces of code that abstract out storage in blockstack/blockstore? (file ideally)

There are eight methods to implement: get/put/delete for immutable and mutable data, a one-off initialization method, and a method to generate a driver-interpreted URL to mutable data.

Example disk driver: https://github.com/blockstack/blockstore-client/blob/master/blockstore_client/drivers/disk.py.
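To make the shape of that interface concrete, here is a minimal in-memory sketch with the eight operations listed above. The class, the method names, the argument lists, and the `mem://` URL scheme are all hypothetical placeholders; the linked disk driver is the authority on the real names and signatures.

```python
class InMemoryDriver:
    """Toy storage driver. Every name here is an illustrative placeholder."""

    def __init__(self):
        self._immutable = {}  # content hash -> bytes
        self._mutable = {}    # data id -> serialized JSON string
        self._ready = False

    # 1. one-off initialization
    def storage_init(self) -> bool:
        self._ready = True
        return True

    # 2. driver-interpreted URL for a piece of mutable data
    def make_mutable_url(self, data_id: str) -> str:
        return "mem://mutable/" + data_id

    # 3-5. immutable data, keyed by its hash
    def get_immutable(self, data_hash: str):
        return self._immutable.get(data_hash)

    def put_immutable(self, data_hash: str, data: bytes) -> bool:
        self._immutable[data_hash] = data
        return True

    def delete_immutable(self, data_hash: str) -> bool:
        return self._immutable.pop(data_hash, None) is not None

    # 6-8. mutable data, keyed by its id
    def get_mutable(self, data_id: str):
        return self._mutable.get(data_id)

    def put_mutable(self, data_id: str, record_json: str) -> bool:
        self._mutable[data_id] = record_json
        return True

    def delete_mutable(self, data_id: str) -> bool:
        return self._mutable.pop(data_id, None) is not None
```

An IPFS driver would presumably keep the same eight entry points and swap the dictionaries for calls to an IPFS node.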

> describe the relevant data structures briefly? (the types, basically)

Immutable data, mutable data, and routes.

Immutable data is unchanging and content-addressed--the hash for an immutable datum is embedded in a user's profile directly, and the hash of the user's profile is embedded in the blockchain. Immutable data has very high authenticity and integrity guarantees (as strong as the blockchain), but at the cost of having to send a transaction each time the user puts or deletes an immutable data record.
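The hash chain described above (datum hash in the profile, profile hash in the blockchain) can be sketched as a two-step check. SHA-256, the `immutable_hashes` profile field, and the function names are assumptions for illustration only; the real profile layout and hash function may differ.

```python
import hashlib
import json


def content_hash(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()


def verify_immutable(datum: bytes, profile_bytes: bytes,
                     profile_hash_on_chain: str) -> bool:
    """Check datum -> profile -> blockchain, rejecting on the first mismatch."""
    # Step 1: the profile must match the hash recorded in the blockchain.
    if content_hash(profile_bytes) != profile_hash_on_chain:
        return False
    # Step 2: the datum's hash must be listed in that profile.
    # "immutable_hashes" is a hypothetical field name.
    profile = json.loads(profile_bytes.decode("utf-8"))
    return content_hash(datum) in profile.get("immutable_hashes", [])
```

Because every put or delete changes the profile, and therefore the profile hash, each one requires a new transaction, which is the cost mentioned above.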

Mutable data is URL-addressed, and is atomically signed and versioned by the writer. The URLs and writer public key are treated as a specially-crafted piece of immutable data (called a route), but the data the URLs refer to can be written and rewritten at line rate by the writer. Readers check and cache the version for each mutable data record to avoid stale data, and use the public key to verify the data and version's authenticity. While writes to mutable data are much faster, the downside is that a malicious network or storage provider can deny readers fresh data by hiding new writes; we hedge against this by giving the user the choice of storage providers, and replicating to a set of them.
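A reader-side sketch of the version check described above: verify the signature, then refuse any record older than the newest version already seen for that id. The record fields (`id`, `data`, `ver`, `sig`) follow the mutable data schema quoted in this comment; the class, the exception, and `stub_verify` are hypothetical, and `stub_verify` is a placeholder that performs no real cryptography.

```python
class StaleDataError(Exception):
    """Raised when a record is older than one the reader already accepted."""


class MutableReader:
    def __init__(self, verify_sig):
        # verify_sig(pubkey, record) -> bool; supplied by the caller.
        self._verify_sig = verify_sig
        self._last_seen = {}  # data id -> highest version accepted so far

    def accept(self, record: dict, pubkey: str) -> str:
        if not self._verify_sig(pubkey, record):
            raise ValueError("signature check failed")
        last = self._last_seen.get(record["id"], -1)
        if record["ver"] < last:
            raise StaleDataError(
                f"{record['id']}: got version {record['ver']}, already saw {last}"
            )
        self._last_seen[record["id"]] = record["ver"]
        return record["data"]


def stub_verify(pubkey: str, record: dict) -> bool:
    # Placeholder only, NOT a signature scheme. A real reader would verify
    # record["sig"] over the id, data, and version with the writer's key.
    return record["sig"] == "signed-by:" + pubkey
```

Note that this only detects staleness relative to what the reader has already seen. A provider that hides every new write from a fresh reader is not caught, which is why replicating to several providers is suggested above.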

An immutable datum is a binary string. Mutable data and routes are JSON documents that adhere to the following schemas (taken from https://github.com/blockstack/blockstore-client/blob/master/blockstore_client/storage.py):

# mutable storage route
ROUTE_SCHEMA = {
   "id": schemas.STRING,
   "urls": [ schemas.STRING ],
   schemas.OPTIONAL( "pubkey" ): schemas.STRING
}

# mutable data schema
MUTABLE_DATA_SCHEMA = {
   "id": schemas.STRING,
   "data": schemas.B64STRING,
   "ver": schemas.INTEGER,
   "sig": schemas.B64STRING
}
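For readers without the `schemas` module at hand, the two schemas can be approximated with plain-Python checks. This is a sketch under assumptions, not the library's validator: the function names are made up, extra keys are ignored, and whether the real validator rejects them is not stated here.

```python
import base64
import binascii


def is_b64(value) -> bool:
    """True if value is a string of strictly valid base64."""
    if not isinstance(value, str):
        return False
    try:
        base64.b64decode(value, validate=True)
        return True
    except (binascii.Error, ValueError):
        return False


def is_int(value) -> bool:
    # bool is a subclass of int in Python, so exclude it explicitly.
    return isinstance(value, int) and not isinstance(value, bool)


def is_valid_route(doc) -> bool:
    """Approximates ROUTE_SCHEMA: id, urls, and an optional pubkey."""
    if not isinstance(doc, dict):
        return False
    urls = doc.get("urls")
    return (
        isinstance(doc.get("id"), str)
        and isinstance(urls, list)
        and all(isinstance(u, str) for u in urls)
        and ("pubkey" not in doc or isinstance(doc["pubkey"], str))
    )


def is_valid_mutable_data(doc) -> bool:
    """Approximates MUTABLE_DATA_SCHEMA: id, data, ver, sig."""
    if not isinstance(doc, dict):
        return False
    return (
        isinstance(doc.get("id"), str)
        and is_b64(doc.get("data"))
        and is_int(doc.get("ver"))
        and is_b64(doc.get("sig"))
    )
```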

> describe any relevant media that should be accessible as regular posix files? (e.g. images)

Not sure if "should" is the right word. Blockstore's client library and command-line tool already provide a JSON-RPC interface for getting, putting, and deleting mutable and immutable data.

If you wanted to abstract these records as files, my recommendation would be:

> @judenelson: i suspect it would look similar for syndicate? o/ maybe drop an equivalent task list here?

The Syndicate driver would write the serialized JSON records as files under a given directory in the user's Syndicate volume (not too different from how the disk driver works).
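A minimal sketch of that records-as-files layout, using an ordinary local directory as a stand-in for a mounted Syndicate volume. Nothing here calls Syndicate or its Python bindings; the function names and the choice to percent-encode record ids into file names are assumptions for the example.

```python
import json
import os
import tempfile
from urllib.parse import quote


def record_path(root: str, data_id: str) -> str:
    # Percent-encode the id so "/" and similar characters cannot escape root.
    name = quote(data_id, safe="")
    if name in ("", ".", ".."):
        raise ValueError("unusable record id: %r" % data_id)
    return os.path.join(root, name)


def put_record(root: str, record: dict) -> str:
    """Serialize a JSON record to one file, replacing any previous version."""
    os.makedirs(root, exist_ok=True)
    path = record_path(root, record["id"])
    tmp = path + ".tmp"
    with open(tmp, "w", encoding="utf-8") as f:
        json.dump(record, f, sort_keys=True)
    os.replace(tmp, path)  # atomic rename on the same filesystem
    return path


def get_record(root: str, data_id: str):
    try:
        with open(record_path(root, data_id), encoding="utf-8") as f:
            return json.load(f)
    except FileNotFoundError:
        return None


def delete_record(root: str, data_id: str) -> bool:
    try:
        os.remove(record_path(root, data_id))
        return True
    except FileNotFoundError:
        return False


def demo():
    """Round-trip one record through a temporary directory."""
    with tempfile.TemporaryDirectory() as volume_dir:
        rec = {"id": "alice/notes", "data": "aGVsbG8=", "ver": 1, "sig": "c2ln"}
        put_record(volume_dir, rec)
        loaded = get_record(volume_dir, "alice/notes")
        deleted = delete_record(volume_dir, "alice/notes")
        return loaded == rec, deleted, get_record(volume_dir, "alice/notes")
```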