mediachain / concat

Mediachain daemons
MIT License

Extending the directory for publisher/dataset discovery. #71

Closed vyzo closed 7 years ago

vyzo commented 7 years ago

The initial implementation of the directory provides minimal facilities for peer discovery: a plain list of peers, which can then be contacted and queried for the contents of their datastore.

This is fine for a baseline implementation, and it can be argued that any further functionality is unnecessary: all relevant info can be extracted by querying peers directly, so listPeers is a sufficient interface. This may be a little inconvenient for mcclient interaction on the command line, but aleph is moving towards a REPL/Electron app where such functionality can be easily implemented.

On the other hand, it is desirable to have simplified publisher discovery through the directory for curation and UX purposes, and this need will become even more pronounced with identity provider integration (#14).

So what kind of lookup facilities do we want to support in the directory protocol?

A suggestion for the basics: let's start with the information published by a node registering with the directory. This can be the node info, together with a statement db manifest that includes namespaces, and perhaps statement counts and aggregated publishers/sources. With ID integration, we can also include operator info and signatures that bind the node to a real-world entity.

This information can be accessed with a new directory method:

We may also want to scope the method with an additional parameter; cf. the Universe, Contrib, and Glam directories.

denisnazarov commented 7 years ago

Some thoughts:

Making it not matter what machine statements are on. Mediachain will feel more like a "universal library" if we remove the importance of what machine the data is on from the query interface. That way we can just ask for "statements by moma" and get a response of who has them, instead of needing to know the node upfront. Additionally, this is a stronger sell to participants because once they publish their data, and as long as someone is replicating it, they are "in Mediachain". They don't have to explicitly keep their machine running anymore.

@vyzo's thoughts on implementation from Slack:

The simple way is to do a query on your db for publishers/sources and then publish a manifest that says what sources/publishers you are aggregating. Then it can be a matter of providing an API to search for a publisher based on this info.
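As a rough illustration of that idea, here is a minimal Go sketch. The `Statement` and `Manifest` types and the `BuildManifest` and `FindPublisher` functions are hypothetical names made up for this example, not part of concat; the sketch only assumes that each statement carries a publisher and a source.

```go
package main

import "sort"

// Statement is a hypothetical, pared-down view of a stored statement.
type Statement struct {
	Publisher string
	Source    string
}

// Manifest is a hypothetical summary a node could publish to the directory.
type Manifest struct {
	Peer       string
	Publishers []string // sorted, distinct
	Sources    []string // sorted, distinct
}

// sortedKeys returns the members of a string set in sorted order.
func sortedKeys(set map[string]struct{}) []string {
	keys := make([]string, 0, len(set))
	for k := range set {
		keys = append(keys, k)
	}
	sort.Strings(keys)
	return keys
}

// BuildManifest aggregates the distinct publishers and sources found in stmts.
func BuildManifest(peer string, stmts []Statement) Manifest {
	pubs := make(map[string]struct{})
	srcs := make(map[string]struct{})
	for _, s := range stmts {
		pubs[s.Publisher] = struct{}{}
		srcs[s.Source] = struct{}{}
	}
	return Manifest{Peer: peer, Publishers: sortedKeys(pubs), Sources: sortedKeys(srcs)}
}

// FindPublisher returns the peers whose manifest lists the given publisher.
func FindPublisher(manifests []Manifest, publisher string) []string {
	var peers []string
	for _, m := range manifests {
		// Publishers is sorted, so a binary search is enough.
		i := sort.SearchStrings(m.Publishers, publisher)
		if i < len(m.Publishers) && m.Publishers[i] == publisher {
			peers = append(peers, m.Peer)
		}
	}
	return peers
}
```

With manifests collected from registered nodes, a query such as `FindPublisher(manifests, "moma")` would answer "who has statements by moma" without the caller knowing any node upfront.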

Blockstack as decentralized identity/node info/manifest store. Blockstack seems to be the perfect solution for decentralizing the mapping of human-readable names to node/publisher IDs, as well as manifest files and other metadata. This way anyone can run a directory instance using Blockstack as the source of truth/index.

From the product side, a bitcoin-secured identity system is a really compelling sell for participants. Registering a bitcoin-based identity is a very "secure" operation that makes participation in Mediachain seem like a big deal. It is essentially the act of signing up for and joining the universal media library.

Blockstack lets us tell a compelling story as well as decentralize the directory index.

parkan commented 7 years ago

It's important to keep in mind that so far we're basically working with "universe" and haven't really thought about any reputation or veracity aspects. I wonder if now is a good time to consider those, since it'll probably inform some design aspects at this level.

vyzo commented 7 years ago

#101 implements a baseline which allows peer listing within a namespace, and also retrieval of all namespaces.

I think this baseline is sufficient for universe, where we don't try to apply any reputation filtering or connect to an external identity.
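To make the shape of that baseline concrete, here is a small in-memory sketch in Go. It is not the code from the pull request: the `Directory` type and its `Register`, `ListPeers`, and `ListNamespaces` methods are hypothetical names chosen for illustration, and the sketch assumes a node declares its namespaces when it registers.

```go
package main

import (
	"sort"
	"sync"
)

// Directory is a hypothetical in-memory index from namespace to registered peers.
type Directory struct {
	mu    sync.RWMutex
	peers map[string]map[string]struct{} // namespace -> set of peer IDs
}

// NewDirectory returns an empty directory.
func NewDirectory() *Directory {
	return &Directory{peers: make(map[string]map[string]struct{})}
}

// Register records that peer publishes into each of the given namespaces.
func (d *Directory) Register(peer string, namespaces ...string) {
	d.mu.Lock()
	defer d.mu.Unlock()
	for _, ns := range namespaces {
		if d.peers[ns] == nil {
			d.peers[ns] = make(map[string]struct{})
		}
		d.peers[ns][peer] = struct{}{}
	}
}

// ListPeers returns the sorted peer IDs registered under namespace ns.
func (d *Directory) ListPeers(ns string) []string {
	d.mu.RLock()
	defer d.mu.RUnlock()
	out := make([]string, 0, len(d.peers[ns]))
	for p := range d.peers[ns] {
		out = append(out, p)
	}
	sort.Strings(out)
	return out
}

// ListNamespaces returns every namespace with at least one registered peer, sorted.
func (d *Directory) ListNamespaces() []string {
	d.mu.RLock()
	defer d.mu.RUnlock()
	out := make([]string, 0, len(d.peers))
	for ns := range d.peers {
		out = append(out, ns)
	}
	sort.Strings(out)
	return out
}
```

Note that nothing here filters by reputation or identity: any peer that registers is listed, which matches the "universe" use case described above.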

Perhaps we can also add publisher listing/lookup, as this could be useful to discover the source of some statement (if it is public).

For further functionality, we need to research Blockstack integration and also extended node info [#99].

vyzo commented 7 years ago

So for next steps, now that we have manifests with identity provider integration, we can steer publisher discovery towards entity ids. The straightforward way to implement useful functionality in mcdir is to support manifest lookup by entity id.
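A minimal sketch of what such a lookup could look like, in Go. Everything here is an assumption made for illustration: the `EntityManifest` fields, the `ManifestIndex` type, and the entity id format are hypothetical and do not reflect the actual mcdir manifest schema or protocol.

```go
package main

// EntityManifest is a hypothetical manifest binding a node to an entity id.
type EntityManifest struct {
	Entity     string   // entity id as issued by the identity provider (format assumed)
	Peer       string   // peer ID of the node publishing this manifest
	Namespaces []string // namespaces the node declares
}

// ManifestIndex is a hypothetical directory-side index keyed by entity id.
type ManifestIndex struct {
	byEntity map[string][]EntityManifest
}

// NewManifestIndex returns an empty index.
func NewManifestIndex() *ManifestIndex {
	return &ManifestIndex{byEntity: make(map[string][]EntityManifest)}
}

// Add stores a manifest under its entity id. One entity may operate several nodes.
func (ix *ManifestIndex) Add(m EntityManifest) {
	ix.byEntity[m.Entity] = append(ix.byEntity[m.Entity], m)
}

// Lookup returns a copy of the manifests registered for the given entity id.
func (ix *ManifestIndex) Lookup(entity string) []EntityManifest {
	return append([]EntityManifest(nil), ix.byEntity[entity]...)
}
```

A real implementation would also need to verify the signature that binds each manifest to its entity before indexing it; that step is omitted here.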