dat-ecosystem / comm-comm

Community Communication
https://comm-comm.datproject.org/
MIT License

a DEP proposal for a standard about `related feeds` #134

Closed serapath closed 3 years ago

serapath commented 4 years ago

Deadline: no deadline
Link: https://github.com/playproject-io/datdot-research/issues/17#issuecomment-602563335
Call for Action: please read through and give some feedback/suggestions/questions/...

dan-mi-sun commented 4 years ago

count cobox in for a discussion about standards. we're keen to have cobox and datdotorg be as easily interoperable as possible :-)

okdistribute commented 4 years ago

Count mapeo also in, although I think we wouldn't adopt a standard that includes a 'manifest feed' -- I would propose scoping back the proposal to make minimal changes to the existing codebases.

Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.
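A minimal sketch of what such an optional manifest payload could look like (the JSON encoding and function names here are purely illustrative, not multifeed's actual wire format):

```javascript
// Hypothetical encoding for an optional 'manifest' payload: a JSON message
// listing the hex discovery keys a peer offers to replicate.
function encodeManifest (discoveryKeys) {
  return Buffer.from(JSON.stringify({ keys: discoveryKeys }))
}

function decodeManifest (buf) {
  const msg = JSON.parse(buf.toString())
  if (!Array.isArray(msg.keys)) throw new Error('invalid manifest payload')
  return msg.keys
}

// Round-trip example with two fake 32-byte hex discovery keys:
const keys = ['a'.repeat(64), 'b'.repeat(64)]
const decoded = decodeManifest(encodeManifest(keys))
```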

Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event. cc @andrewosh

cblgh commented 4 years ago

fwiw i think whatever ends up working for mapeo will probably also be able to work for cabal, but maybe @noffle can correct me on that (having experience w/ both) when she has spoons to do so

also kudos on starting the discussion this way, @serapath! i think this is a great approach

serapath commented 4 years ago

thx everyone for your support :-) I was a bit anxious putting it together and it's still a bit messy.

thx @cblgh that's nice to hear - i'm really trying, but it's pretty tough for me to wrap my head around all the issues with the different approaches :-)


thx @dan-mi-sun that's lovely to hear :-) and thx @okdistribute for your support :-)

> Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.

Yes, I know. I checked the source code - but ...

> Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event.

mafintosh said he would not support an extension message based approach, because it lacks the "trust guarantees" of a feed based approach.

I understand some of the issues with a manifest feed (according to mafintosh it is a performance hit), but on the other side it could be an additional feature. I was talking to @mafintosh and he actually recommended using the Custom Header that DEP-0007 specifies, but also using a manifest feed, because that is how you get guarantees about the related feeds; plain extension messages lack the trust and history that feeds with merkle trees provide.


@martinheidegger commented on the linked issue above, but i will respond here:

I was talking to mafintosh and am aware of the performance issue, which is why I was thinking about the manifest feed as something additional.

The first message in a hypercore is already supposed to include information which identifies the data structure type according to DEP-0007 - so this is an existing standard, and ideally data structures and/or protocols should already support it.
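
DEP-0007 specifies that entry 0 is a protobuf header whose field 1 is the data structure type name (wire type 2, so tag byte `0x0a`). A minimal hand-rolled sketch of reading and writing just that one field (assuming the type name is shorter than 128 bytes, so the varint length fits in one byte; a real implementation would use a protobuf library):

```javascript
// Encode a DEP-0007-style header containing only field 1 (dataStructureType):
// tag byte 0x0a (field 1, wire type 2), one-byte length, then the UTF-8 name.
function encodeHeaderType (type) {
  const name = Buffer.from(type, 'utf8')
  return Buffer.concat([Buffer.from([0x0a, name.length]), name])
}

// Decode the type name back out of such a chunk 0.
function decodeHeaderType (chunk0) {
  if (chunk0[0] !== 0x0a) throw new Error('not a DEP-0007-style header')
  const len = chunk0[1]
  return chunk0.slice(2, 2 + len).toString('utf8')
}

const type = decodeHeaderType(encodeHeaderType('hyperdrive'))
```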

The good part of the manifest feed is that it gives you better guarantees, while raw extension messages can't be trusted at all to my knowledge. The manifest feed is the approach recommended by mafintosh too.

serapath commented 4 years ago

edit: related details are written down in a section of a comment here: https://github.com/playproject-io/datdot-research/issues/17#proposal-first-draft

What about listing all the approaches and then specifying multiple standards?

e.g. we define some kind of DEP-0011 and specify multiple approaches, like:

  1. to get feed (data structure) type use DEP-0007 approach
  2. use the manifest feed approach to get all related feeds
  3. or use the manifest extension message approach to get all related feeds

Then any data structure can choose one of the different approaches (we can add more to the list above if needed), and for each approach we can define and list the reasons and the features, or pros and cons. So a service that needs to know the (data structure) type and/or the related feeds (e.g. a "generic hosting service" like datdot) can try all the approaches we specify, and hopefully one of them will work :-) Additionally, it's possible to support the specialised approaches of each individual type that a service encounters for improved performance, which is probably needed for hyperdrive anyway.
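
The try-each-approach idea could be sketched like this (a hypothetical resolver, not an existing API; the approach functions are placeholders for real implementations such as a manifest feed lookup or an extension message exchange):

```javascript
// Hypothetical resolver: try each specified approach in order and return
// the first set of related feeds obtained.
async function resolveRelatedFeeds (feed, approaches) {
  for (const approach of approaches) {
    try {
      const related = await approach(feed)
      if (related) return related
    } catch (err) {
      // approach unsupported by this feed/peer; fall through to the next one
    }
  }
  return null // no approach worked; treat the feed as standalone
}

// Usage with stub approaches standing in for real implementations:
const viaManifestFeed = async () => null                   // not supported here
const viaExtensionMessage = async () => ['feedA', 'feedB'] // responds

resolveRelatedFeeds(null, [viaManifestFeed, viaExtensionMessage])
  .then(related => console.log(related)) // logs ['feedA', 'feedB']
```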

okdistribute commented 4 years ago

that would work!

serapath commented 4 years ago

edit: adding on top of above comment https://github.com/datproject/comm-comm/issues/134#issuecomment-604807991

What if multiple approaches are supported but give conflicting answers regarding which feeds are related? I'd like to avoid that.

otherwise...

The manifest feed approach would need to know message 0 of the main feed and then decide:

  1. if the message tells us it's a hyperdrive, we have to deal with it anyway
  2. if it tells us there's a manifest feed header, we roll with that
  3. if it doesn't have a manifest feed header, we proceed with the manifest extension message
  4. ... @okdistribute you mentioned on the chat that maybe we could use a different multifeed-like structure, but you didn't go into detail about what you meant by that.

All in all, the feature and the order in which the mechanisms take precedence are meant to be used in cases where a peer doesn't know the data structure and wants to get its related feeds (e.g. for pinning and such), while keeping the other existing mechanisms that data structures use for performance. That means it could also be an opt-in feature for loading after you get the initial data through e.g. an extension, one that only certain peers will bother invoking.

That order could be included in the specification, so conflicts would not happen: the process stops before an alternative approach could even report something conflicting. As it stands, multifeed would always trigger (3.), hyperdrive would always trigger (1.), and other corestore-related data structures that will exist in the future will hopefully trigger (2.), or, if they choose to do so, (3.) :-)
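
As a sketch, that precedence rule could look like this (the type names are made up for illustration; they are not defined by any DEP):

```javascript
// Sketch of the precedence order: given the (decoded) type announced by
// message 0 of the main feed, pick exactly one related-feeds mechanism,
// so two mechanisms can never give conflicting answers.
function pickMechanism (chunk0Type) {
  if (chunk0Type === 'hyperdrive') return 'type-specific'      // case 1
  if (chunk0Type === 'manifest-feed') return 'manifest-feed'   // case 2
  return 'manifest-extension-message'                          // case 3
}
```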

martinheidegger commented 4 years ago

@serapath Do you still wish this to be open? Should we announce this further?

serapath commented 4 years ago

I can take the gist of what was said here and move it to a different place and reference this issue and then it can be closed. When we have more progress, I can open an issue again :-)

Do you think that would be better?

okdistribute commented 4 years ago

How about the datprotocol/deps repository?

martinheidegger commented 4 years ago

@serapath let's keep the announcement here just for times when you need feedback that we can collect at comm-comms. As I understand it, at this point other people's input can not really help this issue, right? Then let's close this issue and open it again when a clear question to the community arises.

cblgh commented 4 years ago

small ping that cabal is very interested in this kinda thing atm :3 especially with regard to enabling synchronization of encrypted hypercores using a single identifier

i.e. each log from a cabal is encrypted, and the set of logs for a particular cabal is discoverable via one identifier, just the long cabal key (cf. ciphercore's blindKey)

serapath commented 4 years ago

we are very, very busy trying to finally get the first MVP ready and finish our first milestone, but all of this is still very high priority. Our first milestone will support only hypercore itself, but as soon as we start our second and then third milestone (which are actually long overdue), we urgently need to work on making more complex data structures work, ones which use more than one hypercore, so there is a lot of time reserved to get back to this very issue :-)

serapath commented 4 years ago

@cblgh If you or cabal also maintain an issue on this, that would be great. Let's continue talking and pushing the details forward. Maybe we can do this immediately in parallel :-)

okdistribute commented 4 years ago

yes I think cobox might also be interested in this

serapath commented 4 years ago

@dan-mi-sun do you have an issue in cobox that tracks this proposal? It would be cool to link those kinds of issues there, so there is always a place to come and visit, where projects can summarise their thoughts about this proposal and express what they agree with or what they would like to see changed :-)

That would make it easier to address things and be sure people or orgs and their concerns or ideas or rather wishes are not forgotten or lost :-)

@cblgh ...so same for cabal, would be good if you made a cabal/kappa/... issue somewhere and maybe link this one here? I for sure will read and follow it :D

serapath commented 4 years ago

regarding url schemes and protocol handlers, i added some thoughts here:

serapath commented 4 years ago

> How about the datprotocol/deps repository?

@okdistribute i like the proposal you made 18 days ago and already prepared myself for that, but I guess it is no longer valid, for reasons that are beyond my understanding. Lack of maintainers doesn't seem to be the reason, because I at least offered myself, and taking care of just interoperability standards and nothing else seems manageable.

serapath commented 4 years ago

edit: adding on top of above comment https://github.com/datproject/comm-comm/issues/134#issuecomment-604808606

currently the "related feeds proposal" tells you a parent -> children relation, but it won't tell you the relation in reverse. There might be use cases where relating in the opposite direction is important, like:

  1. in case you lost your private key or accidentally corrupted your old feed and want to revoke
  2. or if a feed has been accidentally corrupted

Maybe there are many ways to solve these use cases and those proposals should be separate, but I will list below why they actually might be ...related :-)

The related feeds proposal consists of chunk0 specifying:

a certificate feed key, which could work like this: a standard to write into the first chunk of any feed some information about a "(self signed) certificate feed" (for announcements of revoked feed keys and/or replacement feeds and their keys)

  1. A client listening to a feed which specifies a certificate feed key will subscribe to that feed and/or its parent certificate feed key, etc., if they exist.
  2. Whenever a revoke-and-replace message is received from a certificate feed about a feed that authorized that certificate feed in its chunk0, that client will from then on listen to that new feed instead, and expect the new feed's merkle tree to be identical from chunk 0 up to a particular length, from which point things are supposed to be a seamless "continuation" of the previous hypercore and continue as normal.
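
A sketch of step 2 as a pure state transition (message shape, field names, and the state object are all hypothetical; authorization checking and merkle verification are reduced to placeholders):

```javascript
// Hypothetical handler for a revoke-and-replace message: switch a client's
// subscription to the replacement feed only if the certificate feed was
// authorized in the revoked feed's chunk 0. A real client would also verify
// that the replacement feed's merkle tree matches the old one up to the
// stated continuation length.
function applyRevokeAndReplace (state, msg) {
  const { revokedKey, replacementKey, certFeedKey } = msg
  const authorized = state.authorizedCertFeeds[revokedKey] === certFeedKey
  if (!authorized) return state // ignore messages from unauthorized cert feeds
  return {
    ...state,
    subscriptions: state.subscriptions.map(
      k => (k === revokedKey ? replacementKey : k)
    )
  }
}

// Example client state with one feed that authorized a certificate feed:
const state = {
  authorizedCertFeeds: { oldFeed: 'certFeed' },
  subscriptions: ['oldFeed', 'otherFeed']
}
const next = applyRevokeAndReplace(state, {
  certFeedKey: 'certFeed',
  revokedKey: 'oldFeed',
  replacementKey: 'newFeed'
})
```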
martinheidegger commented 4 years ago

Just to summarize yesterday's off-thread conversation: I don't think that you will be able to make a "certificate" feed that helps against the loss of private keys. Any additional feed that you post doubles the risk (it doesn't halve it). Our conclusion was: you need to keep it safe, period.

In some sense dat-dns, or better, HyperDomains (a system as outlined by @pfrazee here) would solve the issue of updating an existing reference.

As for the issue of feed corruption: I did write down my thoughts on this topic once: https://gist.github.com/martinheidegger/82dbf775e3ff071d897819d7550cb3d7 - I think it might be a reasonable solution for maintaining an existing dat. But this solution is an edge case that is hard to implement and test for, and generally speaking it is understandable why we would rather focus on making that case less and less likely (it was very common with dat 1.0; now with corestore it should be significantly reduced).

While the questions of feed identity and feed corruption are interesting, they seem to be distracting from this issue about common related feeds. Don't you think?

Note: i mistook the gist link (they are unfortunately named similarly) and updated it to the reflections on core healing

martinheidegger commented 4 years ago

We had a discussion on this during the last dat conference: https://youtu.be/hzIU5X7g7PI

The content in this issue is quite long: @serapath would you be okay with closing this issue and maybe open one (or more) issues that summarize the current state?

serapath commented 3 years ago

Yes, I will open one or more issues and summarize the current state.

I'm quite busy right now so this will take a bit more time, but my perception is also that nobody needs the solution right away or is urgently waiting for it.

If anyone is reading this comment and needs it super urgent and wants to discuss things sooner, let me know - in that case I can see if I can do it sooner.

martinheidegger commented 3 years ago

Closing the issue for now, looking forward to updates.