count cobox in for a discussion about standards. we're keen to have cobox and datdotorg be as easily interoperable as possible :-)
Count mapeo also in, although I think we wouldn't adopt a standard that includes a 'manifest feed' -- I would propose scoping back the proposal to make minimal changes to the existing codebases.
Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.
Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event. cc @andrewosh
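Since this is roughly what multifeed already does during its handshake, here is a minimal sketch of the shape such an optional manifest exchange could take. The `Manifest` and `ReplicationStream` types, the field names and the `registerExtension`-style call are assumptions for illustration, not multifeed's or corestore's actual wire format or API:

```ts
// Hypothetical sketch: an optional "manifest" message exchanged around the
// replication handshake, listing every discovery key a peer is willing to
// replicate. Shapes and names below are made up for illustration.
interface Manifest {
  keys: string[] // hex-encoded discovery keys
}

// assumed: a replication stream that supports named extensions,
// similar in spirit to hypercore-protocol extensions
interface ReplicationStream {
  registerExtension(
    name: string,
    handlers: { encoding: 'json'; onmessage(msg: Manifest): void }
  ): { send(msg: Manifest): void }
}

function shareManifest(
  stream: ReplicationStream,
  localKeys: string[],
  onRemote: (keys: string[]) => void
): void {
  const ext = stream.registerExtension('manifest', {
    encoding: 'json',
    onmessage(msg) {
      // the remote told us which discovery keys it has;
      // the caller can then replicate the intersection
      onRemote(msg.keys)
    }
  })
  // announce our own discovery keys right after the handshake
  ext.send({ keys: localKeys })
}
```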
fwiw i think whatever ends up working for mapeo will probably also be able to work for cabal, but maybe @noffle can correct me on that (having experience w/ both) when she has spoons to do so
also kudos on starting the discussion this way, @serapath! i think this is a great approach
thx everyone for your support :-) I was a bit anxious putting it together and it's still a bit messy.
thx @cblgh that's nice to hear - i'm really trying, but it's pretty tough for me to wrap my head around all the issues with the different approaches :-)
thx @dan-mi-sun that's lovely to hear :-) and thx @okdistribute for your support :-)
> Adding an optional 'manifest' to the handshake that includes all discovery keys to be replicated is what we do in multifeed and what we could commit to with mapeo.
Yes, I know. I checked the source code - but ...
> Could this be adapted for corestore as a way to request all the discovery keys present in it? Right now, corestore only does 1 at a time using a discovery-key event.
mafintosh said he would not support an extension message based approach, because it lacks the "trust guarantees" of a feed based approach.
I understand some issues with a manifest feed, which according to mafintosh is a performance hit, but on the other side it could be an additional feature. I was talking to @mafintosh and he actually recommended the approach of using the Custom Header that DEP-0007 specifies, but also to use a manifest feed, because that's how you get guarantees about the related feeds, while using just extension messages lacks the trust and the history that feeds with merkle trees provide.
@martinheidegger commented on the linked issue above, but i will respond here:
I was talking to mafintosh and am aware of the performance issue, which is why I was thinking about the manifest feed as something additional.
The first message in a hypercore is already supposed to include information which identifies the data structure type according to DEP-0007 - so this is already an existing standard and ideally data structures and/or protocols should already support it.
The good part of the manifest feed is that it gives you better guarantees, while raw extension messages can't be trusted at all to my knowledge. The manifest feed is also the approach recommended by mafintosh.
edit: related details are written down in a section of a comment here: https://github.com/playproject-io/datdot-research/issues/17#proposal-first-draft
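To make the combination concrete, here is a rough sketch of what a chunk0 header plus a manifest feed pointer could look like. This is not DEP-0007's actual encoding (the DEP specifies a protobuf header for the first entry); the JSON shape and field names below are assumptions to show the idea:

```ts
// Illustrative only: field names and JSON encoding are assumptions,
// not the DEP-0007 wire format.
interface Chunk0Header {
  type: string              // identifies the data structure, e.g. 'hyperdrive'
  manifestFeedKey?: string  // proposal: public key of a manifest feed whose
                            // merkle tree vouches for the set of related feeds
}

function encodeChunk0(header: Chunk0Header): Buffer {
  return Buffer.from(JSON.stringify(header))
}

function decodeChunk0(chunk0: Buffer): Chunk0Header | null {
  try {
    return JSON.parse(chunk0.toString()) as Chunk0Header
  } catch {
    return null // chunk0 is not a recognisable header
  }
}
```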
What about listing all the approaches and then specifying multiple standards?
e.g. we define some kind of DEP-0011 and specify multiple approaches, like:

1. DEP-0007 approach
2. manifest feed approach to get all related feeds
3. manifest extension message approach to get all related feeds

Then any data structure can choose one of the different approaches (we can add more to the list above if needed) and for each approach we can define and list the reasons and the features or pro's and con's. So a service (e.g. a "generic hosting service" like datdot) that needs to know the (data structure) type and/or the related feeds can try all the approaches we specify and hopefully one of them will work :-) Additionally it's possible to support the specialised approaches of each individual type that a service encounters for improved performance, which is what is probably needed for hyperdrive anyway.
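As a hedged sketch of how such a "generic hosting service" could consume a multi-approach spec: it just tries each approach in turn until one answers. The probe functions named in the comment are placeholders, not existing APIs:

```ts
// Each probe implements one of the approaches listed above and returns the
// related feed keys it found, or null if that approach doesn't apply.
type Probe = (feedKey: string) => Promise<string[] | null>

async function relatedFeeds(feedKey: string, probes: Probe[]): Promise<string[]> {
  for (const probe of probes) {
    const related = await probe(feedKey)
    if (related !== null) return related // first approach that answers wins
  }
  return [] // none of the specified approaches applied
}

// e.g. relatedFeeds(key, [viaDep0007Header, viaManifestFeed, viaManifestExtension])
// with the three placeholder probes corresponding to approaches 1.-3. above
```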
that would work!
edit: adding on top of above comment https://github.com/datproject/comm-comm/issues/134#issuecomment-604807991
What if multiple approaches are supported but give conflicting answers regarding which feeds are related? I'd like to avoid that.
otherwise...
The manifest feed approach would need to know message 0 of the main feed and then decide:

1. hyperdrive: we anyway have to deal with it
2. manifest feed header: we roll with that
3. no manifest feed header: we proceed with the manifest extension message
All in all - the feature and the order in which the mechanisms take precedence are meant to be used in cases where a peer doesn't know the data structure and wants to get its related feeds (e.g. for pinning and stuff...), while keeping the other existing mechanisms that data structures use for performance. That means it could also be an opt-in feature for loading after you get the initial data through e.g. an extension - one that only certain peers will bother invoking?
That order could be included in the specification, so conflicts would not happen, because the process stops before an alternative approach could even produce something conflicting. As it stands, multifeed would always trigger (3.), hyperdrive would always trigger (1.), and other corestore-related data structures that will exist in the future will hopefully trigger (2.) - or, if they choose to do so, (3.) :-)
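As a sketch of that precedence (reusing the assumed `Chunk0Header` shape from the earlier snippet), the point is that the decision stops at the first matching case, so two mechanisms never get the chance to give conflicting answers:

```ts
type Mechanism = 'hyperdrive' | 'manifest-feed' | 'manifest-extension-message'

function pickMechanism(chunk0: { type: string; manifestFeedKey?: string }): Mechanism {
  if (chunk0.type === 'hyperdrive') {
    return 'hyperdrive'                 // (1.) hyperdrive handles its own related feeds
  }
  if (chunk0.manifestFeedKey) {
    return 'manifest-feed'              // (2.) a manifest feed header is present
  }
  return 'manifest-extension-message'   // (3.) fall back to the extension message
}
```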
@serapath Do you still wish this to be open? Should we announce this further?
I can take the gist of what was said here and move it to a different place and reference this issue and then it can be closed. When we have more progress, I can open an issue again :-)
Do you think that would be better?
How about the datprotocol/deps repository?
@serapath let's keep the announcement here just for times when you need feedback that we can collect at comm-comms. As I understand it, at this point other people's input can not really help this issue, right? Then let's close this issue and open it again when a clear question to the community arises.
small ping that cabal is very interested in this kinda thing atm :3 especially with regard to enabling synchronizing of encrypted hypercores using a single identifier
i.e. each log from a cabal is encrypted, and the set of logs for a particular cabal is discoverable over the identifier, just a long cabal key c.f. ciphercore's blindKey
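For what it's worth, a very rough sketch of that split between "can discover" and "can read" (this is not ciphercore's actual blindKey construction - the derivation labels are made up for illustration):

```ts
import { createHash } from 'crypto'

// From one cabal key, derive (a) a public identifier peers can swarm on
// without being able to read anything, and (b) a content key used to
// encrypt/decrypt the individual logs. Both derivations are illustrative.
function deriveDiscoveryId(cabalKey: Buffer): Buffer {
  return createHash('sha256').update('cabal-discovery').update(cabalKey).digest()
}

function deriveContentKey(cabalKey: Buffer): Buffer {
  return createHash('sha256').update('cabal-content').update(cabalKey).digest()
}
```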
we are very very busy trying to finally get the first MVP ready to finish our first milestone, but all of this is still very high priority. Our first milestone will support only hypercore itself, but as soon as we finally start our second and then third milestone, which are actually long overdue, we urgently need to work on making more complex data structures work which use more than one hypercore, so there is a lot of time reserved to get back to this very issue :-)
@cblgh If you or cabal also maintains an issue, that would be great. Let's continue talking and pushing the details forward. Maybe we can do this immediately in parallel :-)
yes I think cobox might also be interested in this
@dan-mi-sun do you have an issue in cobox that tracks this proposal? Would be cool to link those kinds of issues there, so there is always a place to come and visit and where projects can summarise their thoughts about this proposal and express what they agree with or what they would like to see changed :-)
That would make it easier to address things and be sure people or orgs and their concerns or ideas or rather wishes are not forgotten or lost :-)
@cblgh ...so same for cabal, would be good if you made a cabal/kappa/... issue somewhere and maybe link this one here? I for sure will read and follow it :D
regarding url schemes and protocol handlers, i added some thoughts here:
@okdistribute i like the proposal you made 18 days ago and prepared myself already for that, but I guess that is no longer valid for reasons that are beyond my understanding. Lack of maintainers doesn't seem to be the reason, because I at least offered to do it myself, and just taking care of interoperability standards and nothing else seems manageable.
edit: adding on top of above comment https://github.com/datproject/comm-comm/issues/134#issuecomment-604808606
currently "related feeds proposal" tells you a parent -> children
relation, but it won't tell you the direction in reverse. There might be use cases where that is important to relate in the opposite direction, like:
Maybe there are many ways to solve use cases and those proposals should be seperate, but I will list below why they actually might be ...related :-)
the related feeds proposal consists of chunk0 specifying:

- a manifest feed (or hyperdrive, or kappa extension messages) => ("parent pointing to children")
- a certificate feed => ("parent pointing to grandparent" - if one exists)

The certificate feed key could work like this: a standard to write into the first chunk of any feed some information about a "(self signed) certificate feed" (for announcements of revoked feed keys and/or replacement feeds and their keys).

- clients that know a certificate feed key will subscribe to that feed and/or its parent certificate feed key, etc... if they exist.
- if a revoke-and-replace message is received from a certificate feed about a feed that authorized that certificate feed in its chunk0, that client will from then on listen to that new feed instead, and expect the new feed's merkle tree to be identical from chunk 0 up to a particular length, from which point things are supposed to be a seamless "continuation" of the previous hypercore, and continue as normal
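Just to illustrate what I mean, a hypothetical message shape and client-side check (none of these names are existing hypercore APIs):

```ts
interface RevokeAndReplace {
  revokedFeedKey: string      // feed that is being retired
  replacementFeedKey: string  // feed to listen to from now on
  continuationLength: number  // both feeds must agree on entries [0, continuationLength)
}

// assumed helper: returns a digest of the merkle tree covering the first
// `length` entries of a feed (stands in for hypercore's tree hashes)
type TreeHash = (feedKey: string, length: number) => Promise<string>

// only switch over if the replacement really is a seamless continuation,
// i.e. its history is identical up to continuationLength
async function acceptReplacement(msg: RevokeAndReplace, treeHash: TreeHash): Promise<boolean> {
  const oldRoot = await treeHash(msg.revokedFeedKey, msg.continuationLength)
  const newRoot = await treeHash(msg.replacementFeedKey, msg.continuationLength)
  return oldRoot === newRoot
}
```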
Just to summarize yesterday's off-thread conversation: I don't think that you will be able to make a "certificate" feed that helps against the loss of private keys. Any additional feed that you post doubles the risk (it doesn't halve it) - our conclusion was: you need to keep it safe, period.
In some sense dat-dns or better: HyperDomains (system as outlined by @pfrazee here) would solve the issue of updating an existing reference.
As far as the issue about feed corruption goes: I did write down my thoughts on this topic once: https://gist.github.com/martinheidegger/82dbf775e3ff071d897819d7550cb3d7 - I think it might be a reasonable solution for maintaining an existing dat. But this solution is an edge case that is hard to implement and test for, and generally speaking it is understandable why we would rather focus on making sure that the case becomes less and less likely (the case was very common with dat 1.0; now with core-store it should be significantly reduced).
While the questions of feed identity and feed corruption are interesting, they seem to be distracting from this issue about common related feeds. Don't you think?
Note: i mistook the gist link (they are unfortunately named similarly) and updated it to the reflections on core healing
We had a discussion on this during the last dat conference: https://youtu.be/hzIU5X7g7PI
The content in this issue is quite long: @serapath would you be okay with closing this issue and maybe opening one (or more) issues that summarize the current state?
Yes, I will open one or more issues and summarize the current state.
I'm quite busy right now so this will take a bit more time, but my perception is also that nobody needs the solution right away or is urgently waiting for it.
If anyone is reading this comment and needs it super urgent and wants to discuss things sooner, let me know - in that case I can see if I can do it sooner.
Closing the issue for now, looking forward to updates.
Deadline: no deadline
Link: https://github.com/playproject-io/datdot-research/issues/17#issuecomment-602563335
Call for Action: please read through and give some feedback/suggestions/questions/...