qri-io / rfcs

Request For Comments (RFCs) documenting changes to Qri

rfc, remotes: Remotes are like a do-it-yourself Registry #36

Closed dustmop closed 5 years ago

b5 commented 5 years ago

@camhunt, after our conversation we Qri folk spent some time thinking through the "can we run our own registry" question, and we came up with this remotes RFC. This starts with adding the capacity to "push" datasets to other places. We'd like to use this as a foundation for building some of the other features surrounding trusted peer coordination we discussed. Any feedback would be appreciated!

b5 commented 5 years ago

Data Together friends (@frijol, @meiqimichelle, @titaniumbones, @lightandluck): if we get this into Qri, it would let us run the Compute Canada Data Together node as a remote. That would mean qri publish --remote data-together pushes to that infrastructure without going through a server Qri runs, helping prove our "no single point of failure" story.

I'm cc'ing y'all in case you're interested in tracking progress, or would like to steer the design of this feature. I'm delighted to hear people want the capacity to aggregate their own data 😄.

dustmop commented 5 years ago

Thanks for the feedback @b5 and @Frijol! I updated the text on Friday to incorporate your comments, which hopefully clarifies some of what's going on.

> At that level, my question is: are these registry nodes (here called remotes) functionally different from the Qri Registry?

The idea is that a "remote" is something that anyone can run to keep their data alive and provide features like search and user identity within a limited scope, while the "registry" is only run by Qri, and acts as the default hub that also provides such features globally. Hmmm, perhaps I can find somewhere to explicitly call out this difference in the rfc itself...
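A rough way to picture the remote/registry split described above is a lookup that falls back to the registry when no remote is named. This is a minimal Python sketch, not Qri's implementation: the function name `resolve_publish_target`, the `remotes` mapping, and both addresses are hypothetical placeholders.

```python
# Hypothetical placeholder address; not Qri's real registry location.
REGISTRY = "https://registry.example.com"


def resolve_publish_target(remotes, remote_name=None):
    """Pick where a publish goes.

    remotes: mapping of user-chosen remote names to addresses (assumed shape).
    remote_name: the value of a --remote style flag, or None if omitted.
    """
    if remote_name is None:
        # No remote named: fall back to the default, globally scoped registry.
        return REGISTRY
    try:
        # A named remote is something anyone can run, scoped to its own users.
        return remotes[remote_name]
    except KeyError:
        raise ValueError(f"unknown remote: {remote_name!r}") from None


# Example: a self-hosted remote registered under the name "data-together".
my_remotes = {"data-together": "https://remote.example.org"}
target = resolve_publish_target(my_remotes, "data-together")
```

The point of the sketch is only that a named remote bypasses the registry entirely, while omitting the name keeps today's default behavior.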

titaniumbones commented 5 years ago

Hey, this is super-interesting and I am following along. Too braindead to offer much in the way of substantive response but just flagging my interest here.

camhunt commented 5 years ago

@b5 this initial approach for remotes helps mitigate a near-term challenge we are facing, and we'd start using it immediately.

Has there been any thought/discussion about leveraging gossipsub so that any node that subscribes to a channel will automatically replicate any data set advertised?

This doesn't solve the bad actor problem (which we could solve - crudely - with a whitelist of nodes from which any given node would auto-replicate new data sets), but it would be a more decentralized/distributed solution. When combined with IPFS public proxies, it would even mean that nodes that can't directly connect to each other (e.g., due to NAT issues) could still replicate data sets around (per gossipsub topic, assuming the public proxy is also gossiping).
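To make the whitelist idea concrete, here is a minimal Python sketch that simulates it in-process. It does not use libp2p or gossipsub: the `Topic` and `Node` classes, their methods, and the peer names are all hypothetical stand-ins. Every subscriber sees every advertisement, but a node only replicates data sets advertised by peers on its own whitelist.

```python
from dataclasses import dataclass, field


@dataclass
class Topic:
    """In-memory stand-in for a pubsub topic (hypothetical, not gossipsub)."""
    name: str
    subscribers: list = field(default_factory=list)

    def subscribe(self, node):
        self.subscribers.append(node)

    def publish(self, sender_id, dataset_ref):
        # Deliver the advertisement to every subscriber except the sender.
        for node in self.subscribers:
            if node.peer_id != sender_id:
                node.on_advertise(sender_id, dataset_ref)


@dataclass
class Node:
    """A peer that stores its own data sets and replicas of trusted peers'."""
    peer_id: str
    whitelist: set = field(default_factory=set)
    datasets: set = field(default_factory=set)

    def advertise(self, topic, dataset_ref):
        # Keep the data set locally, then announce it on the topic.
        self.datasets.add(dataset_ref)
        topic.publish(self.peer_id, dataset_ref)

    def on_advertise(self, sender_id, dataset_ref):
        # Crude bad-actor filter: replicate only from whitelisted peers.
        if sender_id in self.whitelist:
            self.datasets.add(dataset_ref)


def demo():
    topic = Topic("datasets")
    alice = Node("alice")
    bob = Node("bob", whitelist={"alice"})
    mallory = Node("mallory")
    for node in (alice, bob, mallory):
        topic.subscribe(node)
    alice.advertise(topic, "alice/world_bank_population")
    mallory.advertise(topic, "mallory/spam")
    return alice, bob, mallory
```

In this sketch `bob` ends up holding the data set advertised by `alice` but not the one advertised by `mallory`, which is the behavior the whitelist is meant to give. A real deployment would also need to fetch the data itself after seeing the advertisement; that step is left out here.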

b5 commented 5 years ago

@camhunt

> this initial approach for remotes helps mitigate a near-term challenge we are facing, and we'd start using it immediately.

Great to hear. We're very much thinking of this as a first step. I'll consider you a yay vote on ratifying this RFC.

Regarding subscribing to datasets: I've started a discussion over on #37 that we can start turning into an RFC in the coming days. Feedback welcome!

b5 commented 5 years ago

Ok, just pushed a few spelling tweaks & a moved paragraph. I'm a 👍 on merging this. You in, @ramfox, @dustmop, @Frijol?

b5 commented 5 years ago

Nice. Ok, that's a majority approval from the core team. LGTM!