ilyaigpetrov opened this issue 6 years ago
One thing I'd also like to note is that I'm working on having dat-gateway automatically inject a DatArchive polyfill so that sites that make use of it can work without any extensions.
Hey! Been really great to see the work @RangerMauve and others are doing on gateway and related work. You are right on the benefits of having the gateway. While an official gateway can be helpful it could also lead to copyright & legal issues beyond the resources of our nonprofit. We'd like to keep the scope of the Dat Project focused on the core technology to ensure sustainability in the long-term, and using resources otherwise may detract from that.
It's really great to see the community efforts around this and we'll continue to support them however we can.
Concerning datbase.org...
Dat Base was intended as a registry, not necessarily a gateway, and I'm not sure we could cover all uses without doing subdomains.
Hi, I'd like to hijack this issue to talk about gateways and such.
There was a meeting with some people from the Dat community to talk about gateways and getting Dat working in the browser on Wednesday Feb 27. Here are the meeting notes (courtesy of @substack)
We also spoke about this on Thursday the 28th at the dat comm-comm call.
The gist of it is that it'd be nice to find a standard way of doing dat stuff in the browser.
The main pieces that I see (feel free to add more) are:
I looked at some of this stuff a while ago when I was working on dat-polyfill, and again recently when working on dat-js.
I propose meeting on Wednesday, February the 6th at 20:00 GMT to discuss this stuff and maybe start putting parts together. We could use an audio-only setup at https://talky.io/dat-in-browsers to talk about it.
Personally, I'd like to work on combining the signalhub / discovery-swarm-stream code so that we could support replicating multiple hyperdrives through both WebRTC and proxying to the discovery-swarm all with a single websocket connection. (Also integrating hyperswarm once that stabilizes)
Does that time work for you all? Is there a better date or time? Any other items that I could add to the list of stuff to talk about?
CC @garbados @substack @karissa @frando @dpaez @tinchoz49 @gozala
Also, I'm making a calendar invite. Email me at ranger@mauve.moe if you'd like to be added to the calendar.
- A standard and efficient way of working with webrtc-swarm (Something similar to signalhubws??)
I think a good way of accomplishing this is to publish a set of capabilities to the peer table when joining a swarm. So for example if I am a browser I can communicate as a websocket-client and webrtc-peer. And if I'm an electron app I can communicate as a websocket-client, websocket-server (if I have a public IP or can hole-punch), tcp client, tcp server (if I have a public IP or can hole-punch), udp etc. Peers could also publish their connection preferences to the table. Then to make a peer connection, clients can consult this table along with their own heuristics to make the best connection possible according to some mutually acceptable preferences.
These kinds of hybrid swarms would be very useful for merging what would otherwise be fairly separate networks based on transport protocols.
Apologies if this is already the plan; if so, I guess this comment will help to disambiguate.
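The capability-table idea above could be sketched roughly like this. Everything here, the record shape, the `pickTransport` helper, and the preference ordering, is invented for illustration; it is not an existing Dat API:

```javascript
// Hypothetical peer records: each peer advertises the transports it can
// dial (client) and accept (server), plus an optional preference order.
const browserPeer = {
  id: 'browser-1',
  client: ['websocket', 'webrtc'], // transports it can dial
  server: ['webrtc']               // WebRTC is symmetric once signalling works
}

const electronPeer = {
  id: 'electron-1',
  client: ['websocket', 'tcp', 'udp', 'webrtc'],
  server: ['websocket', 'tcp', 'udp', 'webrtc'], // public IP or hole-punched
  preferences: ['tcp', 'udp', 'webrtc', 'websocket']
}

// Pick a transport the dialer can initiate and the listener can accept,
// honouring the listener's stated preference order first.
function pickTransport (dialer, listener) {
  const order = listener.preferences || listener.server
  for (const transport of order) {
    if (listener.server.includes(transport) &&
        dialer.client.includes(transport)) {
      return transport
    }
  }
  return null // no mutually supported transport
}

console.log(pickTransport(browserPeer, electronPeer)) // 'webrtc'
```

The listener's preferences rank TCP first, but since the browser can only dial websocket or WebRTC, the first mutually workable transport wins.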
Yeah, that's a great idea! One thing I was thinking is that it'd be cool if these gateway servers published their existence to the discovery swarm under a known key. Then you could connect to one, discover more through it, and potentially save them for later.
@pvh o/
Re: the capabilities. We should discuss (on the call) how to do this stuff without reimplementing libp2p. 😅
> Re: the capabilities. We should discuss (on the call) how to do this stuff without reimplementing libp2p. 😅
Is there reason to not collaborate on libp2p itself?
Reading this thread I was feeling: oh, that's exactly the goal of libp2p, which also happens to have a Rust implementation, so you could in theory wasm it.
I think for the same reason it's healthy to have both KDE and Gnome, or Mozilla and Firefox, or Linux and FreeBSD, it's not wise to create a monoculture.
The dat community and the IPFS community have different ethos, technical goals, funding models, development methodology and values. I think both should inspire and be inspired by each other and drive one another to improve but it doesn't make much sense to me for the dat community to adopt the IPFS codebase.
As for web gateways, if you've been following the work Ink & Switch has been doing, we've been discussing something morally along the lines of the DatArchive injection (though I think we envision a quite different actual implementation) to extend our system to non-Electron computers like iPhones and browsers.
Roughly, because a first-order goal for us to is to support totally offline usage we've discussed bridging hypermerge repositories over a websocket gateway but also wrapping all of that magic in a PWA that kept the data in localStorage (or something) for improved durability.
@pvh Would you be interested in attending the call?
Also ping @sammacbeth. He's using gateway stuff in https://github.com/cliqz-oss/dat-webext
> I think for the same reason it's healthy to have both KDE and Gnome, or Mozilla and Firefox, or Linux and FreeBSD, it's not wise to create a monoculture.
> The dat community and the IPFS community have different ethos, technical goals, funding models, development methodology and values. I think both should inspire and be inspired by each other and drive one another to improve but it doesn't make much sense to me for the dat community to adopt the IPFS codebase.
I think there are a few caveats here that are worth considering:
Please note that this does not imply that:
I apologize for derailing this conversation; it just saddens me that instead of making greater progress towards decentralization, communities across the board choose to keep reinventing the same wheel with slight technical differences. It could be that coordination across groups would have a higher overhead than the value to be gained from it, but that's rarely the argument.
@Gozala you're right, we should discuss this in one of the many other channels we share :) @RangerMauve i'd be interested in joining the conversation, though mostly to listen since we haven't done too much here yet.
@pvh Cool, feel free to join in, and send me an email if you'd like to be added to the calendar event.
@RangerMauve i'd be interested in joining too. i'll mostly listen in and maybe fold in ideas that come to mind as the call progresses. I sent you an email :^)
@gozala I've discussed this elsewhere but I think this:
> The dat community and the IPFS community have different [...] development methodology and values
is a huge reason why there isn't more interop. I look at something like this code example and I see a wall of configuration, written in an unfamiliar style, that appears to have no practical purpose. It sets up a huge amount of boilerplate and then... you have a `Node` object? It doesn't explain what anything is for. I mostly see walls of text, tables, badges, org charts, and none of it means anything to me.
Compare this to something like webrtc-swarm. You set up the module with two pieces of information, and then you can listen for `'peer'` events which give you a bidirectional stream. The module doesn't overload you with a manifesto first; it gets out of your way. I can easily see whether a module like webrtc-swarm will solve my problem or not, and it doesn't try to solve all the world's problems.
The other development methodology, for API design, technical communication, and setting scope, leaves me unmotivated to even figure out whether a given module will suit what I'm trying to do. I also have no idea what libp2p is doing without reading a book's worth of content, but I can approximately guess how a module like webrtc-swarm works by glancing at its interface. Building a mental model of the layers below what you're working on is very important for designing around the correct set of trade-offs, performance considerations, and failure cases. With tools that are too configurable, I also worry about the tendency for those abstractions to leak upward in ways that push against encapsulation.
Yeah, I like what libp2p are trying to do, but I don't think this would be the best place to try to integrate it with Dat. I think it'd be better to talk about that somewhere relating to the work in hyperswarm since that's where all the new networking stuff in Dat is going on.
My goal of bringing it up was to figure out a scope that we should focus on and avoid over-engineering.
If down the line there's more adoption of libp2p in the Dat ecosystem, then that will definitely affect the browser, but I'd rather start somewhere small so we can help people experiment with web applications that use Dat.
Ping! The call should be starting in a minute or so. :D
Thank you all for coming out to the call! I found it really helpful to learn about your different experiences with this stuff and the use cases that you're aiming for.
Here are the notes I took during the meeting, feel free to add comments on the post for anything I missed:
Here are some action items from the meeting:
I'm going to get started on the discovery-swarm-stream stuff mid next week, with the goal of getting it integrated with dat-js and having someone test it outside of hyperdrive replication.
Re: random-access-storage. Would WebSQL perform better than IDB? @pfrazee, you're using SQLite for storing Dat data; any opinions on using it as a backend for hyperdrive?
We experimented with different random-access-* modules in the browser:
It was the first idea, but we had problems reading and writing a lot of blocks from hypercore. It works great at the beginning, but then starts to slow down when you have >= 50 blocks.
```js
import raf from 'random-access-chrome-file'
import randomAccessKeyValue from 'random-access-key-value'
import leveljs from 'level-js'
import levelup from 'levelup'

const db = levelup(leveljs('dbname'))
const storage = file => randomAccessKeyValue(db, file)
```
It works really well and we haven't found performance issues, but we don't want to only support Chrome.
We are using WebRTC through discovery-swarm-webrtc and running into various issues, like unexpected disconnections.
We are trying to stabilize the connection :disappointed: and one of the issues we found is related to signalhubws.
If the ws client loses the connection it doesn't try to reconnect, so we made a fork of signalhubws that uses sockette to fix these kinds of issues: https://github.com/geut/signalhubws
We will probably open a PR to the original project and discuss the changes there.
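The reconnect behaviour being described could be sketched like this. The `connectWithRetry` helper and its options are invented for illustration; sockette wraps a real WebSocket in a similar backoff-and-retry loop:

```javascript
// Invented helper sketching the retry loop around a generic async
// socket factory, so the logic itself is plain JS.
function connectWithRetry (createSocket, { retries = 5, baseDelay = 100 } = {}) {
  return new Promise((resolve, reject) => {
    let attempt = 0
    const tryOnce = () => {
      createSocket()
        .then(resolve)
        .catch(err => {
          if (++attempt > retries) return reject(err)
          // Exponential backoff: baseDelay, 2x, 4x, ...
          setTimeout(tryOnce, baseDelay * 2 ** (attempt - 1))
        })
    }
    tryOnce()
  })
}

// Usage with a flaky fake socket that fails twice, then connects:
let failures = 2
const flaky = () =>
  failures-- > 0 ? Promise.reject(new Error('down')) : Promise.resolve('connected')

connectWithRetry(flaky, { baseDelay: 1 })
  .then(result => console.log(result)) // 'connected'
```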
@tinchoz49 Appreciate you sharing that research. I'm fairly sure that Chrome is pushing for their files APIs to become standard. It might be a good bet in the long run.
@RangerMauve It's worth taking a look at, to be sure.
@tinchoz49 Have you tried out discovery-swarm-stream yet?
The chrome files api is really nice and fast, I also recommend using it despite its dependency on Chrome. I hope it becomes standard! We are using it with a map tile downloader for mapeo.
My (somewhat limited) take on web sql is that it might make sense for metadata lookup, but could be heavy for file storage with lots of blocks.
Ah yeah, I believe we funded random-access-chrome-file. Glad to hear it works in Chrome and doesn't require a Chrome App specific API. :)
> @tinchoz49 Have you tried out discovery-swarm-stream yet?
It's our first priority for tomorrow.
> I'm fairly sure that Chrome is pushing for their files APIs to become standard. It might be a good bet in the long run.

> The chrome files api is really nice and fast, I also recommend using it despite its dependency on Chrome. I hope it becomes standard! We are using it with a map tile downloader for mapeo.
That is really interesting, thanks for sharing your experience. Right now we are building a demo for the next EDCON and we need browser storage persistence working by that day. So I'm going to talk about using random-access-chrome-file with the team tomorrow.
Some thoughts from my side:
This leads to one of the core issues - the official Dat networking stack is not web-compatible. Until this client can directly communicate with web peers centralisation will be required to bridge web swarms with node ones. Does the current roadmap for Dat-node consider this issue? AFAIK hyperswarm is similarly tied to requiring direct TCP and UDP socket access.
This is the Web platform's issue, not Dat's. We can't solve it without new Web APIs.
> This leads to one of the core issues - the official Dat networking stack is not web-compatible. Until this client can directly communicate with web peers centralisation will be required to bridge web swarms with node ones. Does the current roadmap for Dat-node consider this issue? AFAIK hyperswarm is similarly tied to requiring direct TCP and UDP socket access.

> This is the Web platform's issue, not Dat's. We can't solve it without new Web APIs.
I've spoken with several folks at the Chrome team about this. Chrome Apps have access to raw UDP/TCP sockets (with some bugs/caveats) and they expressed intent to provide similar APIs for the platform in the future. (I confess I am somewhat dubious about this but it's better than nothing.)
I should note that the Chrome networking APIs were frustratingly not-quite-compatible with Node ecosystem libraries, which caused significant friction. In particular, we ran into problems where socket configuration had to happen at a different time than in Node (I believe Chrome required up-front configuration while Node didn't support configuration until after a connection was established?), plus several frustrating but more minor bugs, such as broken multicast support preventing us from using mDNS successfully.
Also, thanks to @Gozala who has been working to solve these problems on the Mozilla side. It doesn't sound like his effort will result in a new standard at this point but he's certainly helped raise awareness and driven some progress on that front.
I have been using random-access-idb-mutable-file in dat-webext now for a while with no issues. I have not tested performance, but it feels subjectively better than random-access-idb.
Maybe we could build something like a "random-access-web-file" for dat-js that uses the Chrome file system API in Chrome and Chromium-based browsers like Brave, and IDBMutableFile in Firefox.
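Such a module would mostly be feature detection. Here is a hedged sketch: the `chooseStorageBackend` function and its `env` parameter are invented so the logic can run outside a browser, and a real module's detection may well differ:

```javascript
// Pick a random-access-* backend based on which browser APIs exist.
// `env` stands in for `window` so the logic is testable anywhere.
function chooseStorageBackend (env) {
  if (env.requestFileSystem || env.webkitRequestFileSystem) {
    return 'random-access-chrome-file'      // Chrome / Chromium (e.g. Brave)
  }
  if (env.IDBMutableFile) {
    return 'random-access-idb-mutable-file' // Firefox
  }
  if (env.indexedDB) {
    return 'random-access-idb'              // generic IndexedDB fallback
  }
  return 'random-access-memory'             // no persistence available
}

console.log(chooseStorageBackend({ webkitRequestFileSystem () {} }))
// 'random-access-chrome-file'
console.log(chooseStorageBackend({ IDBMutableFile: function () {} }))
// 'random-access-idb-mutable-file'
```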
> I have been using random-access-idb-mutable-file in dat-webext now for a while with no issues. I have not tested performance, but it feels subjectively better than random-access-idb.
> Maybe we could build something like a "random-access-web-file" for dat-js that uses the Chrome file system API in Chrome and Chromium-based browsers like Brave, and IDBMutableFile in Firefox.
That is what I was suggesting on the call yesterday. If that works, it sounds like an easy win.
In the long term, however, I think the Dat community needs to either:
I'm biased towards option 2, mostly because I know how difficult it is to make progress on the browser end. It's not that they don't care; the combination of billions of users and a complex, old codebase makes you evaluate things on different merits.
> Also, thanks to @Gozala who has been working to solve these problems on the Mozilla side. It doesn't sound like his effort will result in a new standard at this point but he's certainly helped raise awareness and driven some progress on that front.
Thank you for the kind words, @pvh. I do however want to point out that I don't think adding TCP / UDP / mDNS to the web stack is the desired outcome; if browsers do it, that is a signal that they have given up on the web platform.
I really don't want ads to start discovering local network services through mDNS, or trying to print ads.
I think the desired outcome looks more like Beaker or Farm, where the browser allows applications to read / write data into some namespace and takes care of the underlying networking on the user's behalf.
> Also, thanks to @Gozala who has been working to solve these problems on the Mozilla side. It doesn't sound like his effort will result in a new standard at this point but he's certainly helped raise awareness and driven some progress on that front.

> Thank you for the kind words, @pvh. I do however want to point out that I don't think adding TCP / UDP / mDNS to the web stack is the desired outcome; if browsers do it, that is a signal that they have given up on the web platform.
Can you expand on why you feel this way? I think of this as being similar to the emergence of wasm and fast canvas implementations. Essentially, it enables the unbundling of browser functionality and empowers communities to innovate without relying on browser vendors to do the work.
Essentially I am advocating for browsers to follow the advice of the Atlantis paper and let folks innovate not just on programming & rendering models but also communication & network protocols.
> I really don't want ads to start discovering local network services through mDNS, or trying to print ads.
Same! I would expect this kind of behaviour would look a lot like notifications or any other kind of explicit opt-in browser feature.
> I think the desired outcome looks more like Beaker or Farm, where the browser allows applications to read / write data into some namespace and takes care of the underlying networking on the user's behalf.
I think we're still years from knowing what the protocols should be, let alone standardizing them across vendors. If we have to wait for the answers to those questions before we can start testing these technologies at non-enthusiast scales I think that's a huge obstacle.
Still, I don't plan to wait around. I'll keep working on Electron apps & gateways and hoping browsers catch up someday.
> Can you expand on why you feel this way? I think of this as being similar to the emergence of wasm and fast canvas implementations. Essentially, it enables the unbundling of browser functionality and empowers communities to innovate without relying on browser vendors to do the work.
Both wasm and canvas are fully contained in the sandbox; in fact, wasm is even more so than JS. Today the major struggle on the browser end is to somehow contain the data leakage fueling the data economy of the web. Adding more low-level IO primitives will only make that problem far worse and more difficult to address.
> Essentially I am advocating for browsers to follow the advice of the Atlantis paper and let folks innovate not just on programming & rendering models but also communication & network protocols.
I have not read the paper; I will, and maybe it will convince me otherwise.
> Same! I would expect this kind of behaviour would look a lot like notifications or any other kind of explicit opt-in browser feature.
Yeah, but numerous studies have shown that prompts do not work; the majority of users will click through whatever prompts you provide just to get through. In fact, notification prompts have been widely abused.
In practice, user prompts only work if they are a rare occurrence, but I'm certain this will be a Pandora's box, just like notification prompts. And if some shitty site the user needs to visit doesn't work and insists on a UDP socket, many users would accept the risk and reinforce the behavior.
Another argument: if every single site rolls out its own p2p protocol and its own data storage layer, etc., we end up with silos of a different kind. Diversity is good, but I'd argue interop is more important. I really don't want the same mess we have with endless messaging apps, where each of your contacts is on a different one.
> I think we're still years from knowing what the protocols should be, let alone standardizing them across vendors. If we have to wait for the answers to those questions before we can start testing these technologies at non-enthusiast scales I think that's a huge obstacle.
I agree; however, I think there should be a middle ground between testing by opening UDP sockets on arbitrary web sites and waiting for browsers to standardize. I thought extensions might provide a space to do so. More recently I've been thinking that some companion app (a Flash player of the p2p web) could be a more effective way to do it.
> Still, I don't plan to wait around. I'll keep working on Electron apps & gateways and hoping browsers catch up someday.
I really hope so! Browsers aren't where innovation happens; rather, they standardize innovation that has already happened.
Hi all, I've just released discovery-swarm-web which acts as a proxy to discovery-swarm. With this it should be possible to find peers for any hypercore based application you want.
With regards to storage, I've made random-access-web, which will automatically try to use the Chrome file API or the IDBMutableFile API if they're present, and fall back to idb and random-access-memory.
We're also finishing up a new release of dat-js that uses these two modules. Combined, I think it makes dat-js feel a lot faster.
@allain has been looking into getting Dat running in react-native using nodejs-mobile and regular React Native, and part of that exploration made it more obvious why an HTTP API for Dat would be useful. Essentially, there was a local gateway (running dat-gateway for now) which was going to be used to render content to a webview. However, we also wanted to support the DatArchive API for Beaker apps. It would have been nice to have a standard for doing that. At some point I'd like to implement something and combine it with the local discovery swarm stream server that comes with discovery-swarm-web. Maybe also paired with a standard pinning service.
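The core of such a local HTTP gateway is URL rewriting. A sketch, assuming dat-gateway's general `/{key}/{path}` layout (the `toGatewayUrl` helper, the port, and the `deadbeef` key are illustrative):

```javascript
// Map a dat:// URL onto a path served by a local HTTP gateway.
function toGatewayUrl (datUrl, gateway = 'http://localhost:3000') {
  const { hostname, pathname } = new URL(datUrl)
  // hostname is the archive key (or a DNS name that resolves to one)
  return `${gateway}/${hostname}${pathname}`
}

console.log(toGatewayUrl('dat://deadbeef/index.html'))
// 'http://localhost:3000/deadbeef/index.html'
```

A webview could then load the gateway URL in place of the original dat:// URL.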
@tinchoz49 from @geut is working on a new version of discovery-swarm-webrtc that should have higher reliability and hopefully overall better performance. We're also thinking about how to reduce the overall number of connections to avoid some of the performance issues of WebRTC connections.
Hi! I'll try this tomorrow. Thanks for sharing!
IPFS has an HTTPS gateway, used as https://ipfs.io/ipfs/<hash>. The existence of such a gateway gives several benefits.
@RangerMauve has implemented an http(s) gateway based on pfrazee/dat-gateway that works in a way similar to IPFS (repo, demo). It even redirects each page to a subdomain to isolate cookies and provide better security.
Concerning datbase.org — I couldn't get it to serve me an HTML page the right way (without the header, and maybe with absolute CSS paths, which is not critical).
I kindly ask the Dat team to take a look at the work of @RangerMauve and similar efforts, and to provide users with an official Dat HTTPS gateway, which we could use to build our future censorship-resistant websites.