Closed RangerMauve closed 6 years ago
I've started working on this feature for dat-gateway
I think node-dat-archive could be easily modified to take hyperdrive instance as an argument to have it skip creating a dat instance altogether to work in the browser.
Then the client side could be something like
function loadArchive(url) {
  // Get key from URL
  const key = keyFromURLHere
  // hyperdrive takes the public key directly (the discovery key is derived from it)
  const drive = hyperdrive(memoryStore, key)
  const archive = new NodeDatArchive(url, {
    archive: drive
  })
  return archive
}
From there you could have the extension manage keys for whatever storage was needed.
Pair that with a public gateway out of the box, and people don't even need to install anything to get dat working in their browser!
I'm a bit unsure about making this API available over a public interface, as this would enable others on the network to potentially modify your private Dats.
Currently, on the native-messaging branch, I'm experimenting with having the browser launch the gateway server, and then having communication between the extension and gateway over stdio. So far I've been able to use this to implement the DatArchive.resolveName function, so the extension can check if hostnames have dat addresses (see https://github.com/sammacbeth/dat-fox/blob/native-messaging/bridge/index.js#L15).
This method has the added advantage that the user does not have to spawn the gateway process manually. Once the binary is properly installed in the browser, it should work seamlessly.
Just because it's a network socket doesn't mean that it's available to the public internet. You could make the gateway listen only on 127.0.0.1, which restricts traffic to the local machine.
In addition, the gateway isn't storing the private keys for your data. It's only acting as a proxy to the rest of the network; your private keys can be stored inside the extension's local storage, and the extension can connect to the gateway when it wants to update a dat (which will propagate to the rest of the network).
I really like the prospect of not having to launch the gateway manually, though.
Are you going to have one process for dat or multiplex multiple replication streams over stdin?
I was leaning towards using sockets because it will allow for the most code reuse and will allow us to use a public gateway to prevent users from even having to install something in the first place.
Plus, embedding the gateway into dat-desktop would make it easy for users to decide when to turn it on and potentially help them with choosing which archives should be pinned.
I think both patterns make sense. One could envisage multiple modes of operation, depending on the user's setup:
I think I can move the project forward with the capability to afford these different options. I myself would probably focus on a native messaging approach, as this makes most sense to me for my use-case. These implementations should be interchangeable within the extension though.
Regarding native messaging, the protocol is fairly simple (more information here) and I'm not sure it supports multiplexing. In general we shouldn't need to push too much data over this channel as it will just be for metadata. Actual data from dat can be fetched over http.
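For reference, the native messaging wire format itself can be sketched in a few lines: each message is a 32-bit length in native byte order (little-endian on common platforms) followed by that many bytes of UTF-8 JSON over stdio.

```javascript
// Sketch of the WebExtensions native messaging framing: a 4-byte length
// header (little-endian assumed here) followed by a UTF-8 JSON payload.
// This is how the gateway process would frame messages on stdin/stdout.
function encodeMessage (obj) {
  const json = Buffer.from(JSON.stringify(obj), 'utf8')
  const header = Buffer.alloc(4)
  header.writeUInt32LE(json.length, 0)
  return Buffer.concat([header, json])
}

function decodeMessage (buf) {
  const length = buf.readUInt32LE(0)
  return JSON.parse(buf.slice(4, 4 + length).toString('utf8'))
}
```

There's no multiplexing built into the framing, so request/response correlation (e.g. an id field in the JSON) would be up to us.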
With regards to your patterns:
but this can be a carrot to push them to switch to a local gateway
If you take a look at the PR I just did to dat-gateway, this will actually allow us to have read-write of a dat archive from within the browser, and then using the gateway to sync with the greater network.
The process would look like
This way we have full CRUD, with the caveats that the gateway will see more traffic from replication streams, and that the dat is only advertised while the browser is actively replicating or the gateway has it in its cache.
Local gateway via native messaging. Could be installed by running a script which sets up the binary and the manifest for Firefox.
Would this really be better than having a dedicated app like dat-desktop?
I'm still not sure what native messaging entails here. Are you going to be replicating dats to the browser, or is this going to be an RPC API for the various dat archive commands? That seems like a lot more effort, but it will probably be less resource-hungry for the browser.
Actual data from dat can be fetched over http
So you'll need an active gateway in addition to whatever is installed for the native messaging bridge? Will that be a public gateway, or will it also be set up when the native bridge is established?
I'm not sure I follow what you propose. Does this solution require us to be able to require dat modules on the extension side? I'm not sure how easy it is to browserify these libraries, and if we have to run a node process anyway, we might as well offload all of the dat code to that. In general the node process should be doing all of the dat work, and the browser just requests content and presents it to the user.
With native messaging, the process launched by the browser is also the gateway server. See my prototype app code here; it runs an instance of DatGateway as well as listening for messages from the extension. The native messaging API can actually launch any application registered with the browser, so we could even launch dat-desktop if we wanted to.
Does this solution require us to be able to require dat modules on the extension side?
Basically, yes. Hyperdrive and the modules it uses for storage are pure JS, so bundling them shouldn't be a challenge. The benefit is that the browser can reuse a lot of the existing code in the JS ecosystem. As I mentioned earlier, getting the DatArchive API would just require creating a hyperdrive and shoving it into node-dat-archive.
The benefit here, too, is that people could use this exact code without even needing an extension. So regular HTTP-served websites could have it included as a polyfill for when there's no DatArchive API being provided by the browser.
Having all the dat logic in a node process makes sense when you have a local gateway, but non-technical users (or users who don't care to set up node modules) wouldn't bother doing so, or wouldn't even try the extension if it was a requirement. If the extension offloads everything needed for the DatArchive API to a local node process, it won't work for casual users. If it uses something that only requires a gateway (like a public one), then more people can be onboarded, so more people will be using Dat, which adds pressure for browsers to integrate it in the long run.
I think that code is useless unless there are people to run it, so the easier we can make the onboarding experience, the more likely it is that Dat will gain mainstream adoption.
it runs an instance of DatGateway as well as listening for messages from the extension
That's awesome! Once I have my websocket changes merged into dat-gateway, the replication feature will come for free without needing anything else set up!
(edit: Sorry if I'm ranting a lot, I'm just really excited by this!)
@RangerMauve @sammacbeth I'm loving this brainstorm guys. I had just opened up an issue in @pfrazee's dat-gateway issue queue under the same name and then I found this issue. https://github.com/pfrazee/dat-gateway/issues/5
I'm a bit unsure about making this api available over a public interface
I feel you @sammacbeth, there needs to be some kind of security. Perhaps an API for authorizing origins to other dats?
Here's my progress on a DatArchive implementation that makes use of the websocket feature I added to dat-gateway: dat-archive-web
You can test it out by running my fork of dat-gateway and running npm run example
I've fixed up the issue with dat-archive-web. It's working fully now.
A dat can be created client-side, then replicated to the gateway using websockets. The client side has full ownership of the dat, and the gateway only exists to advertise it on the network (while it's in the gateway's cache).
dat-fox could extend the DatArchiveWeb class to make .create persist the data to something like IndexedDB, and keep track of keys in the plugin to support stuff like DatArchive.selectArchive.
@RangerMauve Wow! This sounds great. Trying to get it rolling in an example but hitting a blocker. I've got your two repos cloned, installed, and the gateway running. I bundled the dat-archive-web repo per the docs and pulled that bundle into an example repo where it's loaded by a very boring index.html.
Here's the example repo https://github.com/rjsteinert/dat-archive-web-example
Perhaps a broken bundle because of a conflicting node version? I'm running node v9.5.0.
Hey, the repo only works in Node.js at the moment.
The issue you're seeing is due to the graceful-fs library being imported by node-dat-archive.
I got the build working by adding a browser.js file which defines the global variable, and using this browserify command: browserify -r fs:graceful-fs -s DatArchiveWeb -e ./browser.js > bundle.js
I was working on the web example today as well, but I've encountered a problem with dat-dns not working in browsers.
I was going to work on it tomorrow and get rid of dat-dns support entirely until I can find a way to make it work (maybe a new gateway feature).
I've got an example with a working build (but not working DNS) in my gh-pages branch.
You might be able to use DatArchive.create(), though. I've got to stop for today, but I'll work on it more tomorrow.
I've set up a public gateway at gateway.mauve.moe:3000, so you don't even need dat-gateway installed locally. It's the lowest tier Digital Ocean droplet so don't expect amazing performance. :P
Getting it to work might be as simple as getting rid of this line which uses dat-dns and renaming the name argument to url.
Exciting stuff @RangerMauve. Glad to hear you are making an example site to play around in.
Here's what I'm seeing on my end.
Yeah, that's a problem stemming from the lack of dat-dns support.
You could try the fix I proposed with getting rid of the dat-dns stuff entirely, but I think I'll have it running tomorrow either way.
I've got it running here
You should wait a few seconds after creating an archive for it to sync up with the remote.
Woohoo!
Looks good! I cleaned up the repo a bit yesterday and am now starting to tackle the injection of the API into Dat pages and communication channel between the page, extension and gateway. Ideally, the calls to the gateway should be done from the background script context of the extension. This will prevent cross-origin rules and CSP from breaking the API.
Once the injection and messaging is working, I'll grab what you have on dat-archive-web and see if it will run in the extension.
injection of the API into Dat pages
That would be rad! But in case that's tough, as an application developer I'd be fine with including a dat-archive.js polyfill that falls back to the gateway when window.DatArchive is not available.
@rjsteinert already did a PR for adding CSP headers to dat-gateway so it wouldn't be as necessary.
You could inject a browserified bundle of dat-archive-web, and some JS that invokes DatArchive.setGateway() to set the gateway URL (if it isn't using localhost:3000).
On top of that you could inject something that will talk to the extension to support DatArchive.selectArchive, and patch DatArchive.create() to have the extension save the private keys for later and somehow save the archive data.
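That patching could look roughly like the sketch below. The DatArchive class and notifyExtension hook here are stand-ins I made up for illustration, not dat-archive-web's actual internals; in the extension, notifyExtension would be a message to the background script.

```javascript
// Stub standing in for the bundled dat-archive-web DatArchive class.
class DatArchive {
  static async create (opts) {
    // A real implementation would generate a keypair and set up storage;
    // here we just fake an archive with a url and a secret key.
    return { url: 'dat://c0ffee', secretKey: 'secret', opts }
  }
}

// Hypothetical hook: in the extension this would post a message to the
// background script, which persists keys in extension storage.
const savedKeys = []
function notifyExtension (message) {
  savedKeys.push(message)
}

// Patch create() so every new archive's private key is handed to the
// extension for safekeeping before the archive is returned to the page.
const originalCreate = DatArchive.create.bind(DatArchive)
DatArchive.create = async function (opts) {
  const archive = await originalCreate(opts)
  notifyExtension({ type: 'save-key', url: archive.url, secretKey: archive.secretKey })
  return archive
}
```

A page calling DatArchive.create() then behaves exactly as before, but the extension ends up holding a copy of the key.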
Maybe the extension could have an always active set of DatArchive instances connected to the gateway which keep those archives in the cache.
@rjsteinert would you have time to go on the dat gitter channel to talk about this stuff? I'm thinking about what sort of interface I should add to DatArchiveWeb to save credentials.
Injection seems to work fine using the method IPFS Companion uses: the content script injects a script into the page which adds the API to the window object, then opens a communication channel to the extension background script. I have this working on my branch.
In this prototype the resolveName message is received by the background and can then be invoked in that context, or, if native messaging is being used, can ask the node process to serve the request. In the latter case the extension is just a thin client which passes all Dat tasks to the gateway process. As there is only a little boilerplate required to wire these things, I think I can implement the API very quickly now using this pattern and node-dat-archive on the gateway.
I'm working on refactoring dat-archive-web so that you can essentially "plug in" how it works.
My goal was to make something with post-message-frame to talk to an iframe that will manage everything, but I think this will make it easier for you to plug in the communication to the extension, too.
I want to address the following concerns:
I think that the gateway can be used for dat-dns by sending a request to http://localhost:3000/mydatthing.com/.well-known/dat, since that is basically what's happening when we send a request to the domain itself, and the gateway has CORS headers enabled.
With regards to de-duplication and private key management, I think dat-archive-web should have a complementary "service" that it talks to which can be pluggable based on the environment (gateway, extension, bunsen browser)
The service would communicate via streams to do the following:
This should make the service fairly simple, leaving the client to do the following:
- Provide a storage for the hyperdrive to use the service's storage

This should put a lot of the heavy lifting into existing modules like dat-archive-web and random-access-storage. That way the actual services only need to care about the storage and authorization.
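To make the idea concrete, here's a hypothetical manager interface. The method names are mine, not dat-archive-web's actual API, and the in-memory maps stand in for whatever each environment would really use (IndexedDB, extension storage, the gateway's cache):

```javascript
// Hypothetical pluggable "service" interface: each environment (gateway
// iframe, extension, Bunsen browser) supplies its own implementation.
class MemoryManager {
  constructor () {
    this.keys = new Map()   // url -> secret key
    this.blocks = new Map() // file/offset -> data
  }

  // Key management, e.g. backing DatArchive.create / selectArchive
  async saveKey (url, secretKey) { this.keys.set(url, secretKey) }
  async getKey (url) { return this.keys.get(url) }
  async listArchives () { return [...this.keys.keys()] }

  // Storage hooks a random-access-storage instance could delegate to
  async write (file, offset, data) { this.blocks.set(`${file}@${offset}`, data) }
  async read (file, offset) { return this.blocks.get(`${file}@${offset}`) }
}
```

The client-side code would only ever talk to this interface, so swapping the backing service doesn't touch the hyperdrive or DatArchive logic.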
This will also allow me to work on a service that works from the dat-gateway itself within an iframe.
That way people could have all the benefits (except privacy :P) of using beaker without having to install anything, but with a path to first-class support via extensions and specialized browsers.
@RangerMauve @sammacbeth @chrisekelley The four of us have been hitting this problem pretty hard in the past week. Would you all be available for a WebRTC chat on talky.io (https://talky.io/dat-gateway) to sync up tomorrow Tuesday April 17 at 2pm ET?
I'm busy weekdays from 9:00 EST to about 18:00 EST, so I don't think I could. Weekends work best for me, to be honest. (Always available for texts, though :P )
@RangerMauve @sammacbeth Unfortunately 18:00 EST is too late for @chrisekelley (he lives in Barcelona), but Chris has given me the ok to try and connect with you two and I'll do my best to relay any plans.
@sammacbeth Is 18:00 EST a good time for you today?
I see that may be even later for @sammacbeth, as he's listed as being in Munich, Germany. I suppose @chrisekelley / @sammacbeth / myself could meet at 12:00 ET while @RangerMauve / myself meet at 18:00 ET.
Shall we try those times tomorrow, Wednesday April 18?
12:00 ET could work for me, actually. I can do it on my lunch break.
12:00 ET should work for me tomorrow.
Great! @chrisekelley is in for tomorrow at 12:00 ET as well. See y'all at https://talky.io/dat-gateway
I refactored dat-archive-web to be more extendable. It just needs a way to plug in storage and a replication feed, using what I'm calling a manager.
I've updated my demo with a manager that uses persistent storage in IndexedDB and my public gateway. Live demo
This should make it easy to implement services that have their own opinions on how to store the data and replicate it, as well as stuff like selectArchive and DatDNS.
I managed to convert the timezone incorrectly, so I will already have to leave at 12:10 ET... If anyone can make an earlier start, that might be better.
I can start earlier if you want.
Great chat! Still processing everything I learned. One takeaway was that it sounded like @sammacbeth and @RangerMauve might be taking different directions in the DatArchive approach? Even if they're different, perhaps they could share the same dat-gateway codebase? To get us going on the Bunsen front, we baked in RangerMauve's fork of dat-gateway so we could give some of the demos he's been working on a try in Bunsen.
Sorry to jump in like that :). I wasn't immediately aware of the hype around dat-fox :D.
I really like the native messaging approach, as it'll be totally transparent for the end user: just install an extension and you're good to go. No gateway to configure (public) or to install (local). The native script could even be packaged so that it runs without Node.js pre-installed (easing maintenance, avoiding version mismatches, etc.).
AFAIR, the only thing preventing hyperdrive from working in the browser is that its random-access-storage needs access to the file system.
In general we shouldn't need to push too much data over this channel as it will just be for metadata. Actual data from dat can be fetched over http.
You mean fetched over http through a gateway? This means the native script would in fact be the gateway?
Wouldn't it be easier if you had a random-access-storage in the browser that would use the native script to read/write? (I'm currently experimenting with this approach.)
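A sketch of that idea, following the read/write calling convention the random-access-* modules use. sendToNativeScript is a hypothetical transport, stubbed in memory here so the sketch runs; in the extension it would be the native messaging port.

```javascript
// The "native" side's storage, standing in for real files on disk.
const fileContents = new Map()

// Hypothetical transport: in the extension this would be
// port.postMessage plus response correlation. The stub only supports
// whole-file writes at offset 0, for brevity.
function sendToNativeScript (msg, cb) {
  if (msg.type === 'write') {
    fileContents.set(msg.file, Buffer.from(msg.data))
    process.nextTick(cb, null)
  } else if (msg.type === 'read') {
    const buf = fileContents.get(msg.file) || Buffer.alloc(0)
    process.nextTick(cb, null, buf.slice(msg.offset, msg.offset + msg.size))
  }
}

// Follows the random-access-* calling convention hyperdrive expects:
// read(offset, size, cb) and write(offset, data, cb), per named file.
function remoteFile (name) {
  return {
    write (offset, data, cb) {
      sendToNativeScript({ type: 'write', file: name, offset, data }, cb)
    },
    read (offset, size, cb) {
      sendToNativeScript({ type: 'read', file: name, offset, size }, cb)
    }
  }
}
```

Passing a factory like name => remoteFile(name) as hyperdrive's storage would then route all of its file I/O through the native script.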
might be taking different directions in the DatArchive approach? Even if different perhaps it could be the same dat-gateway codebase?
This is what I read here as well, but that's a good thing. IMHO, we're all experimenting a lot with dat and its usage, and our goal is to help people with no computer knowledge actually get access to the network.
About the gateway, there are a few approaches there as well. In my early work with a websocket daemon I was also able to run dat in the browser, using only RPC calls to get access to the file system. To me, the native messaging system enables even more ways to reach the same goal!
To clarify on the different directions, my probably oversimplified summary is that @RangerMauve is thinking of taking a "thick client" approach, where the browser does most of the work and communicates with either a local or public gateway, whereas @sammacbeth is taking more of a "thin client" approach, where the gateway does most of the operations and it's important that it's locally hosted and controlled.
Does that sound about right? Apologies if I missed the mark.
@soyuka The approach with having random-access-storage and hyperdrive in the browser is what I'm taking. I modified dat-gateway to have a websocket server which would create hyperdrive replication streams. This lets me sync a hyperdrive with the network by piping its replication into the WS without having to interact with the discovery swarm (the gateway does that instead).
The dat-fox extension is still going to need a gateway installed locally, it's just that the gateway will be doing more work and can have tighter integration with the OS than a browser could.
@rjsteinert I think your assessment is correct. :D
@soyuka The approach with having random-access-storage and hyperdrive in the browser is what I'm taking.
I read your code and figured that out yes :).
I'm thinking it would be good to start a repo that's just for discussing DatArchive/Beaker stuff in the context of other browsers to have it all in one place (and so we don't spam dat-fox too much :P ). Does that sound appealing?
I'm thinking it would be good to start a repo that's just for discussing DatArchive/Beaker stuff in the context of other browsers to have it all in one place (and so we don't spam dat-fox too much :P ). Does that sound appealing?
We should close this in favor of https://github.com/datproject/discussions/issues/84 this repository has that purpose :).
Yeah, I guess dat-fox provides DatArchive as of the recent PR that @sammacbeth did.
Hi, I really like where this extension was going and was wondering if you'd be interested in some discussion about how to go about enabling the DatArchive API.
I was thinking that it could be done by having a gateway which produces replication streams for dats.
What it could look like is this:
- A discovery-swarm which only identifies one peer: the gateway.

This will simplify accessing new Dats, but does not provide a mechanism for creating and seeding new ones.