ipfs / in-web-browsers

Tracking the endeavor towards getting web browsers to natively support IPFS and content-addressing
https://docs.ipfs.tech/how-to/address-ipfs-on-web/
MIT License

The Future of "accessing API of remote IPFS node" #137

Open lidel opened 5 years ago

lidel commented 5 years ago

Started as a discussion between @lidel & @olizilla (2018-12-19)

Granting access to local or remote node remains a challenge both on UX and security fronts. This is an attempt to plot possible paths for going forward.

Disclaimer: below is not a roadmap, but a "what if" exercise to act as a starting point for the discussion and experimentation that follows in comments

Initial idea is to think about the problem in three stages:

Stage 1: window.ipfs.enable(opts)

Stage 2A: Opaque Access Point with Service Worker

[Ongoing research]

  • ETA: 2019+
  • [ ] Thin static HTML+JS is loaded to establish Access Point Service Worker (APSW), which acts as a proxy to IPFS API provider and exposes limited API/Gateway endpoints
  • [ ] Progressive peer-to-peer Web Applications (PPWA) talk to IPFS over APSW
  • [ ] APSW automatically picks the best IPFS provider (js-ipfs, remote/local HTTP API, ipfs-companion)
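The provider selection in the last bullet could be sketched as a simple probe-and-fallback loop. This is a hypothetical sketch, not APSW code; the candidate names and the `probe` shape are assumptions:

```javascript
// Hypothetical sketch of how an APSW might pick the best available IPFS
// provider. Each candidate carries a `probe` that checks availability
// (e.g. window.ipfs presence, or a fetch against the local HTTP API);
// probes are injected here so the selection logic stays self-contained.
async function selectProvider (candidates) {
  for (const candidate of candidates) {
    try {
      if (await candidate.probe()) return candidate.name
    } catch (_) {
      // Probe threw: this provider is unavailable, try the next one.
    }
  }
  // Last resort: an embedded in-browser js-ipfs node always "works".
  return 'js-ipfs'
}
```

The probe order would presumably prefer ipfs-companion, then a local HTTP API, before falling back to an embedded js-ipfs node.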

Stage 2B: HTTP/WS /api/v1/ with access controls

A bit speculative - work on /api/v1 has not started yet; we are collecting requirements

  • ETA: 2019? 2020?
  • [ ] Websites and apps access API of IPFS Node directly
  • [ ] Access controls are done by the IPFS Node itself; CORS is allowed by default (*)
  • [ ] /api/v1/ can start as an experimental overlay provided by ipfs-desktop
  • OAUTH-like flow introduced in Stage 1 remains the same
  • Real-time capabilities are supported over Websockets
  • [ ] window.ipfs in ipfs-companion implemented as a preconfigured js-ipfs-http-client rather than a proxy
  • The overhead of postMessage is removed
  • Access controls removed from ipfs-companion and now done by ipfs daemon itself

Stage 3: Nodes talking to each other over libp2p

This is a highly speculative idea with a lot of details to figure out, but the general idea is to replace legacy transports (HTTP, WebSockets) with libp2p

  • ETA: 2020+
  • Prerequisites:
  • [ ] pubsub is enabled by default and works in browser contexts
  • [ ] ipfs-companion == IPFS node (eg. runs an embedded js-ipfs node by default)
  • [ ] window.ipfs.enable() (and future API-provider libraries) give access to API from Stage 2 over p2p connection (eg. via ipfs p2p)
  • [ ] "follow" semantics exist and allow setting up various sync policies between nodes

Parking this here for now, would appreciate thoughts in comments below.

mitra42 commented 5 years ago

Stage 2 is when it gets interesting. Stage 1 requires installing IPFS Companion, and then any browser-based application has to detect the presence of both IPFS Companion and the local IPFS node, which complicates things to the point of being unlikely to happen.

If Stage 2 - or some version of it - were implemented, then for example the dweb.archive.org UI could detect the presence of a local node and use it as a persistent cache, rather than using js-ipfs with all the limitations that come from running in the browser (including lack of persistence after the browser window is closed, and the extreme CPU load that encourages people to close pages running IPFS).

Obviously relying on CORS in a content-addressed filesystem makes no sense to me, since both trusted and untrusted content can come from anywhere (e.g. from https://ipfs.io). One option I think would be worth considering, along with authentication, is allowing a subset of the API to run without authentication - e.g. get, add, urlstore, pin - while reserving more sensitive operations (like editing the config) until authentication is implemented.
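That split could be expressed as a trivial allowlist check. A sketch only: the endpoint names come from the comment above, the policy function itself is assumed:

```javascript
// Endpoints proposed above as safe to expose without authentication.
const OPEN_ENDPOINTS = new Set(['get', 'add', 'urlstore', 'pin'])

// Everything else (e.g. config editing) would be rejected until an
// authentication mechanism exists.
function requiresAuth (endpoint) {
  return !OPEN_ENDPOINTS.has(endpoint)
}
```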

lidel commented 5 years ago

@Gozala shared some relevant ideas in Progressive peer-to-peer web applications (PPWA). I need to think about this more, but my gut feeling is Stage 2 could be refined by introducing sw/iframe-based API provider as the universal entry point.

We could do access control there (before it lands in the actual API), and also iterate on graceful fallback / opportunistic upgrade mechanisms (eg. internally using window.ipfs if ipfs-companion is present, or trying local node directly via js-ipfs-http-client before falling back to js-ipfs).

@mitra42 we started experimenting with a subset of the API that runs without authentication in ipfs-companion's window.ipfs proxy; the current whitelist is here. The lack of a permission prompt comes at the price of risking a rogue website preloading malicious content to your node via dht.get, or discovering your identity by adding unique content and calling dht.findprovs. This is also possible on the old web with XHRs, but an IPFS node additionally shares the preloaded data, which may be problematic in some scenarios.

mitra42 commented 5 years ago

We really don't want to be running this through ipfs-companion. We want to run IPFS in the web browser with the libraries (js-ipfs and js-ipfs-api) integrated in the page, so that the user doesn't NEED to do anything other than visit the page, but we do want to take advantage of a local peer if one exists. I acknowledge the risks, but I think they are much smaller than the loss of functionality from not being able to use a local IPFS peer at all, or even worse, the current situation where people running a peer have a choice between not being able to use it for anything local (leaving CORS on) or exposing themselves to all kinds of malicious attacks by turning CORS off, since there is no authentication even for damaging activities.

fiatjaf commented 5 years ago

To me it seems that IPFS Companion is great, because it enables opt-in. I really don't want websites using my local IPFS node just because I have one, but if I enable IPFS Companion then I'm telling them they can.

At the same time, IPFS Companion abstracts away the need to inject IPFS libraries and/or make manual calls to the IPFS API from webapps that may use a local IPFS node. You can just use window.ipfs (if it is present and allowed) and that's it; otherwise you don't use it, or fail entirely and tell the user about it.

Gozala commented 5 years ago

To be clear, what I was suggesting is to make, say, companion.ipfs.io facilitate pretty much what the ipfs-companion add-on does today, through a service worker. If you also happen to have the add-on installed, the SW could leverage that as well.

As for opting-in / permissions, companion.ipfs.io could do that based on the client origin.

mitra42 commented 5 years ago

@fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org) that wants to run in any situation: it can bundle js-ipfs (and js-ipfs-api), but it can't require users to download anything. We have code that tries to autodetect in our IPFSAutoConnect function at https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81. It currently fails in most cases because the local IPFS peer refuses CORS.

A vanishingly small portion of users will have IPFS Companion installed, because (as far as I can tell) it doesn't add anything unless they want to interact with IPFS directly. Some might have IPFS, or a nearby IPFS node, as part of the dweb-mirror project. We could include the IPFS code from ipfs-companion in the Wayback Machine extension, which a larger number will have installed, but we haven't had anyone (volunteer or paid) with the bandwidth and browser-extension expertise to either bundle js-ipfs directly into our extension, or bundle some part of ipfs-companion and figure out all the browser limitations.

Gozala commented 5 years ago

> @fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org that wants to run in any situation, it can bundle js-ipfs and (js-ipfs-api) but it cant require users to download anything. We have code to try and autodetect in our IPFSAutoConnect function at https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81. It fails in most cases currently because the local IPFS peer refuses CORS.

I am building a proof of concept of proposed idea. I'll be happy to share it here once it's ready.

Gozala commented 5 years ago

I’ve put together a proof of concept that attempts to show the proposed idea is possible. There is some good news and some bad news. I’ll start with what I have working:

https://github.com/gozala/lunet

As for the bad news:

Gozala commented 5 years ago

I made a little more progress on my prototype:

So with npm run local running and the IPFS daemon running (with Access-Control-Allow-Origin configured to respond to https://lunet.link), I'm able to access IPFS content through the Service Worker. In fact, I'm able to load webui and it seems to work with no changes (except on Safari, because it blocks http://127.0.0.1 from https; that should be fairly easy to fix - webui would just need to talk to the SW instead).

(screenshot: webui loaded through the Service Worker, 2019-01-05)

Disclaimer: I need to fix how the SW updates; right now the only way to get it updated is to manually unregister it from devtools and then load https://lunet.link so it can install a fresh one.

Gozala commented 5 years ago

Next thing I want to do is create another site say https://gozala.io/webui-demo that would embed lunet.link to host just webui.

BTW, I think IPFS-HTTP-API would need to learn to pick up some config changes through the API itself. Ideally, https://gozala.io/webui-demo during first run would do an OAuth-like flow with http://lunet.link and through that configure IPFS-HTTP-API so that Access-Control-Allow-Origin includes the https://gozala.io/ origin.

Gozala commented 5 years ago

After more research I am considering an alternative approach. I think it would work better than the current approach, where the App SW needs to connect to the Daemon SW, because SWs are really eager to terminate, and that problem is multiplied by the fact that we're trying to keep the Daemon SW alive and connected to the App SW: as they both race to terminate, either of them succeeding breaks the MessageChannel, which also happens to be impossible (without hacks) to detect on the other end.

This is why I'm considering an alternative approach

The Daemon site (the one embedded in an iframe) will spawn a SharedWorker (and fall back to a Dedicated Worker pool if the API is not available - thanks, Apple 😢). This way we don't have to fight to keep the Daemon SW alive: as long as one Daemon page is around, the worker will be able to keep the connection alive. In practice that should be the case as long as there is at least one active client app. The only case where that is not true is if all apps have been closed and you later open one; that case is fairly easy to detect (the SW has no clients), and the SW can then serve a page that just embeds the Daemon iframe and, once the connection between the Daemon Worker and the SW is established, redirect to the actual page that was requested. (Please note that while this sounds complicated, it is what happens in the current setup, and it works remarkably well.)

It does imply that client apps need to embed the Daemon iframe, or else the corresponding worker will terminate. However, that was more or less a problem already, and I was already considering working around it by appending to navigation responses. Additionally, that added markup can be used to prompt the user for permissions (and it needs to be within the iframe so privileges can't be escalated).

This approach has an additional advantage for the in-browser node case, as frequent terminations don't exactly mix well with that.

The trickiest bit is going to be supporting browsers without the SharedWorker API. In that case the idea is the following: once the iframe with the Daemon loads, it says "hello" on a BroadcastChannel. If there is any document that has already spawned a Worker (let's call it the supervisor), it responds back with a MessagePort connected to its own worker and an index it was assigned (by incrementing). If no one responds within a short time frame, the document assumes supervision and starts the index. The supervisor, on its beforeunload event, broadcasts a "goodbye" message with the index of the next supervisor being nominated, at which point the next one in line spawns a worker and acts as supervisor. Every document messages the supervisor on beforeunload so the supervisor can nominate a new supervisor on exit. That does mean the worker's lifetime is inconsistent; however, even in the worst possible scenario it would probably still be better than a SW already is.
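The nomination step of that scheme (deciding which document spawns the worker next) might look like the following. The wrap-around rule here is my assumption; the comment above doesn't pin it down:

```javascript
// Given the departing supervisor's index and the indexes of documents still
// alive, nominate the next supervisor: the smallest live index above the
// departing one, wrapping around to the smallest live index overall.
function nominateNext (supervisorIndex, liveIndexes) {
  const sorted = [...liveIndexes].sort((a, b) => a - b)
  const next = sorted.find(i => i > supervisorIndex)
  return next !== undefined ? next : sorted[0]
}
```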

It is also worth considering that if the Daemon manages to connect to a companion add-on or a local Daemon through the REST API, there will be no need to even spawn any workers. Still, there will be some extra work to consider, like propagating content added to the in-worker node to the local Daemon.

Edit: Not sure what was supposed to follow this dangling "Unfortunately it".

lidel commented 5 years ago

This is great. I've been thinking about what developer-facing artifacts could be extracted from this, and I think a drop-in library/toolkit that acts as a replacement for standalone js-ipfs is the way to go, as it should help with addressing two high-level problems:

  1. "Running the same website (Origin) in multiple tabs without spawning multiple instances of js-ipfs"
    • Every website runs its own node, once per Origin(s)
    • No user prompt, hardcoded access control: a simple ability to whitelist multiple Origins to share the same worker would make various deployments a lot easier.
  2. "Global, shared ipfs instance that can be used by any Origin" (the original end goal)
    • Possible to run your own, but most people will use the default provided by the library
    • User prompt for access control.

@Gozala I agree that SharedWorker is worth investigating. To remove the need for access control and keep things simpler, we may want to focus on (1) initially, as its security perimeter is easier to understand.

> Unfortunately it

...? (the suspense is killing me :sweat_smile:)

Gozala commented 5 years ago

> ...? (the suspense is killing me 😅)

Oops, I'm not sure how my comment ended up like that, nor can I remember if there was anything specific I was going to say. Sorry!

Gozala commented 5 years ago

I spent a little more time on this and have currently implemented something in between what I originally made and the alternative option I described. Current status: things work really well on Chrome and Firefox, but I'm struggling to identify the issue with Safari.

At the moment the setup looks as follows:

Client App / Site

The client site in the example, https://gozala.io/peerdium, needs to serve two files:

  1. index.html that bootstraps everything up. It looks like this:

     <meta name="mount" content="/ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/" />
     <script type="module" async src="https://lunet.link/lunet/client.js"></script>

    Where lunet/client.js does the ceremony of embedding https://lunet.link in an iframe and registering the service worker ./lunet.js (the second file, described below); the path is also configurable via a meta tag.

  2. ./lunet.js just imports https://lunet.link/lunet/proxy.js, which takes care of serving content under the mounted path (as seen in the meta tag). This means that https://gozala.io/peerdium/index.html will map to /ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/index.html and will be served through the client by means of the iframe it set up. lunet.js looks as follows:

    importScripts("https://lunet.link/lunet/proxy.js")

In terms of interaction this is what happens:

Host

The document that the client embeds in an iframe is what I refer to as the host. The host document is also pretty much just this: <script type="module" src="https://lunet.link/lunet/host.js"></script>, and it is what is served under https://lunet.link, which is to say the interesting stuff happens in lunet/host.js, which is:

Wishlist

Here are the things I would like to change about this setup

  1. As you can see, the only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally, a dnslink TXT record should be all it takes.

  2. On one hand, the host should not need to register a SW, because in practice a SharedWorker would do a better job here. I still have it, though, so that it can load https://lunet.link/lunet/host.js while offline; however, it would make sense to figure out a way to do that without a SW.

Gozala commented 5 years ago

It turns out Safari does not implement BroadcastChannel either, so my SharedWorker polyfill idea isn't going to work out :(

Gozala commented 5 years ago

Alright, I think something else could be done on Safari (or anywhere SharedWorker isn't available but ServiceWorker is): we can spawn a Service Worker which, once activated, starts broadcasting a ping to all its clients, which in turn respond with a pong message back, and keeps repeating this:

const extendLifetime = async () => {
  await sleep(1000 * 60 * 4) // Firefox will wait for 5 mins on an extendable event, then abort.
  const clients = await self.clients.matchAll({ includeUncontrolled: true })
  for (const client of clients) {
    client.postMessage("ping")
  }
  await when("message", self)
}

// Helper (was implied in the original snippet): resolve after `ms` milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

const when = (type, target) =>
  new Promise(resolve => target.addEventListener(type, resolve, { once: true }))

self.addEventListener("activate", event => event.waitUntil(extendLifetime()))
self.addEventListener("message", event => event.waitUntil(extendLifetime()))

I believe this should keep the Service Worker alive as long as there are clients talking to it, which is in fact the case for SharedWorker.

Gozala commented 5 years ago

Got it working across Firefox, Chrome & Safari!

(screenshot: lunet working across Firefox, Chrome and Safari, 2019-01-14)
lidel commented 5 years ago

This is fantastic, especially getting it to work on Safari :+1:

I really like the mount metaphor and how little code the end developer needs to put on the static page. This is exactly what we should aim for.

@Gozala Regarding the first item from your Wishlist:

> As you can see only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally it should just take dnslink TXT record should be all it takes.

We have an API for DNSLink lookups, but may want to support <meta> header as an optional fallback. Something like this:

  1. Try to get the latest value for the mount point by reading DNSLink:
    • ipfs.dns(window.location.hostname, {r: true})
      .then(dnslinkPresent)
      .catch(dnslinkMissing)
  2. If (1) returned an error (no DNSLink, or the API is down), fall back to the version from <meta> (if present)
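The lookup order above could be wrapped in one helper. In this sketch, `resolveDnslink` and `readMetaMount` are injected stand-ins (real code would call `ipfs.dns()` and read the `<meta name="mount">` tag):

```javascript
// Try DNSLink first; if the lookup fails (no record, or the API is down),
// fall back to the value baked into the <meta name="mount"> tag.
async function resolveMount (resolveDnslink, readMetaMount) {
  try {
    return await resolveDnslink()
  } catch (_) {
    const fallback = readMetaMount()
    if (fallback) return fallback
    throw new Error('no DNSLink record and no <meta> fallback')
  }
}
```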

ps2. I see how a hybrid approach could be supported as well, where static HTML with the regular website is returned with one extra <script>, then the PPWA library replaces the document with a more recent version read from DNSLink.

Gozala commented 5 years ago

@lidel I’ve considered doing a DNS lookup instead of the meta tag (as per your suggestion); however, the goal is for the user not to have to host a static site for bootstrapping in the first place. Basically I want the flow to be ipfs add -r ./ plus adding the hash to a DNS record. Which is to say, ideally the gateway would just serve the bootstrap page and lunet.js.

Gozala commented 5 years ago

For now I have shifted my focus toward figuring out / fixing the issue in Firefox that prevents requests from secure contexts (https) to http://127.0.0.1:* in worker contexts (see Bug 1520381). Self-signed certificates in Firefox (unlike Chrome and Safari) still require users to take additional actions, providing a very poor experience.

lidel commented 5 years ago

@Gozala Is there a way we could remove the need for self-signed certificates? I am afraid adding stuff to system cert vault simply won't work in many environments. It may also get our apps automatically blacklisted in various enterprise-y places due to "malicious behavior".

So far my ideas are:

Gozala commented 5 years ago

> @Gozala Is there a way we could remove the need for self-signed certificates? I am afraid adding stuff to system cert vault simply won't work in many environments.

It is needed on Safari, so I would assume only on Mac (are people actually using Safari on Windows?). I don't know about Edge, or whether it's worth caring about, as it's moving to Chromium.

I have been using https://www.npmjs.com/package/tls-keygen to do that on first run, and it does seem to work just fine on my Mac; however, that does not mean it will work everywhere. Maybe doing it only on Mac, or making it opt-in, would be a reasonable compromise?

> It may also get our apps automatically blacklisted in various enterprise-y places due to "malicious behavior".

I'm not sure I have anything to address this; however, the certificate can be issued only for 127.0.0.1, which means it doesn't create any attack vector, although I don't know if that argument will work in the enterprise.

I also suspect Safari is not a choice for enterprise, so maybe it's not that relevant?

> * (A) Is it feasible to get a valid certificate for `https://localhost.lunet.link`, then switch DNS to point at localhost and set up the tray app to use the cert when talking to the browser?

The problem is you'd have to ship the cert + key with your app, which can then be extracted and used by an evil hacker to deploy an attack vector.

> * (B) What if ipfs-companion provides an API object that serves the purpose of `https://127.0.0.1:9000/`?

👍 I was implying that. The idea behind https://lunet.link is to act as a router that can use whatever means are available to provide access to the IPFS network. I would expect ipfs-companion could be one option, the desktop app another, and an in-browser node yet another.

In fact, ideally all IPFS-based desktop / mobile apps would embed a core, enabling something like https://lunet.link to take advantage of the bundled libp2p.

> I am highly interested in use case where embedded js-ipfs running in IPFS Companion is able to open IPFS pages without touching public gateway. Right now js-ipfs has a limited use in browser context, because it can't open TCP port for Gateway functionality, so we need to use a public one. I wonder if PPWA approach coupled with ipfs/js-ipfs#1820 could solve this.

Sorry, I'm a little confused here. The primary intent of https://lunet.link (which really should be just https://bridge.ipfs.io/ or something like that) is to provide read / write access to the IPFS network without public gateways, by leveraging a native app or the companion extension. Only if neither is available does it fall back to either the gateway you configured it to fall back to, or more likely gateway.ipfs.io because you have not.

Gozala commented 5 years ago

@lidel This is a good article that goes into the details of what options are available for localhost certificates: https://letsencrypt.org/docs/certificates-for-localhost/

Gozala commented 5 years ago

There is actually one more thing I have been thinking about: essentially, what I'm trying to do is reincarnate "Flash Player" (I know, it pains me as well). Flash was there to fill a gap in the web platform and had a relatively good on-boarding story; furthermore, it took only one web app to convince you to install it, and all the other apps were able to leverage that. In a way, that is what I'm aiming for: any IPFS-based system app can fill the gap in the modern web platform - it could be ipfs-desktop or textile-photos or whatever the next thing will be - and at the end of the day, if you have one, the rest of the web will be able to take advantage of the IPFS network without gateways.

This got me thinking: can we actually use the Flash player itself? I know it's wild and Flash's days are numbered, but would it be too wild to polyfill the network stack using Flash?

lidel commented 5 years ago

Probably not; Flash is dead. Adobe will stop maintaining it in 2020, and browsers such as Chrome will remove support for it in the middle of 2019.

JS-IPFS node embedded in IPFS Companion being the "Flash Player of IPFS" is more likely.
Some websites already ask users to install Metamask and IPFS Companion :)

Gozala commented 5 years ago

I wrote down a more elaborate explanation of how things work in my setup: https://hackmd.io/s/r1ovevAfN#

I have also made the following changes:

Next steps

Gozala commented 5 years ago
  • Safari not talking to http is another major roadblock. Without self-signed certificates I don't see any other way to overcome it. Using the Flash player is the only option I can see, but as @lidel pointed out, it's not really a viable one, so if you have one I'd love to hear it.

I forgot about WebRTC. If I recall correctly, there was a way to connect peers through WebRTC without signaling servers on the same machine, which might be one way that, say, IPFS-Desktop and Safari could get connected.

Gozala commented 5 years ago

So I am able to connect my Firefox and Safari by manually copying & pasting the offer and answer RTCSessionDescriptions between instances. I'm not entirely sure what the SDP format is, but assuming it can be manually assembled (of which I'm not confident), it would allow connecting a website with a node in ipfs-desktop.

RangerMauve commented 5 years ago

In case you find a way around the dtls fingerprint thing, I've been using this for parsing SDP: https://github.com/clux/sdp-transform

Gozala commented 5 years ago

Turns out Safari is deliberately blocking access to http://127.0.0.1; Bug 171934 has the relevant conversations, sadly not very constructive from either end. @lidel, unless I'm overlooking some sarcasm from the Apple representative, he seems to suggest that installing a Root CA is what they'd prefer over allowing talking to the loopback address (quoting the corresponding comment below):

> Installing a trusted certificate doesn't sound so bad.

It also turns out Edge has a fix in place to allow talking to loopback addresses: https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/11963735/

Gozala commented 5 years ago

In the Safari bug thread there is also a pointer to a post that exploits window.open as a means to overcome the limitation. I'm not convinced that having an extra tab around is a reasonable route, but posting it here just in case.

Gozala commented 5 years ago

Another idea, which I'm not sure how viable it may be, is to do the following:

  1. Whenever an IPFS node is created, create a corresponding user, which would roughly translate to something like ${peerID}.ipfs-peers.id.
  2. Create a DNS record for ${peerID}.ipfs-peers.id pointing to 127.0.0.1.
  3. Obtain a TLS certificate + key for ${peerID}.ipfs-peers.id (through Let's Encrypt or something like that).
  4. Use the node's public key to encrypt the TLS key + certificate and put it on the IPFS network, so that the IPFS peer can get it and store the TLS key + certificate in its own repo directory.
  5. IPFS desktop on first run / when accessing webui could load lunet.link?${peerID} so that the peer info can be obtained and used to communicate with the loopback address.

I think there is still a risk that an attacker might attempt to create a peer to obtain a TLS key + certificate for a subdomain. However, as I understand it (and I'm by no means qualified), this is not a useful vector for targeting users. On the other hand, an attacker might attempt to use the obtained keys as a means to reduce the credibility of ${peerID}.ipfs-peers.id, which in turn might cause ipfs-peers.id to lose its certificate, breaking access for everyone.

Gozala commented 5 years ago

I did some more research on Safari. I think the most viable route is:

  1. Bundle Safari App Extension with ipfs-desktop.
  2. Use that app extension to inject scripts for lunet.link that would expose access to Daemon through a Messaging API.

In case the extension does not provide access, assume ipfs-desktop isn't installed and use an in-browser peer. In fact, the ipfs-companion browser extension probably does the same.

This addresses my primary goal of not having to ask the user to install multiple things, as the app will deliver the extension for Safari, and other browsers will be able to talk to the app without extension mediation.

This also seems like a viable route for mobile Safari. On Android, I would expect browsers to be able to talk to http://127.0.0.1, but I have not tested that; also, from what I know, on Android it is possible to run a local server as a service.

Gozala commented 5 years ago

I've put some effort into getting the in-browser node working as a backup for when the daemon isn't available. This basically means the document in an iframe gets a message that is effectively a Request representation for the IPFS Daemon REST API, which it serves by either:

  1. Forwarding request to a daemon on loopback address and forwarding Response back.
  2. Forwarding to an in-browser IPFS node running inside a SharedWorker and messaging result back.

Things at my disposal are

https://github.com/ipfs/js-ipfs-http-client
https://github.com/ipfs/js-ipfs

Which is problematic for the following reasons:

  1. Ideally there would be something that abstracts across these two, possibly attempting the Daemon and falling back to the SharedWorker. (However, an argument could be made that that's what I'm doing.)

  2. The service worker produces a Request object that the iframe can forward to the Gateway if it's available. However, if it is not, it needs to serialize the request into some message corresponding to an IPFS API call, then on the SharedWorker side deserialize that and dispatch it to the corresponding IPFS API call, then create some result object that the iframe will transform into a Response that the SW can serve.

I think it would make much more sense if there were something like an IPFSDaemon() API that would just take a Request complying with the REST API and return a Response also corresponding to the REST API. That way it would avoid multiple serialization, deserialization, dispatch, etc. steps.
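A toy version of that IPFSDaemon() shape might look like this. The routing table, endpoint, and helper names here are illustrative assumptions, not an existing API:

```javascript
// Wrap an IPFS node in a function that speaks REST-API-shaped requests and
// responses, so the same object can stand in for a local daemon or an
// in-browser node without per-hop re-encoding.
function makeDaemon (node) {
  return async ({ method, path }) => {
    if (method === 'GET' && path.startsWith('/api/v0/cat')) {
      // Naive query parsing, for illustration only.
      const cid = path.split('arg=')[1]
      return { status: 200, body: await node.cat(cid) }
    }
    return { status: 404, body: 'unknown endpoint' }
  }
}
```

The iframe would then hand the same request object to either backend and serve whatever Response-shaped object comes back, with encode/decode happening only at the edges.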

Appendix

I forgot to mention that on the client app end there will likely be another js-ipfs-http-client that talks to the SW, so that would be another round of the same encode + decode + dispatch, which is why I think it would make sense to have a lower-level API that is on par with the Daemon REST API, so the encode / decode / dispatch happens only at the edges instead of at every link of the request flow.

Gozala commented 5 years ago

Partially related to my last comment: in the Mozilla Devtools protocol I worked on a thing that allowed the server to describe its protocol during connection by passing down an API spec, which the client side then used to generate a full client. This avoids having to maintain a client that more or less just encodes / decodes / tracks exchanges and occasionally gets out of sync with the server or breaks due to some human error. It also made it possible to generate clients in other languages.

The IPFS API would be even easier, as there is no notion of GC-able objects in the server process (which complicates things) that you want to reflect in the client. How feasible is it to compel stakeholders to do something along those lines?

lidel commented 5 years ago

> I think it would make much more sense if there was something like IPFSDaemon() API that would just take Request complying with REST API and return Response also corresponding to REST API. That way it would avoid multiple serialization, deserialization, dispatch, etc... steps.

@Gozala I fully agree. The code responsible for the "REST API" (HTTP Gateway) in js-ipfs is disabled in the browser context because it can't open a TCP port, but we could expose it as a JS function to simplify use in contexts like a Service Worker. See my proposal in https://github.com/ipfs/js-ipfs/issues/1820; if you have ideas on how the browserified Gateway API should look, comment there.

> [..] passing down an API spec, that then was used by the client side to generate a full client to avoid having to maintain a client. [..] How feasible is it to compel stakeholders to do something along those lines?

It sounds like something worth discussing when IPFS API v2 is designed this year. I am sure any idea that could decrease the maintenance burden of js-ipfs-http-client, ipfs-postmsg-proxy, etc. will be taken into consideration. Will CC you when there is a meta-issue on this.

Gozala commented 5 years ago

Status update

I spent more time on this to get the in-browser fallback working. It took quite a bit more effort than I anticipated, but the good news is it works. Below is an image of a fully functional webui loaded via lunet from IPFS through the in-browser node, with 0 changes (to the webui code).

(Screenshot: webui loaded via lunet through the in-browser node, 2019-01-31)

My peerdium demo works with the in-browser node with 0 changes as well. 🎉

At the moment this version lives in a separate branch because:

Details

Open questions (would love feedback)

Gozala commented 5 years ago

I'm inclined to think that some lightweight version of the in-browser node should always be there: if not replicating data, then at least maintaining a list of relevant CIDs, so that if the native node is down it can still present the user with a consistent library and lazily attempt to fetch relevant data off the network.

Should I be looking at IPFS Cluster for this?
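One way to sketch the "maintain a list of relevant CIDs, fetch lazily" idea; all names here are hypothetical, and `fetchBlock` stands in for whatever provider (in-browser node, gateway) is currently reachable:

```javascript
// Sketch: a minimal bookkeeper that records CIDs the user cares about so the
// library listing stays consistent even when the native node is down; content
// itself is only fetched lazily from the network on demand.
class CidLibrary {
  constructor (fetchBlock) {
    this.known = new Map() // cid -> metadata
    this.fetchBlock = fetchBlock // provider function (assumption)
  }
  remember (cid, meta = {}) { this.known.set(cid, meta) }
  list () { return [...this.known.keys()] }
  async get (cid) {
    if (!this.known.has(cid)) throw new Error(`unknown CID: ${cid}`)
    return this.fetchBlock(cid) // lazy fetch off the network
  }
}
```

The point of the sketch is that the CID list is cheap to keep in the browser even when full replication is not.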

MidnightLightning commented 5 years ago

@Gozala What to do when the user used the in-browser node first and then the native IPFS node? It's probably possible to replicate data from the in-browser node to the native one, but it does not seem trivial to deal with conflicts or timing when there is a lot of data to be pushed.

Should I be looking at IPFS Cluster stuff for this stuff?

I found this issue while searching around a concept I've been thinking about. The idea of having both an in-browser node and a native/standalone node is something I think should be fleshed out as a user norm. Using how users interact with services like Dropbox as an example: I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than the ones I use most frequently on my laptop or desktop.

I think it would be ideal if, among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, it additionally allowed a user to indicate another node as "theirs" (add authentication/credentials?), and then those nodes actively synced pins/virtual filesystem structures between them. That way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data.

I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary). I'd also like the control to indicate that my workstation node should pin/keep a copy of everything the others pin (primary backup), the laptop should as well when it's online (secondary backup), and the phone node would only pin important things (space concerns), but being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

From that perspective, it would be fine if all in-browser nodes stayed in-browser nodes (no need to "change over" to a standalone node if one came back online), but pinned/known file syncing could be very useful.

mikeal commented 5 years ago

I'm inclined to think that some lightweight version of in-browser node should always be there if not replicating data at least maintain list of relevant CIDs so that in case of native node being down it can still present a user with a consistent library & lazily attempt to fetch relevant data off of network.

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser-IPFS-node” and a js-ipfs service worker?

The reason I’d like to think about things this way is that this exercise might surface differences between “features missing in the web platform” that we don’t have without native integration and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent. Having a similarly scoped shared resource will need to drastically improve that security story and it’s not yet clear to me where this is the responsibility of IPFS or if we’re actually missing a feature or integration in the browser.

Gozala commented 5 years ago

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser-IPFS-node” and a js-ipfs service worker?

I can only speak for myself; what I think and am going for is "the browser is your IPFS node". JS-IPFS, SW, IPFS-Desktop, etc. are just polyfills to deliver / explore that experience.

The reason I’d like to think about things this way is that this exercise might surface differences between “features missing in the web platform” that we don’t have without native integration and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

I think there is a general assumption that the web platform lacks features to implement a full-fledged IPFS node in the web content context. I think that's an incorrect way to look at things. Even if browsers exposed all the low-level networking primitives to allow it (which is highly unlikely), each browser tab running its own IPFS node would be a terrible experience.

That is to suggest that if / when IPFS is adopted by a browser, the browser itself will become the IPFS node and expose a limited API to access and store content off the network. And yes, it will impose the same or similar origin-separation concerns as it does today.

The goal of this exploration is to polyfill the described experience through the variety of tools available:

That way, applications:

  1. Can be loaded off the IPFS network
  2. Are isolated by origin
  3. Have a way to read / write data (based on the origin)

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent.

Yes, and that is a huge issue waiting to be exploited. I would absolutely encourage locking it down. Last time I checked, the ipfs daemon / gateway comes with a default of Access-Control-Allow-Origin: *, which means any site could take control of it and exploit it.

Both should be locked down to one single origin, maybe access.ipfs.io or something along those lines. The same is true for the routing services that js-ipfs connects to: they should be open only to a single origin controlled by PL.

Having a similarly scoped shared resource will need to drastically improve that security story and it’s not yet clear to me where this is the responsibility of IPFS or if we’re actually missing a feature or integration in the browser.

No features are missing on that end. What this PoC does is use a special origin (in this case lunet.link, but it should be access.ipfs.io or something) that provides access to the IPFS network and mediates access control with the user. This way it is able to control access rights based on the user's consent rather than allowing each app / site to do whatever it wants. This also allows the user to access a library of all their data in one place. While that is what the browser should be doing, I think it is worth doing it on the IPFS end right now to exercise this, learn from the experience, and establish a cowpath.
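The lock-down argued for above could look roughly like this; a sketch, where `access.ipfs.io` is the comment's example origin, not a real deployment:

```javascript
// Sketch: instead of the wildcard default, echo CORS headers only for the
// single trusted access-point origin; everything else gets no CORS headers.
const ALLOWED_ORIGIN = 'https://access.ipfs.io' // illustrative

function corsHeadersFor (origin) {
  if (origin !== ALLOWED_ORIGIN) return null // browser blocks cross-origin read
  return {
    'Access-Control-Allow-Origin': origin,
    'Vary': 'Origin' // caches must not reuse the response across origins
  }
}
```

With this shape, arbitrary sites can no longer script the daemon's API; only the mediating access-point origin can, and it enforces per-app consent itself.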

Gozala commented 5 years ago

I have a cluster branch now which runs an in-browser node and attempts to use the local native node through the REST API simultaneously. At the moment it's pretty dumb: it just forwards requests to both nodes and attempts to serve the response from the native node, with a fallback to the in-browser node.

At the moment there is no attempt to sync the two; for that it would probably make the most sense to borrow the logic from ipfs-cluster rather than trying to hack things together.
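The prefer-native / fall-back-to-in-browser strategy described above can be sketched as follows; the two request functions are placeholders for the actual transports:

```javascript
// Sketch: prefer the native daemon's answer, fall back to the in-browser node.
async function dualFetch (path, nativeFetch, browserFetch) {
  try {
    return await nativeFetch(path) // native node first
  } catch (_) {
    return browserFetch(path) // native down or failed: in-browser fallback
  }
}
```

A smarter version would race the two and cancel the loser, but the sequential form matches the "pretty dumb" behavior described in the comment.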

I'll focus on getting this working in Safari through a SW polyfill for SharedWorker, and then deploy the current version.

Gozala commented 5 years ago

Responding to @MidnightLightning

Using as an example how users interact with services like Dropbox, I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than I use most frequently on my laptop or desktop.

That is a good point! However, the case of a native node vs an in-browser node is different, as from the user's point of view it's the same device.

I think it would be ideal that among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, that it additionally allows a user to indicate another node as "theirs" (add authentication/credentials?), and then those nodes actively sync pins/virtual filesystem structures between them.

I have been thinking about this in a slightly different way. I imagine the library organized as "collections" (or threads in Textile terms). The idea is that you can invite others to collaborate on those collections. Those others can be your other devices, your friends, or pinning services.

In that way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data. I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary), and would like the control to indicate my workstation node should pin/keep a copy of everything the others pin (primary backup), the laptop should as well, when it's online (secondary backup), and the phone node would only pin important things (space concerns), but being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

I think all those use cases fit nicely with the solution described above; furthermore, it follows the interaction flow: the user, during the sharing / publishing phase, chooses who to share with.

Implementation-wise, it seems that a "collection" should just be an "ipfs-cluster".

lidel commented 5 years ago

@Gozala I like the idea of a seamless/self-healing abstraction for the browser context (Access Point Facade), but figuring out how to handle the surface of the IPFS API when the API provider is a facade on top of multiple nodes going online and offline will be a challenge.

Agree we should look at ipfs-cluster for inspiration, but the security considerations will not map exactly. E.g. in the browser we want to build a security perimeter around the Origin and, based on it, introduce key/MFS write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config, etc.

Seems that the MVP would need:

Gozala commented 5 years ago

@Gozala I like the idea of a seamless/self-healing abstraction for browser context (Access Point Facade), but figuring out how to handle the surface of IPFS API when API provider is a facade on top of multiple nodes going online and offline will be a challenge.

I agree. I am also getting more and more convinced that exposing the full IPFS API may not be a good idea in the first place. While it is cool to have webui running over this, I think it's the wrong abstraction for most apps.

Agree we should look at ipfs-cluster for inspiration, but security considerations will not fit exactly.

I need to write a coherent story about the experience I have in mind, but before I get around to doing that, here is the gist:

eg. in browser we want to build security perimeter around Origin and based on it introduce key/mfs write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config etc

Absolutely! However, I think that should happen at the lunet (Access Point Facade) level, before any calls are issued to any of the IPFS nodes.

On sandboxing, I'm still working out some details in my head, but I think there is a real opportunity to improve on the mess we're in on the conventional web by limiting read / write access to only the app's resources / the document being operated on. The largest issues on the web are due to third parties tracking and aggregating user data on some servers. I think it would be really great if we enforced a setup like ://${app}/${data}, where app maps to some CID and therefore the app is able to read / load stuff from it, and ${data} is an MFS entry that the app is able to both read and write to.

In this setup the app can't really spy on the user; sure, it can save some data, but that data is local and the user personally needs to choose to share it, and even then the app isn't really able to let its own server know where to grab it from.

There are things to be worked out, but I'm inclined to think that a combination of SW and sandboxed iframes might allow for such sandboxing.
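The `://${app}/${data}` split could be routed roughly like this; a sketch where the `/ipfs/` and `/mfs/` path conventions are invented for illustration:

```javascript
// Sketch: the first path segment identifies the app (an immutable CID,
// read-only), and the remainder is a per-app MFS subtree it may read/write.
function routeRequest (pathname) {
  const [, app, ...rest] = pathname.split('/')
  return {
    appRoot: `/ipfs/${app}`,                   // read-only app code
    dataPath: `/mfs/${app}/${rest.join('/')}`  // read/write app data
  }
}
```

Scoping the writable subtree by the app identifier is what keeps one app from reading another app's data, per the origin-perimeter idea above.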

Gozala commented 5 years ago

I finally got it working in Safari 🥳 (Debugging a SW in Safari is quite a throwback to the old days of JS with no debuggers, except there's no alert or reliable way to print output either 😂)

Now the peerdium fork loads with no changes, using the in-browser node running a SharedWorker polyfill implemented with a ServiceWorker.

Issues

However, some content, like posts created by me, seems to fail to load; specifically, the js-ipfs call ipfs.get(cid) returns a promise that never resolves nor rejects. I also make a corresponding request to a (gateway?) server, which succeeds with 200, but still nothing from js-ipfs, and given no reliable debugging in SW on Safari I was unable to track down what the issue is. The same code with the same codepath works as expected in Firefox and Chrome, so 🤷‍♂️

I’ll do more digging tomorrow, but thought I’d post in case this is a known issue.

Gozala commented 5 years ago

Safari also seems to reject POST requests with form data as the body. Edit: It also appears that FormData isn't available in the ServiceWorker context in Safari, which might be the underlying reason.
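One workaround when FormData is missing in a worker scope is to encode multipart/form-data by hand; a minimal text-only sketch (boundary and header choices are illustrative):

```javascript
// Sketch: manual multipart/form-data encoding, usable as a fallback when
// FormData is unavailable (as observed in Safari's ServiceWorker context).
function encodeMultipart (filename, content, boundary = 'X-BOUNDARY') {
  const body = [
    `--${boundary}`,
    `Content-Disposition: form-data; name="file"; filename="${filename}"`,
    'Content-Type: application/octet-stream',
    '',
    content,
    `--${boundary}--`,
    ''
  ].join('\r\n')
  return { body, contentType: `multipart/form-data; boundary=${boundary}` }
}
```

The returned `contentType` must be set as the request's Content-Type header so the server can find the boundary; a real implementation would also handle binary payloads and boundary collisions.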

Gozala commented 5 years ago

It seems that in Safari the call to resolver.cid(ipfsNode, "/ipfs/QmedJqYfddTzygxpcDrtTJBupHn4qGHntPXBx8APNM5gE1") never resolves nor rejects. It comes from the ipfs-http-response library, specifically:

https://github.com/ipfs/js-ipfs-http-response/blob/7746dab433e3a57652e3222bb1cc6051a09576be/src/index.js#L63
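A generic debugging aid for hangs like this is to race the suspect promise against a timeout, so a never-settling call surfaces as an error instead of hanging silently (the helper below is generic, not part of ipfs-http-response):

```javascript
// Sketch: wrap a promise so that if it never settles, it rejects after `ms`.
function withTimeout (promise, ms, label = 'operation') {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  })
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

Wrapping the resolver call (e.g. `withTimeout(resolver.cid(node, path), 5000, 'resolver.cid')`) would at least turn the silent hang into a loggable error, which matters when the environment offers no debugger.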

Gozala commented 5 years ago

I am exploring an alternative approach to loading this, described here: https://github.com/Gozala/lunet/issues/2#issuecomment-463786508

Winterhuman commented 1 year ago

Just to partly revive the discussion: https://datatracker.ietf.org/doc/draft-ietf-dnsop-alt-tld seems like an interesting idea to remember. It reserves .alt as a non-DNS namespace which applications can claim for themselves, like CID.ipfs.alt leading to a local IPFS node if desired (and it says the names don't need to be DNS compliant!)
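For illustration, an application could recognize such a namespace by pattern-matching hostnames; a sketch, where the `<name>.ipfs.alt` convention and the mapping to a local node are hypothetical:

```javascript
// Sketch: treat `<name>.ipfs.alt` as an IPFS name, per the .alt draft's idea
// of application-claimed, non-DNS namespaces.
function parseAltHost (hostname) {
  const match = hostname.match(/^(.+)\.ipfs\.alt$/)
  return match ? { namespace: 'ipfs', name: match[1] } : null
}
```

Anything that doesn't match falls through to ordinary DNS resolution, which is exactly the split the draft intends.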