ipfs / in-web-browsers

Tracking the endeavor towards getting web browsers to natively support IPFS and content-addressing
https://docs.ipfs.tech/how-to/address-ipfs-on-web/
MIT License

The Future of "accessing API of remote IPFS node" #137

Open lidel opened 5 years ago

lidel commented 5 years ago

Started as a discussion between @lidel & @olizilla (2018-12-19)

Granting access to local or remote node remains a challenge both on UX and security fronts. This is an attempt to plot possible paths for going forward.

Disclaimer: below is not a roadmap, but a "what if" exercise to act as a starting point for the discussion and experimentation that follows in comments

Initial idea is to think about the problem in three stages:

Stage 1: window.ipfs.enable(opts)

Stage 2A: Opaque Access Point with Service Worker

[Ongoing research]

  • ETA: 2019+
  • [ ] Thin static HTML+JS is loaded to establish Access Point Service Worker (APSW), which acts as a proxy to IPFS API provider and exposes limited API/Gateway endpoints
  • [ ] Progressive peer-to-peer Web Applications (PPWA) talk to IPFS over APSW
  • [ ] APSW automatically picks the best IPFS provider (js-ipfs, remote/local HTTP API, ipfs-companion)
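The provider selection in the last bullet could be sketched as a simple probe-and-fallback loop. This is a hypothetical sketch, not APSW code; the candidate names and the `probe` shape are assumptions:

```javascript
// Hypothetical sketch of how an APSW might pick the best available IPFS
// provider. Each candidate carries a `probe` that checks availability
// (e.g. window.ipfs presence, or a fetch against the local HTTP API);
// probes are injected here so the selection logic stays self-contained.
async function selectProvider (candidates) {
  for (const candidate of candidates) {
    try {
      if (await candidate.probe()) return candidate.name
    } catch (_) {
      // Probe threw: this provider is unavailable, try the next one.
    }
  }
  // Last resort: an embedded in-browser js-ipfs node always "works".
  return 'js-ipfs'
}
```

The probe order would presumably prefer ipfs-companion, then a local HTTP API, before falling back to an embedded js-ipfs node.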

Stage 2B: HTTP/WS /api/v1/ with access controls

A bit speculative - work on /api/v1 has not started yet; we are collecting requirements

  • ETA: 2019? 2020?
  • [ ] Websites and apps access API of IPFS Node directly
  • [ ] Access controls are done by the IPFS Node itself; CORS is allowed by default (*)
  • [ ] /api/v1/ can start as an experimental overlay provided by ipfs-desktop
  • OAUTH-like flow introduced in Stage 1 remains the same
  • Real-time capabilities are supported over Websockets
  • [ ] window.ipfs in ipfs-companion implemented as a preconfigured js-ipfs-http-client rather than a proxy
  • The overhead of postMessage is removed
  • Access controls removed from ipfs-companion and now done by ipfs daemon itself

Stage 3: Nodes talking to each other over libp2p

This is a highly speculative idea with a lot of details to figure out, but the general idea is to replace legacy transports (HTTP, WebSockets) with libp2p

  • ETA: 2020+
  • Prerequisites:
  • [ ] pubsub is enabled by default and works in browser contexts
  • [ ] ipfs-companion == IPFS node (eg. runs an embedded js-ipfs node by default)
  • [ ] window.ipfs.enable() (and future API-provider libraries) give access to API from Stage 2 over p2p connection (eg. via ipfs p2p)
  • [ ] "follow" semantics exist and allow setting up various sync policies between nodes

Parking this here for now, would appreciate thoughts in comments below.

mitra42 commented 5 years ago

Stage 2 is when it gets interesting. Stage 1 requires installing IPFS Companion, and then any browser-based application has to detect the presence of both IPFS Companion and the local IPFS node, which complicates things to the point of being unlikely to happen.

If Stage 2 - or some version of it - were implemented, then for example the dweb.archive.org UI could detect the presence of a local node and use it as a persistent cache, rather than using js-ipfs with all the limitations that come from running in the browser (including lack of persistence after the browser window is closed, and the extreme CPU load that encourages people to close pages running IPFS).

Obviously relying on CORS in a content-addressed filesystem makes no sense to me, since both trusted and untrusted content can come from anywhere (e.g. from https://ipfs.io). One option I think would be worth considering, along with authentication, is allowing a subset of the API to run without authentication - e.g. get, add, urlstore, pin - while reserving more sensitive operations (like editing the config) until authentication is implemented.
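That split could be expressed as a trivial allowlist check. A sketch only: the endpoint names come from the comment above, the policy function itself is assumed:

```javascript
// Endpoints proposed above as safe to expose without authentication.
const OPEN_ENDPOINTS = new Set(['get', 'add', 'urlstore', 'pin'])

// Everything else (e.g. config editing) would be rejected until an
// authentication mechanism exists.
function requiresAuth (endpoint) {
  return !OPEN_ENDPOINTS.has(endpoint)
}
```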

lidel commented 5 years ago

@Gozala shared some relevant ideas in Progressive peer-to-peer web applications (PPWA). I need to think about this more, but my gut feeling is Stage 2 could be refined by introducing sw/iframe-based API provider as the universal entry point.

We could do access control there (before it lands in the actual API), and also iterate on graceful fallback / opportunistic upgrade mechanisms (eg. internally using window.ipfs if ipfs-companion is present, or trying local node directly via js-ipfs-http-client before falling back to js-ipfs).

@mitra42 we started experimenting with a subset of the API that runs without authentication in ipfs-companion's window.ipfs proxy; the current whitelist is here. The lack of a permission prompt comes at the price of risking a rogue website preloading malicious content to your node via dht.get, or discovering your identity by adding unique content and calling dht.findprovs. This is also possible on the old web with XHRs, but an IPFS node additionally shares the preloaded data, which may be problematic in some scenarios.

mitra42 commented 5 years ago

We really don't want to be running this through ipfs-companion. We want to run IPFS in the web browser with the libraries (js-ipfs and js-ipfs-api) integrated in the page, so that the user doesn't NEED to do anything other than visit the page, but we do want to take advantage of a local peer if one exists. I acknowledge the risks, but I think they are much smaller than the loss of functionality from not being able to use a local IPFS peer at all, or even worse, the current situation where people running a peer have a choice between not being able to use it for anything local (leaving CORS on) or exposing themselves to all kinds of malicious attacks by turning CORS off, since there is no authentication even for damaging activities.

fiatjaf commented 5 years ago

To me it seems that IPFS Companion is great, because it enables opt-in. I really don't want websites using my local IPFS node just because I have one, but if I enable IPFS Companion then I'm telling them they can.

At the same time, IPFS Companion abstracts away the need to inject IPFS libraries and/or make manual calls to the IPFS API from webapps that may use a local IPFS node. You can just use window.ipfs (if it is present and allowed) and that's it; otherwise you don't use it, or fail entirely and tell the user about it.

Gozala commented 5 years ago

To be clear, what I was suggesting is to make, say, companion.ipfs.io facilitate pretty much what the ipfs-companion add-on does today, through a service worker. If you also happen to have the add-on installed, the SW could leverage that as well.

As for opting-in / permissions, companion.ipfs.io could do that based on the client origin.

mitra42 commented 5 years ago

@fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org) that wants to run in any situation: it can bundle js-ipfs (and js-ipfs-api), but it can't require users to download anything. We have code that tries to autodetect in our IPFSAutoConnect function at https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81. It currently fails in most cases because the local IPFS peer refuses CORS.

A vanishingly small portion of users will have IPFS Companion installed, because (as far as I can tell) it doesn't add anything unless they want to interact with IPFS directly. Some might have IPFS, or a nearby IPFS node, as part of the dweb-mirror project. We could include the IPFS code from ipfs-companion in the Wayback Machine extension, which a larger number will have installed, but we haven't had anyone (volunteer or paid) with the bandwidth and browser-extension expertise to either bundle js-ipfs directly into our extension, or bundle some part of ipfs-companion and figure out all the browser limitations.

Gozala commented 5 years ago

> @fiatjaf and @Gozala - I can't figure out how to make either of those suggestions work in practice. Assume a website (such as dweb.archive.org that wants to run in any situation, it can bundle js-ipfs and (js-ipfs-api) but it cant require users to download anything. We have code to try and autodetect in our IPFSAutoConnect function at https://github.com/internetarchive/dweb-transports/blob/TransportIPFS.js#L81. It fails in most cases currently because the local IPFS peer refuses CORS.

I am building a proof of concept of proposed idea. I'll be happy to share it here once it's ready.

Gozala commented 5 years ago

I’ve put together a proof of concept that attempts to show the proposed idea is possible. There is some good news and some bad news. I’ll start with what I have working:

https://github.com/gozala/lunet

As for the bad news:

Gozala commented 5 years ago

I made a little more progress on my prototype:

So with npm run local running and the IPFS daemon running (with Access-Control-Allow-Origin configured to respond to https://lunet.link), I'm able to access IPFS content through the Service Worker. In fact, I'm able to load webui and it seems to work with no changes (except on Safari, because it blocks http://127.0.0.1 from https; that should be fairly easy to fix - webui would just need to talk to the SW instead).

(screenshot: webui loaded through the Service Worker, 2019-01-05)

Disclaimer: I need to fix how the SW updates; right now the only way to get it updated is to manually unregister it from devtools and then load https://lunet.link so it can install a fresh one.

Gozala commented 5 years ago

Next thing I want to do is create another site say https://gozala.io/webui-demo that would embed lunet.link to host just webui.

BTW, I think IPFS-HTTP-API would need to learn to pick up some config changes through the API itself. Ideally, https://gozala.io/webui-demo during first run would do an OAuth-like flow with http://lunet.link and through that configure IPFS-HTTP-API so that Access-Control-Allow-Origin includes the https://gozala.io/ origin.

Gozala commented 5 years ago

After more research I am considering an alternative approach. I think it would work better than the current approach, where the App SW needs to connect to the Daemon SW, because SWs are really eager to terminate, and that problem is multiplied by the fact that we're trying to keep the Daemon SW alive and connected to the App SW: as they both race to terminate, either of them succeeding breaks the MessageChannel, which also happens to be impossible (without hacks) to detect on the other end.

This is why I'm considering an alternative approach

The Daemon site (the one embedded in an iframe) will spawn a SharedWorker (and fall back to a Dedicated Worker pool if the API is not available - thanks, Apple 😢). This way we don't have to fight to keep the Daemon SW alive: as long as one Daemon page is around, the worker will be able to keep the connection alive. In practice that should be the case as long as there is at least one active client app. The only case where that is not true is if all apps have been closed and you later open one; that case is fairly easy to detect (the SW has no clients), and the SW can then serve a page that just embeds the Daemon iframe and, once the connection between the Daemon Worker and the SW is established, redirect to the actual page that was requested. (Please note that while this sounds complicated, it is what happens in the current setup, and it works remarkably well.)

It does imply that client apps need to embed the Daemon iframe, or else the corresponding worker will terminate. However, that was more or less a problem already, and I was already considering working around it by appending to navigation responses. Additionally, that added markup can be used to prompt the user for permissions (and it needs to be within the iframe so privileges can't be escalated).

This approach has an additional advantage for the in-browser node case, as frequent terminations don't exactly mix well with that.

The trickiest bit is going to be supporting browsers without the SharedWorker API. In that case the idea is the following: once the iframe with the Daemon loads, it says "hello" on a BroadcastChannel. If there is any document that has already spawned a Worker (let's call it the supervisor), it responds back with a MessagePort connected to its own worker and an index it was assigned (by incrementing). If no one responds within a short time frame, the document assumes supervision and starts the index. The supervisor, on its beforeunload event, broadcasts a "goodbye" message with the index of the next supervisor being nominated, at which point the next one in line spawns a worker and acts as supervisor. Every document messages the supervisor on beforeunload so the supervisor can nominate a new supervisor on exit. That does mean the worker's lifetime is inconsistent; however, even in the worst possible scenario it would probably still be better than a SW already is.
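The nomination step of that scheme (deciding which document spawns the worker next) might look like the following. The wrap-around rule here is my assumption; the comment above doesn't pin it down:

```javascript
// Given the departing supervisor's index and the indexes of documents still
// alive, nominate the next supervisor: the smallest live index above the
// departing one, wrapping around to the smallest live index overall.
function nominateNext (supervisorIndex, liveIndexes) {
  const sorted = [...liveIndexes].sort((a, b) => a - b)
  const next = sorted.find(i => i > supervisorIndex)
  return next !== undefined ? next : sorted[0]
}
```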

It is also worth considering that if the Daemon manages to connect to a companion add-on or a local Daemon through the REST API, there will be no need to even spawn any workers. Still, there will be some extra work to consider, like propagating content added to the in-worker node to the local Daemon.

Edit: Not sure what was supposed to follow this dangling "Unfortunately it".

lidel commented 5 years ago

This is great. I've been thinking about what developer-facing artifacts could be extracted from this, and I think a drop-in library/toolkit that acts as a replacement for standalone js-ipfs is the way to go, as it should help with addressing two high-level problems:

  1. "Running the same website (Origin) in multiple tabs without spawning multiple instances of js-ipfs"
    • Every website runs its own node, once per Origin(s)
    • No user prompt, hardcoded access control: a simple ability to whitelist multiple Origins to share the same worker would make various deployments a lot easier.
  2. "Global, shared ipfs instance that can be used by any Origin" (the original end goal)
    • Possible to run your own, but most people will use the default provided by the library
    • User prompt for access control.

@Gozala I agree that SharedWorker is worth investigating. To remove the need for access control and keep things simpler, we may want to focus on (1) initially, as its security perimeter is easier to understand.

> Unfortunately it

...? (the suspense is killing me :sweat_smile:)

Gozala commented 5 years ago

> ...? (the suspense is killing me 😅)

Oops, I'm not sure how my comment ended up like that, nor can I remember if there was anything specific I was going to say. Sorry!

Gozala commented 5 years ago

I spent a little more time on this and have currently implemented something in between what I originally made and the alternative option I described. Current status: things work really well on Chrome and Firefox, but I'm struggling to identify the issue with Safari.

At the moment the setup looks as follows:

Client App / Site

The client site in the example, https://gozala.io/peerdium, needs to serve two files:

  1. index.html that bootstraps everything up. It looks like this:

     <meta name="mount" content="/ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/" />
     <script type="module" async src="https://lunet.link/lunet/client.js"></script>

    Where lunet/client.js does the ceremony of embedding https://lunet.link in an iframe and registering the service worker ./lunet.js (the second file, described below); the path is also configurable via a meta tag.

  2. ./lunet.js just imports https://lunet.link/lunet/proxy.js, which takes care of serving content under the mounted path (as seen in the meta tag). This means that https://gozala.io/peerdium/index.html will map to /ipfs/QmYjtd61SyXU4aVSKWBrtDiXjHtpJVFCbvR7RgJ57BPZro/index.html and will be served through the client by means of the iframe it set up. lunet.js looks as follows:

    importScripts("https://lunet.link/lunet/proxy.js")

In terms of interaction this is what happens:

Host

The document that the client embeds in an iframe is what I refer to as the host. The host document is also pretty much just this: <script type="module" src="https://lunet.link/lunet/host.js"></script>, and it is what is served under https://lunet.link, which is to say the interesting stuff happens in lunet/host.js, which is:

Wishlist

Here are the things I would like to change about this setup

  1. As you can see, the only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally, a dnslink TXT record should be all it takes.

  2. On one hand, the host should not need to register a SW, because in practice a SharedWorker would do a better job here. I still have it, though, so that it can load https://lunet.link/lunet/host.js while offline; however, it would make sense to figure out a way to do that without a SW.

Gozala commented 5 years ago

It turns out Safari does not implement BroadcastChannel either, so my SharedWorker polyfill idea isn't going to work out :(

Gozala commented 5 years ago

Alright, I think something else could be done on Safari (or anywhere SharedWorker isn't available but ServiceWorker is): we can spawn a Service Worker which, once activated, starts broadcasting a ping to all its clients, which in turn respond with a pong message back, and keeps repeating this:

const extendLifetime = async () => {
  await sleep(1000 * 60 * 4) // Firefox will wait for 5 mins on an extendable event, then abort.
  const clients = await self.clients.matchAll({ includeUncontrolled: true })
  for (const client of clients) {
    client.postMessage("ping")
  }
  await when("message", self)
}

// Helper (was implied in the original snippet): resolve after `ms` milliseconds.
const sleep = ms => new Promise(resolve => setTimeout(resolve, ms))

const when = (type, target) =>
  new Promise(resolve => target.addEventListener(type, resolve, { once: true }))

self.addEventListener("activate", event => event.waitUntil(extendLifetime()))
self.addEventListener("message", event => event.waitUntil(extendLifetime()))

I believe this should keep the Service Worker alive as long as there are clients talking to it, which is in fact the case for SharedWorker.

Gozala commented 5 years ago

Got it working across Firefox, Chrome & Safari!

(screenshot: lunet working across Firefox, Chrome and Safari, 2019-01-14)
lidel commented 5 years ago

This is fantastic, especially getting it to work on Safari :+1:

I really like the mount metaphor and how little code the end developer needs to put on the static page. This is exactly what we should aim for.

@Gozala Regarding the first item from your Wishlist:

> As you can see only piece that matters in the client app is the IPFS path. Everything else is pretty static. Ideally it should just take dnslink TXT record should be all it takes.

We have an API for DNSLink lookups, but may want to support <meta> header as an optional fallback. Something like this:

  1. Try to get the latest value for the mount point by reading DNSLink:
    • ipfs.dns(window.location.hostname, {r: true})
      .then(dnslinkPresent)
      .catch(dnslinkMissing)
  2. If (1) returned an error (no DNSLink, or the API is down), fall back to the version from <meta> (if present)
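The lookup order above could be wrapped in one helper. In this sketch, `resolveDnslink` and `readMetaMount` are injected stand-ins (real code would call `ipfs.dns()` and read the `<meta name="mount">` tag):

```javascript
// Try DNSLink first; if the lookup fails (no record, or the API is down),
// fall back to the value baked into the <meta name="mount"> tag.
async function resolveMount (resolveDnslink, readMetaMount) {
  try {
    return await resolveDnslink()
  } catch (_) {
    const fallback = readMetaMount()
    if (fallback) return fallback
    throw new Error('no DNSLink record and no <meta> fallback')
  }
}
```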

ps2. I see how a hybrid approach could be supported as well, where static HTML with the regular website is returned with one extra <script>, then the PPWA library replaces the document with a more recent version read from DNSLink.

Gozala commented 5 years ago

@lidel I’ve considered doing a DNS lookup instead of the meta tag (as per your suggestion); however, the goal is for the user not to have to host a static site for bootstrapping in the first place. Basically I want the flow to be ipfs add -r ./ plus adding the hash to a DNS record. Which is to say, ideally the gateway would just serve the bootstrap page and lunet.js.

Gozala commented 5 years ago

For now I have shifted my focus toward figuring out / fixing the issue in Firefox that prevents requests from secure contexts (https) to http://127.0.0.1:* in worker contexts (see Bug 1520381). Self-signed certificates in Firefox (unlike Chrome and Safari) still require users to take additional actions, providing a very poor experience.

lidel commented 5 years ago

@Gozala Is there a way we could remove the need for self-signed certificates? I am afraid adding stuff to system cert vault simply won't work in many environments. It may also get our apps automatically blacklisted in various enterprise-y places due to "malicious behavior".

So far my ideas are:

Gozala commented 5 years ago

> @Gozala Is there a way we could remove the need for self-signed certificates? I am afraid adding stuff to system cert vault simply won't work in many environments.

It is needed on Safari, so I would assume only on Mac (are people actually using Safari on Windows?). I don't know about Edge, or whether it's worth caring about, as it's moving to Chromium.

I have been using https://www.npmjs.com/package/tls-keygen to do that on first run, and it does seem to work just fine on my Mac; however, that does not mean it will work everywhere. Maybe doing it only on Mac, or making it opt-in, would be a reasonable compromise?

> It may also get our apps automatically blacklisted in various enterprise-y places due to "malicious behavior".

I'm not sure I have anything to address this; however, the certificate can be issued only for 127.0.0.1, which means it doesn't create any attack vector, although I don't know if that argument will work in the enterprise.

I also suspect Safari is not a choice for enterprise, so maybe it's not that relevant?

> * (A) Is it feasible to get a valid certificate for `https://localhost.lunet.link`, then switch DNS to point at localhost and set up the tray app to use the cert when talking to the browser?

The problem is you'd have to ship the cert + key with your app, which can then be extracted and used by an evil hacker to deploy an attack vector.

> * (B) What if ipfs-companion provides an API object that serves the purpose of `https://127.0.0.1:9000/`?

👍 I was implying that. The idea behind https://lunet.link is to act as a router that can use whatever means are available to provide access to the IPFS network. I would expect ipfs-companion could be one option, the desktop app another, and an in-browser node yet another.

In fact, ideally all IPFS-based desktop / mobile apps would embed a core, enabling something like https://lunet.link to take advantage of the bundled libp2p.

> I am highly interested in use case where embedded js-ipfs running in IPFS Companion is able to open IPFS pages without touching public gateway. Right now js-ipfs has a limited use in browser context, because it can't open TCP port for Gateway functionality, so we need to use a public one. I wonder if PPWA approach coupled with ipfs/js-ipfs#1820 could solve this.

Sorry, I'm a little confused here. The primary intent of https://lunet.link (which really should be just https://bridge.ipfs.io/ or something like that) is to provide read / write access to the IPFS network without public gateways, by leveraging a native app or the companion extension. Only if neither is available does it fall back to either the gateway you configured it to fall back to, or more likely gateway.ipfs.io because you have not.

Gozala commented 5 years ago

@lidel This is a good article that goes into the details of what options are available for localhost certificates: https://letsencrypt.org/docs/certificates-for-localhost/

Gozala commented 5 years ago

There is actually one more thing I have been thinking about: essentially, what I'm trying to do is reincarnate "Flash Player" (I know, it pains me as well). Flash was there to fill a gap in the web platform and had a relatively good on-boarding story; furthermore, it took only one web app to convince you to install it, and all the other apps were able to leverage that. In a way, that is what I'm aiming for: any IPFS-based system app can fill the gap in the modern web platform - it could be ipfs-desktop or textile-photos or whatever the next thing will be - and at the end of the day, if you have one, the rest of the web will be able to take advantage of the IPFS network without gateways.

This got me thinking: can we actually use the Flash player itself? I know it's wild and Flash's days are numbered, but would it be too wild to polyfill the network stack using Flash?

lidel commented 5 years ago

Probably not; Flash is dead. Adobe will stop maintaining it in 2020, and browsers such as Chrome will remove support for it in the middle of 2019.

JS-IPFS node embedded in IPFS Companion being the "Flash Player of IPFS" is more likely.
Some websites already ask users to install Metamask and IPFS Companion :)

Gozala commented 5 years ago

I wrote down a more elaborate explanation of how things work in my setup: https://hackmd.io/s/r1ovevAfN#

I have also made the following changes:

Next steps

Gozala commented 5 years ago
  • Safari not talking to http is another major roadblock. Without self-signed certificates I don't see any other way to overcome it. Using the Flash player is the only option I can see, but as @lidel pointed out, it's not really a viable one, so if you have one I'd love to hear it.

I forgot about WebRTC. If I recall correctly, there was a way to connect peers through WebRTC without signaling servers on the same machine, which might be one way that, say, IPFS-Desktop and Safari could get connected.

Gozala commented 5 years ago

So I am able to connect my Firefox and Safari by manually copying & pasting the offer and answer RTCSessionDescriptions between instances. I'm not entirely sure what the SDP format is, but assuming it can be manually assembled (of which I'm not confident), it would allow connecting a website with a node in ipfs-desktop.

RangerMauve commented 5 years ago

In case you find a way around the dtls fingerprint thing, I've been using this for parsing SDP: https://github.com/clux/sdp-transform

Gozala commented 5 years ago

Turns out Safari is deliberately blocking access to http://127.0.0.1; Bug 171934 has the relevant conversations, sadly not very constructive from either end. @lidel, unless I'm overlooking some sarcasm from the Apple representative, he seems to suggest that installing a Root CA is what they'd prefer over allowing talking to the loopback address (quoting the corresponding comment below):

> Installing a trusted certificate doesn't sound so bad.

It also turns out Edge has a fix in place to allow talking to loopback addresses: https://developer.microsoft.com/en-us/microsoft-edge/platform/issues/11963735/

Gozala commented 5 years ago

In the Safari bug thread there is also a pointer to a post that exploits window.open as a means to overcome the limitation. I'm not convinced that having an extra tab around is a reasonable route, but posting it here just in case.

Gozala commented 5 years ago

Another idea, which I'm not sure how viable it may be, is to do the following:

  1. Whenever an IPFS node is created, create a corresponding user, which would roughly translate to something like ${peerID}.ipfs-peers.id.
  2. Create a DNS record for ${peerID}.ipfs-peers.id pointing to 127.0.0.1.
  3. Obtain a TLS certificate + key for ${peerID}.ipfs-peers.id (through Let's Encrypt or something like that).
  4. Use the node's public key to encrypt the TLS key + certificate and put it on the IPFS network, so that the IPFS peer can get it and store the TLS key + certificate in its own repo directory.
  5. IPFS desktop on first run / when accessing webui could load lunet.link?${peerID} so that the peer info can be obtained and used to communicate with the loopback address.

I think there is still a risk that an attacker might attempt to create a peer to obtain a TLS key + certificate for a subdomain. However, as I understand it (and I'm by no means qualified), this is not a useful vector for targeting users. On the other hand, an attacker might attempt to use the obtained keys as a means to reduce the credibility of ${peerID}.ipfs-peers.id, which in turn might cause ipfs-peers.id to lose its certificate, breaking access for everyone.

Gozala commented 5 years ago

I did some more research on Safari. I think the most viable route is:

  1. Bundle Safari App Extension with ipfs-desktop.
  2. Use that app extension to inject scripts for lunet.link that would expose access to Daemon through a Messaging API.

In case the extension does not provide access, assume ipfs-desktop isn't installed and use an in-browser peer. In fact, the ipfs-companion browser extension probably does the same.

This addresses my primary goal of not having to ask the user to install multiple things, as the app will deliver the extension for Safari, and other browsers will be able to talk to the app without extension mediation.

This also seems like a viable route for mobile Safari. On Android, I would expect browsers to be able to talk to http://127.0.0.1, but I have not tested that; also, from what I know, on Android it is possible to run a local server as a service.

Gozala commented 5 years ago

I've put some effort into getting the in-browser node working as a backup for when the daemon isn't available. This basically means the document in an iframe gets a message that is effectively a Request representation for the IPFS Daemon REST API, which it serves by either:

  1. Forwarding request to a daemon on loopback address and forwarding Response back.
  2. Forwarding to an in-browser IPFS node running inside a SharedWorker and messaging result back.

Things at my disposal are

https://github.com/ipfs/js-ipfs-http-client
https://github.com/ipfs/js-ipfs

Which is problematic for the following reasons:

  1. Ideally there would be something that abstracts across these two, possibly attempting the Daemon and falling back to the SharedWorker. (However, an argument could be made that that's what I'm doing.)

  2. The service worker produces a Request object that the iframe can forward to the Gateway if it's available. However, if it is not, it needs to serialize the request into some message corresponding to an IPFS API call, then on the SharedWorker side deserialize that and dispatch it to the corresponding IPFS API call, then create some result object that the iframe will transform into a Response that the SW can serve.

I think it would make much more sense if there were something like an IPFSDaemon() API that would just take a Request complying with the REST API and return a Response also corresponding to the REST API. That way it would avoid multiple serialization, deserialization, dispatch, etc. steps.
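A toy version of that IPFSDaemon() shape might look like this. The routing table, endpoint, and helper names here are illustrative assumptions, not an existing API:

```javascript
// Wrap an IPFS node in a function that speaks REST-API-shaped requests and
// responses, so the same object can stand in for a local daemon or an
// in-browser node without per-hop re-encoding.
function makeDaemon (node) {
  return async ({ method, path }) => {
    if (method === 'GET' && path.startsWith('/api/v0/cat')) {
      // Naive query parsing, for illustration only.
      const cid = path.split('arg=')[1]
      return { status: 200, body: await node.cat(cid) }
    }
    return { status: 404, body: 'unknown endpoint' }
  }
}
```

The iframe would then hand the same request object to either backend and serve whatever Response-shaped object comes back, with encode/decode happening only at the edges.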

Appendix

I forgot to mention that on the client app end there will likely be another js-ipfs-http-client that talks to the SW, so that would be another round of the same encode + decode + dispatch, which is why I think it would make sense to have a lower-level API that is on par with the Daemon REST API, so the encode / decode / dispatch happens only at the edges instead of at every link of the request flow.

Gozala commented 5 years ago

Partially related to my last comment: in the Mozilla Devtools protocol I worked on a thing that allowed the server to describe its protocol during connection by passing down an API spec, which the client side then used to generate a full client. This avoids having to maintain a client that more or less just encodes / decodes / tracks exchanges and occasionally gets out of sync with the server or breaks due to some human error. It also made it possible to generate clients in other languages.

The IPFS API would be even easier, as there is no notion of GC-able objects in the server process (which complicates things) that you want to reflect in the client. How feasible is it to compel stakeholders to do something along those lines?

lidel commented 5 years ago

> I think it would make much more sense if there was something like IPFSDaemon() API that would just take Request complying with REST API and return Response also corresponding to REST API. That way it would avoid multiple serialization, deserialization, dispatch, etc... steps.

@Gozala I fully agree. The code responsible for the "REST API" (HTTP Gateway) in js-ipfs is disabled in the browser context because it can't open a TCP port, but we could expose it as a JS function to simplify use in contexts like a Service Worker. See my proposal in https://github.com/ipfs/js-ipfs/issues/1820; if you have ideas on how the browserified Gateway API should look, comment there.

> [..] passing down an API spec, that then was used by the client side to generate a full client to avoid having to maintain a client. [..] How feasible is it to compel stakeholders to do something along those lines?

It sounds like something worth discussing when IPFS API v2 is designed this year. I am sure any idea that could decrease the maintenance burden of js-ipfs-http-client, ipfs-postmsg-proxy, etc. will be taken into consideration. Will CC you when there is a meta-issue on this.

Gozala commented 5 years ago

Status update

I spent more time on this to get the in-browser fallback working. It took quite a bit more effort than I anticipated, but the good news is it works. Below is an image of a fully functional webui loaded via lunet from IPFS through the in-browser node, with 0 changes (to the webui code).

(Screenshot: webui loaded via lunet through the in-browser node, 2019-01-31)

My peerdium demo works with the in-browser node with 0 changes as well. 🎉

At the moment this version lives in a separate branch because:

Details

Open questions (would love feedback)

Gozala commented 5 years ago

I'm inclined to think that some lightweight version of the in-browser node should always be there: if not replicating data, then at least maintaining a list of relevant CIDs, so that if the native node is down it can still present the user with a consistent library and lazily attempt to fetch relevant data off the network.

Should I be looking at IPFS Cluster for this?
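One way to sketch the "maintain a list of relevant CIDs, fetch lazily" idea; all names here are hypothetical, and `fetchBlock` stands in for whatever provider (in-browser node, gateway) is currently reachable:

```javascript
// Sketch: a minimal bookkeeper that records CIDs the user cares about so the
// library listing stays consistent even when the native node is down; content
// itself is only fetched lazily from the network on demand.
class CidLibrary {
  constructor (fetchBlock) {
    this.known = new Map() // cid -> metadata
    this.fetchBlock = fetchBlock // provider function (assumption)
  }
  remember (cid, meta = {}) { this.known.set(cid, meta) }
  list () { return [...this.known.keys()] }
  async get (cid) {
    if (!this.known.has(cid)) throw new Error(`unknown CID: ${cid}`)
    return this.fetchBlock(cid) // lazy fetch off the network
  }
}
```

The point of the sketch is that the CID list is cheap to keep in the browser even when full replication is not.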

MidnightLightning commented 5 years ago

@Gozala What to do when the user used the in-browser node first and then the native IPFS node? It's probably possible to replicate data from the in-browser node to the native one, but it does not seem trivial to deal with conflicts or timing when there is a lot of data to be pushed.

Should I be looking at IPFS Cluster stuff for this stuff?

I found this issue while searching around a concept I've been thinking about. The idea of having both an in-browser node and a native/standalone node is something I think should be fleshed out as a user norm. Using how users interact with services like Dropbox as an example: I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than the ones I use most frequently on my laptop or desktop.

I think it would be ideal if, among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, it additionally allowed a user to indicate another node as "theirs" (add authentication/credentials?), and then those nodes actively synced pins/virtual filesystem structures between them. That way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data.

I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary). I'd also like the control to indicate that my workstation node should pin/keep a copy of everything the others pin (primary backup), the laptop should as well when it's online (secondary backup), and the phone node would only pin important things (space concerns), but being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

From that perspective, it would be fine if all in-browser nodes stayed in-browser nodes (no need to "change over" to a standalone node if one came back online), but pinned/known file syncing could be very useful.

mikeal commented 5 years ago

I'm inclined to think that some lightweight version of in-browser node should always be there if not replicating data at least maintain list of relevant CIDs so that in case of native node being down it can still present a user with a consistent library & lazily attempt to fetch relevant data off of network.

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser-IPFS-node” and a js-ipfs service worker?

The reason I’d like to think about things this way is that this exercise might surface differences between “features missing in the web platform” that we don’t have without native integration and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent. Having a similarly scoped shared resource will need to drastically improve that security story and it’s not yet clear to me where this is the responsibility of IPFS or if we’re actually missing a feature or integration in the browser.

Gozala commented 5 years ago

One question I have that might bring some clarity to these questions: what is the delta between what we think an ideal “integrated-in-browser-IPFS-node” and a js-ipfs service worker?

I can only speak for myself; what I think and am going for is "the browser is your IPFS node". JS-IPFS, SW, IPFS-Desktop, etc. are just polyfills to deliver / explore that experience.

The reason I’d like to think about things this way is that this exercise might surface differences between “features missing in the web platform” that we don’t have without native integration and features the platform doesn’t have because of legitimate security and isolation concerns between applications.

I think there is a general assumption that the web platform lacks features to implement a full-fledged IPFS node in the web content context. I think that's an incorrect way to look at things. Even if browsers exposed all the low-level networking primitives to allow it (which is highly unlikely), each browser tab running its own IPFS node would be a terrible experience.

That is to suggest that if / when IPFS is adopted by a browser, the browser itself will become the IPFS node and expose a limited API to access and store content off the network. And yes, it will impose the same or similar origin-separation concerns as it does today.

The goal of this exploration is to polyfill the described experience through the variety of tools available:

That way, applications:

  1. Can be loaded off the IPFS network
  2. Are isolated by origin
  3. Have a way to read / write data (based on the origin)

The security story for the current locally running server (either Go or IPFS Desktop) is practically non-existent.

Yes, and that is a huge issue waiting to be exploited. I would absolutely encourage locking it down. Last time I checked, the ipfs daemon / gateway comes with a default of Access-Control-Allow-Origin: *, which means any site could take control of it and exploit it.

Both should be locked down to one single origin, maybe access.ipfs.io or something along those lines. The same is true for the routing services that js-ipfs connects to: they should be open only to a single origin controlled by PL.

Having a similarly scoped shared resource will need to drastically improve that security story and it’s not yet clear to me where this is the responsibility of IPFS or if we’re actually missing a feature or integration in the browser.

No features are missing on that end. What this PoC does is use a special origin (in this case lunet.link, but it should be access.ipfs.io or something) that provides access to the IPFS network and mediates access control with the user. This way it is able to control access rights based on the user's consent rather than allowing each app / site to do whatever it wants. This also allows the user to access a library of all their data in one place. While that is what the browser should be doing, I think it is worth doing it on the IPFS end right now to exercise this, learn from the experience, and establish a cowpath.
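The lock-down argued for above could look roughly like this; a sketch, where `access.ipfs.io` is the comment's example origin, not a real deployment:

```javascript
// Sketch: instead of the wildcard default, echo CORS headers only for the
// single trusted access-point origin; everything else gets no CORS headers.
const ALLOWED_ORIGIN = 'https://access.ipfs.io' // illustrative

function corsHeadersFor (origin) {
  if (origin !== ALLOWED_ORIGIN) return null // browser blocks cross-origin read
  return {
    'Access-Control-Allow-Origin': origin,
    'Vary': 'Origin' // caches must not reuse the response across origins
  }
}
```

With this shape, arbitrary sites can no longer script the daemon's API; only the mediating access-point origin can, and it enforces per-app consent itself.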

Gozala commented 5 years ago

I have a cluster branch now which runs an in-browser node and attempts to use the local native node through the REST API simultaneously. At the moment it's pretty dumb: it just forwards requests to both nodes and attempts to serve the response from the native node, with a fallback to the in-browser node.

At the moment there is no attempt to sync the two; for that it would probably make the most sense to borrow the logic from ipfs-cluster rather than trying to hack things together.
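The prefer-native / fall-back-to-in-browser strategy described above can be sketched as follows; the two request functions are placeholders for the actual transports:

```javascript
// Sketch: prefer the native daemon's answer, fall back to the in-browser node.
async function dualFetch (path, nativeFetch, browserFetch) {
  try {
    return await nativeFetch(path) // native node first
  } catch (_) {
    return browserFetch(path) // native down or failed: in-browser fallback
  }
}
```

A smarter version would race the two and cancel the loser, but the sequential form matches the "pretty dumb" behavior described in the comment.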

I'll focus on getting this working in Safari through a SW polyfill for SharedWorker, and then deploy the current version.

Gozala commented 5 years ago

Responding to @MidnightLightning

Using as an example how users interact with services like Dropbox, I have Dropbox clients on my desktop, laptop, and phone, and have different files "starred" for offline use on my phone than I use most frequently on my laptop or desktop.

That is a good point! However, the case of a native node vs an in-browser node is different, as from the user's point of view it's the same device.

I think it would be ideal that among the standard peer discovery methods that any given IPFS node (in-browser or standalone) has, that it additionally allows a user to indicate another node as "theirs" (add authentication/credentials?), and then those nodes actively sync pins/virtual filesystem structures between them.

I have been thinking about this in a slightly different way. I imagine the library organized as "collections" (or threads in Textile terms). The idea is that you can invite others to collaborate on those collections. Those others can be your other devices, your friends, or pinning services.

In that way, I could have a standalone node running on my workstation, and when I open a browser on my workstation, laptop, or phone, they all create an in-browser node, and I end up with four nodes that are all "me" and storing my data. I'd probably want to configure the in-browser node on my workstation to do minimal storage (since there's another node on that machine that should be primary), and would like the control to indicate my workstation node should pin/keep a copy of everything the others pin (primary backup), the laptop should as well, when it's online (secondary backup), and the phone node would only pin important things (space concerns), but being able to browse "known" hashes/files on the workstation/laptop nodes would be ideal.

I think all those use cases fit nicely with the solution described above; furthermore, it follows the interaction flow: the user, during the sharing / publishing phase, chooses who to share with.

Implementation-wise, it seems that a "collection" should just be an "ipfs-cluster".

lidel commented 5 years ago

@Gozala I like the idea of a seamless/self-healing abstraction for the browser context (Access Point Facade), but figuring out how to handle the surface of the IPFS API when the API provider is a facade on top of multiple nodes going online and offline will be a challenge.

Agree we should look at ipfs-cluster for inspiration, but the security considerations will not map exactly. E.g. in the browser we want to build a security perimeter around the Origin and, based on it, introduce key/MFS write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config, etc.

Seems that the MVP would need:

Gozala commented 5 years ago

@Gozala I like the idea of a seamless/self-healing abstraction for browser context (Access Point Facade), but figuring out how to handle the surface of IPFS API when API provider is a facade on top of multiple nodes going online and offline will be a challenge.

I agree. I am also getting more and more convinced that exposing the full IPFS API may not be a good idea in the first place. While it is cool to have webui running over this, I think it's the wrong abstraction for most apps.

Agree we should look at ipfs-cluster for inspiration, but security considerations will not fit exactly.

I need to write a coherent story about the experience I have in mind, but before I get around to doing that, here is the gist:

eg. in browser we want to build security perimeter around Origin and based on it introduce key/mfs write/read scoping/sandboxing, limit access to sensitive endpoints such as ipfs.config etc

Absolutely! However, I think that should happen at the lunet (Access Point Facade) level, before any calls are issued to any of the IPFS nodes.

On sandboxing, I'm still working out some details in my head, but I think there is a real opportunity to improve on the mess we're in on the conventional web by limiting read / write access to only the app's resources / the document being operated on. The largest issues on the web are due to third parties tracking and aggregating user data on some servers. I think it would be really great if we enforced a setup like ://${app}/${data}, where app maps to some CID and therefore the app is able to read / load stuff from it, and ${data} is an MFS entry that the app is able to both read and write to.

In this setup the app can't really spy on the user; sure, it can save some data, but that data is local and the user personally needs to choose to share it, and even then the app isn't really able to let its own server know where to grab it from.

There are things to be worked out, but I'm inclined to think that a combination of SW and sandboxed iframes might allow for such sandboxing.
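The `://${app}/${data}` split could be routed roughly like this; a sketch where the `/ipfs/` and `/mfs/` path conventions are invented for illustration:

```javascript
// Sketch: the first path segment identifies the app (an immutable CID,
// read-only), and the remainder is a per-app MFS subtree it may read/write.
function routeRequest (pathname) {
  const [, app, ...rest] = pathname.split('/')
  return {
    appRoot: `/ipfs/${app}`,                   // read-only app code
    dataPath: `/mfs/${app}/${rest.join('/')}`  // read/write app data
  }
}
```

Scoping the writable subtree by the app identifier is what keeps one app from reading another app's data, per the origin-perimeter idea above.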

Gozala commented 5 years ago

I finally got it working in Safari 🥳 (Debugging a SW in Safari is quite a throwback to the old days of JS with no debuggers, except there's no alert or reliable way to print output either 😂)

Now the peerdium fork loads with no changes, using the in-browser node running a SharedWorker polyfill implemented with a ServiceWorker.

Issues

However, some content, like posts created by me, seems to fail to load; specifically, the js-ipfs call ipfs.get(cid) returns a promise that never resolves nor rejects. I also make a corresponding request to a (gateway?) server, which succeeds with 200, but still nothing from js-ipfs, and given no reliable debugging in SW on Safari I was unable to track down what the issue is. The same code with the same codepath works as expected in Firefox and Chrome, so 🤷‍♂️

I’ll do more digging tomorrow, but thought I’d post in case this is a known issue.

Gozala commented 5 years ago

Safari also seems to reject POST requests with form data as the body. Edit: It also appears that FormData isn't available in the ServiceWorker context in Safari, which might be the underlying reason.
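One workaround when FormData is missing in a worker scope is to encode multipart/form-data by hand; a minimal text-only sketch (boundary and header choices are illustrative):

```javascript
// Sketch: manual multipart/form-data encoding, usable as a fallback when
// FormData is unavailable (as observed in Safari's ServiceWorker context).
function encodeMultipart (filename, content, boundary = 'X-BOUNDARY') {
  const body = [
    `--${boundary}`,
    `Content-Disposition: form-data; name="file"; filename="${filename}"`,
    'Content-Type: application/octet-stream',
    '',
    content,
    `--${boundary}--`,
    ''
  ].join('\r\n')
  return { body, contentType: `multipart/form-data; boundary=${boundary}` }
}
```

The returned `contentType` must be set as the request's Content-Type header so the server can find the boundary; a real implementation would also handle binary payloads and boundary collisions.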

Gozala commented 5 years ago

It seems that in Safari the call to resolver.cid(ipfsNode, "/ipfs/QmedJqYfddTzygxpcDrtTJBupHn4qGHntPXBx8APNM5gE1") never resolves nor rejects. It comes from the ipfs-http-response library, specifically:

https://github.com/ipfs/js-ipfs-http-response/blob/7746dab433e3a57652e3222bb1cc6051a09576be/src/index.js#L63
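A generic debugging aid for hangs like this is to race the suspect promise against a timeout, so a never-settling call surfaces as an error instead of hanging silently (the helper below is generic, not part of ipfs-http-response):

```javascript
// Sketch: wrap a promise so that if it never settles, it rejects after `ms`.
function withTimeout (promise, ms, label = 'operation') {
  let timer
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms)
  })
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer))
}
```

Wrapping the resolver call (e.g. `withTimeout(resolver.cid(node, path), 5000, 'resolver.cid')`) would at least turn the silent hang into a loggable error, which matters when the environment offers no debugger.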

Gozala commented 5 years ago

I am exploring an alternative approach to loading this, described here: https://github.com/Gozala/lunet/issues/2#issuecomment-463786508

Winterhuman commented 1 year ago

Just to partly revive the discussion: https://datatracker.ietf.org/doc/draft-ietf-dnsop-alt-tld seems like an interesting idea to remember. It reserves .alt as a non-DNS namespace which applications can claim for themselves, like CID.ipfs.alt leading to a local IPFS node if desired (and it says the names don't need to be DNS compliant!)
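For illustration, an application could recognize such a namespace by pattern-matching hostnames; a sketch, where the `<name>.ipfs.alt` convention and the mapping to a local node are hypothetical:

```javascript
// Sketch: treat `<name>.ipfs.alt` as an IPFS name, per the .alt draft's idea
// of application-claimed, non-DNS namespaces.
function parseAltHost (hostname) {
  const match = hostname.match(/^(.+)\.ipfs\.alt$/)
  return match ? { namespace: 'ipfs', name: match[1] } : null
}
```

Anything that doesn't match falls through to ordinary DNS resolution, which is exactly the split the draft intends.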