The problem of browser support

Mass adoption of emerging decentralized web protocols such as IPFS is limited by the lack of browser support. Full support via browser extensions is not accessible to non-technical users, and partial support via public gateways cannot take advantage of a decentralized network.

With thicker clients, thicker protocols and thinner or no servers, the decentralized web is in many ways similar to applications installed locally in a connected operating system, instead of webpages retrieved from servers.

However, current browsers evolved around the server-client mental model. To fully support the decentralized web, they not only need to register new protocol handlers, but also need to provide new mental models for user interaction and applications.

Current state of accessing web apps on IPFS

Assume we have a minimal peer-to-peer web app: an HTML/CSS/JavaScript bundle that is stored on IPFS and uses the API exposed on window.ipfs to get and send data. To access it, a user needs to run an IPFS daemon locally, install the IPFS Companion browser extension, and finally type in localhost with the correct port number and the hash of the web app.

If the web app relies on other emerging protocols such as Ethereum or Polkadot, the user needs to install additional browser extensions to provide computational resources and inject APIs into the DOM. This overly complicated user journey exposes the mismatch between the “local first” logic of the decentralized web and the “fetch from server” paradigm of the browser, and indicates the need for a redesign.

If we can add native protocol support to browsers, or implement the protocols in an integrated runtime environment such as WebAssembly, we can deliver new user interfaces and enable new mental models for managing web apps via browser extension. Until then, the fastest way to iterate on, explore and demonstrate new mental models is to prototype a desktop browser.

Design challenges for P2P browser

Regardless of how the user interface is delivered, the following features are unique to the decentralized web. Using IPFS as an example, we will focus on P2P networks, since they are the most decentralized.

Applications can be installed and named locally

Compared with web pages, P2P applications are often stored locally. A user should be able to install, browse, remove and rename applications, much as in the desktop UI of an operating system. Application names, icons, developers, permissions etc. can be specified with web app manifests, similar to those of Progressive Web Apps or browser extensions.

The ability to rename an application locally is crucial, since human readability is often sacrificed in a distributed and secure system. Without a centralized or blockchain-based DNS system, links to web apps are unreadable IPFS/IPNS hashes, so a user should be able to maintain a local human-readable namespace to avoid using hashes directly and to resolve name collisions.
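As a rough illustration of such a local namespace, the sketch below keeps a map from user-chosen names to content addresses. The class name, method names and the " (2)" suffix rule for collisions are assumptions made for this example; the proposal does not prescribe them.

```typescript
// Hypothetical local name registry: maps user-chosen names to content addresses.
type Address = string; // e.g. "/ipfs/<hash>" or "/ipns/<hash>"

class LocalNameRegistry {
  private readonly byName = new Map<string, Address>();

  // Install under the preferred name; on collision append " (2)", " (3)", ...
  // Returns the name that was actually assigned.
  install(preferredName: string, address: Address): string {
    let name = preferredName;
    for (let n = 2; this.byName.has(name); n++) {
      name = `${preferredName} (${n})`;
    }
    this.byName.set(name, address);
    return name;
  }

  // Rename an installed app. Refuses to overwrite a name that is already taken.
  rename(oldName: string, newName: string): boolean {
    const address = this.byName.get(oldName);
    if (address === undefined || this.byName.has(newName)) return false;
    this.byName.delete(oldName);
    this.byName.set(newName, address);
    return true;
  }

  remove(name: string): boolean {
    return this.byName.delete(name);
  }

  resolve(name: string): Address | undefined {
    return this.byName.get(name);
  }
}
```

Because names only exist on the user's machine, two apps that both call themselves "Notes" can coexist, and the user can later rename either one without touching the underlying hash.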

Managing access to protocol APIs and identity information

Innovations in network protocols and network structures are likely to continue, so the browser needs a predictable way of exposing protocol APIs to applications. IPFS Companion and MetaMask have been exposing the APIs they support under the window object, and many applications use this pattern.

The browser should adopt this pattern for all supported protocols and attach the corresponding APIs under the window object. For IPFS, the browser should request authorization from the user before an application pins IPFS hashes for the first time, record all hashes pinned, and unpin them upon application removal.
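The bookkeeping behind that rule could look roughly like the sketch below. It only tracks state; the actual pin and unpin calls to the daemon, and the authorization prompt, are left to the caller. `PinLedger` and its methods are hypothetical names, and skipping hashes that another app still pins is an added assumption, not something the proposal specifies.

```typescript
// Hypothetical per-app pin bookkeeping. It performs no I/O: the caller shows the
// authorization prompt and issues the real pin/unpin requests to the IPFS daemon.
class PinLedger {
  private readonly authorizedApps = new Set<string>();
  private readonly pinsByApp = new Map<string, Set<string>>();

  // True until the user has approved pinning for this app.
  needsAuthorization(appId: string): boolean {
    return !this.authorizedApps.has(appId);
  }

  // Call after the user approved the app's first pin request.
  authorize(appId: string): void {
    this.authorizedApps.add(appId);
  }

  // Record a hash pinned on behalf of an app.
  recordPin(appId: string, hash: string): void {
    if (!this.authorizedApps.has(appId)) {
      throw new Error(`${appId} is not authorized to pin`);
    }
    let pins = this.pinsByApp.get(appId);
    if (pins === undefined) {
      pins = new Set<string>();
      this.pinsByApp.set(appId, pins);
    }
    pins.add(hash);
  }

  // Forget the app and return the hashes that are now safe to unpin.
  // Assumption: a hash still pinned by another app is kept.
  removeApp(appId: string): string[] {
    const removed = this.pinsByApp.get(appId) ?? new Set<string>();
    this.pinsByApp.delete(appId);
    this.authorizedApps.delete(appId);
    const stillPinned = new Set<string>();
    this.pinsByApp.forEach((pins) => pins.forEach((h) => stillPinned.add(h)));
    return Array.from(removed).filter((h) => !stillPinned.has(h));
  }
}
```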

Identities in the P2P web are implemented with public and private key pairs, which need to be protected and stored locally. If the implementation of user identity is left to applications, the browser should use a separate identity, such as an IPFS node ID, for each application to prevent malicious applications from forging identities. If user identity is implemented at the browser level, the user should be able to choose which identity an application is allowed to use.

Trust Model for Applications

Instead of relying on certificate authorities, the trust from the user to the application relies directly on the checksum, as in the case of IPFS, or on the publisher identity, as in the case of IPNS and Dat. These assumptions should be made explicit when the user visits or installs an application.

If user identity is implemented at the browser level, the browser should allow the user to authorize an identity for a given application. The browser could also keep a list of contacts that the user trusts, together with the trusted contact lists of those contacts, and implement a PGP-like scheme that helps the user assess the trustworthiness of an unknown application publisher by degrees of separation in a web of trust.
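The "degrees of separation" idea can be sketched as a breadth-first search over the trust lists. This is a simplified illustration: the graph shape, the function name and the `maxDepth` cut-off are assumptions for the example, and real PGP trust computation also weighs how much each key is trusted.

```typescript
// Trust graph: each key ID maps to the key IDs its owner has marked as trusted.
type TrustGraph = Map<string, string[]>;

// Breadth-first search from the user's key to the publisher's key.
// Returns the number of hops, or null if the publisher is not reachable
// within maxDepth hops.
function degreesOfSeparation(
  graph: TrustGraph,
  user: string,
  publisher: string,
  maxDepth: number = 3
): number | null {
  if (user === publisher) return 0;
  const visited = new Set<string>([user]);
  let frontier: string[] = [user];
  for (let depth = 1; depth <= maxDepth; depth++) {
    const next: string[] = [];
    for (const key of frontier) {
      for (const trusted of graph.get(key) ?? []) {
        if (trusted === publisher) return depth;
        if (!visited.has(trusted)) {
          visited.add(trusted);
          next.push(trusted);
        }
      }
    }
    frontier = next;
  }
  return null;
}
```

A browser could then show, for example, "published by a contact of a contact" for a result of 2, and a warning for null.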

Proof of concept

The above are just some examples of design challenges for P2P browsers, and further discussion and experimentation are needed.

To explore and demonstrate how users should interact with P2P web apps, we can start with a bare-bones desktop browser that only provides computational resources for IPFS and a rendering environment for applications. Everything else, from user identity to resource management, is left to applications.

The following specification of a minimum viable product should be able to support new mental models of a P2P web:

Use Electron.js to render web apps:
- The Electron main process starts and controls a local go-ipfs daemon with js-ipfsd-ctl.
- Web app bundles are delivered by IPFS via the local HTTP gateway in subdomain style.
- The Electron main process starts a new BrowserWindow and injects the IPFS API into the window object to render the web app.
Provides a user interface to render, install, rename and remove web apps:
- The user can input an IPFS/IPNS/DNSLink string to view and install a web app or web page.
- The user can view installed web apps in grid view or list view, and can rename or remove them.
Supports standard HTML/CSS/JavaScript bundle as web app:
- A web app uses HTML files as markup and JavaScript code to call APIs and update application state, as in JAMstack. API calls can go over HTTP or through window.ipfs.
- Developers should be able to build a web app using React.js, Vue.js or any other standard website building tool, and publish it simply with ipfs add -r.
- Application information is declared via web app manifest, and the browser can create a default manifest if one does not exist.
Web apps can be pre-installed during build time:
- Developers can build different releases with different pre-installed applications, which can include different interfaces for system settings and resource management.
- Example web apps can include matters.news for social networking and ipfs-webui for monitoring local resources.
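Two items from this list, subdomain-style gateway delivery and the default manifest, can be sketched as small helpers. The function names and fallback values are assumptions for illustration. Port 8080 is go-ipfs's default gateway port, and the identifier is assumed to be already DNS-safe (for example a base32 CIDv1), since subdomain gateways need case-insensitive identifiers. The manifest members shown are a small subset of the W3C Web App Manifest.

```typescript
// Subset of the W3C Web App Manifest members used in this sketch.
interface WebAppManifest {
  name: string;
  short_name: string;
  start_url: string;
  display: "fullscreen" | "standalone" | "minimal-ui" | "browser";
}

type Namespace = "ipfs" | "ipns";

// Origin of a web app on the local gateway in subdomain style,
// e.g. http://<id>.ipfs.localhost:8080
// Assumption: `id` is already DNS-safe; no CID conversion happens here.
function subdomainOrigin(ns: Namespace, id: string, port: number = 8080): string {
  return `http://${id}.${ns}.localhost:${port}`;
}

// Fill in a manifest for an app that declares none, or only some members.
// The fallback name is a shortened identifier that the user can rename later.
function defaultManifest(
  id: string,
  declared: Partial<WebAppManifest> = {}
): WebAppManifest {
  const fallbackName =
    id.length > 12 ? `${id.slice(0, 6)}...${id.slice(-4)}` : id;
  const name = declared.name ?? fallbackName;
  return {
    name,
    short_name: declared.short_name ?? name,
    start_url: declared.start_url ?? "/",
    display: declared.display ?? "standalone",
  };
}
```

Serving each app from its own subdomain gives it a separate web origin, so storage and permissions are isolated per app by the rendering engine's same-origin policy.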

I have a few comments and questions as follows:

I like the "local first" principle for p2p applications: an application should store, manage and utilize resources (web app, application runtime, data) locally as much as possible.
Does installing a web app locally mean downloading the web app bundle to local storage or the local IPFS daemon the first time one accesses the app, and then running it locally afterwards? Does the web app also run the back-end computation locally?
I agree that user ID should be implemented at the browser level so that it can be used across different web apps. How can it be implemented in a decentralized way?

> Does installing a web app locally mean downloading the web app bundle to local storage or the local IPFS daemon the first time one accesses the app, and then running it locally afterwards?

@rairyx I think the browser should allow the user to view a web app by IPFS hash or DNSLink, and also allow the user to install it. When the user views the web app for the first time, the bundle is downloaded by the IPFS daemon. Without installation, the bundle will eventually be removed by garbage collection; with installation, the bundle is pinned by the IPFS daemon.

> Does the web app also run the back-end computation locally?

The web app can use the IPFS API to perform CRUD operations, and the corresponding computation is done locally by the IPFS daemon. The web app should also be allowed to communicate with remote APIs if other backend services are needed.

> I agree that user ID should be implemented at the browser level so that it can be used across different web apps. How can it be implemented in a decentralized way?

I think this is a complicated topic that requires more research and exploration. The browser could provide ID-related information for all supported protocols (such as IPFS node ID, Bitcoin address and private key, etc.), and maintain trust levels for known public keys to help authenticate unknown web app publishers (similar to PGP/GnuPG).

One approach might be to integrate existing solutions such as Nomios. For the proof of concept, I think it is sufficient to leave identity to the applications.

Some thoughts:

On BrowserWindow and Electron

:thought_balloon: This enables use of native ipfs:// and ipns:// URIs for addressing IPFS content, which is nice.
- We experimented with that approach in Muon-based Brave (https://github.com/brave/muon/pull/507), but upstream Electron supports it natively via registerStreamProtocol, so you should be able to do streaming and range requests without additional work.
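Whatever handler ends up registered for these schemes, it first has to map a native URI onto a path the local gateway understands. A minimal sketch of that mapping follows; the function name is hypothetical, and the Electron wiring is deliberately left out.

```typescript
// Translate ipfs://<id>/<path> or ipns://<id>/<path> into the gateway path
// /ipfs/<id>/<path> or /ipns/<id>/<path>. Returns null for anything else.
// Query strings and fragments are dropped in this sketch.
function nativeUriToGatewayPath(uri: string): string | null {
  const match = /^(ipfs|ipns):\/\/([^\/?#]+)(\/[^?#]*)?/.exec(uri);
  if (match === null) return null;
  const namespace = match[1];
  const id = match[2];
  // The path group is optional; default to the content root.
  const path = match[3] === undefined ? "/" : match[3];
  return `/${namespace}/${id}${path}`;
}
```

A protocol handler could prepend the local gateway address to the returned path and stream the gateway's response back to the BrowserWindow.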

On window.ipfs and access controls to the API

"managing access to protocol APIs and identity information" part will be tricky, but also exciting to see. Very happy about experimentation happening in this space.

Perhaps useful context: IPFS Companion used to inject the IPFS JS API into pages as-is under window.ipfs, but that proved to be highly problematic on the regular web, and we stopped doing that.

Food for thought:

API injection opens surface for user fingerprinting
- your proposal mitigates this as entire browser will be focused on p2p, but mentioning it here for completeness
ipfs-provider exists to simplify creation of IPFS API fallback logic
- it enables web developers to use an HTTP API running somewhere on the backend, with a fallback to spawning an embedded js-ipfs on the page
- in the past it also supported window.ipfs, when present, so one could try the user's local node first, then fall back to a remote one or to js-ipfs
- caveat: this library assumes access to entire API is okay (for example, when remote API is used, it is up to developer to expose only a subset of commands via reverse proxy)
Be cognizant that injecting the entire IPFS Core API as-is grants the same admin-level access to all data and gives unlimited configuration control over user's node
- as a mitigation, the window.ipfs experiment we had in IPFS Companion exposed only a subset of APIs for adding and getting standalone data or via MFS, and each web app was scoped to its own MFS root
- :thought_balloon: a p2p browser with go-ipfs shared among multiple dapps over a window.ipfs-like interface will face the same problem
Core JS API had huge breaking changes earlier this year (blog), which effectively broke every app that relied on window.ipfs injected by IPFS Companion (we simply disabled the experiment due to this)
- this is a strong signal that injecting as-is is good for PoC, but if we want something to be a part of the browser itself, perhaps there should be a stable API dedicated for web apps that does not break over time (could be small, immutable api, or there could be versioning and a clear deprecation plan)
- :thought_balloon: I believe this problem space will be interesting to @gozala, who did similar things for Dat in the past (https://github.com/beakerbrowser/specs/pull/21/) and recently started looking into lower level libraries for use on the web (shared worker, js-dag-service etc)
Somewhat related: the IPFS HTTP API lacks context-agnostic access controls (https://github.com/ipfs/go-ipfs/issues/1532)
- In the browser context API is guarded by CORS + hardened Origin check
- Safelisting via CORS to grant access to specific Origin is tedious, and requires manually changing node's configuration
- :thought_balloon: modeling user interaction for granting access in a more user-friendly way will be very useful and could inform what type of access controls are added to the API

Thank you for your comment @lidel ! A lot of valuable information. Here are some of my thoughts and questions:

What API to expose
- Injecting the entire IPFS core API as-is is indeed dangerous, and I think the browser should allow the user to choose different permissions for an application. For the PoC I think it's still useful to be able to authorize the entire core API, so that application developers can use abstractions built on top of IPFS (for example, at Matters we are planning to use OrbitDB).
- A stable API dedicated for web apps that does not break over time would be very ideal. However, similar to above, since there is already an ecosystem on top of IPFS, we would want developers to be able to utilize existing tools. It is hard to define a stable API (more stable than IPFS itself) without a consensus on what tools and functionalities developers need for a production ready p2p web app.
- Scoping each web app to its own MFS root sounds like a great idea. Is there more information on how this was designed and implemented?
How to expose the API
- One reason I think window.ipfs would be a good way to expose the API is that IPFS Companion was injecting the API this way, so a p2p web app could run in this browser or in other browsers with IPFS Companion installed. Since IPFS Companion has stopped injecting the API this way, have alternative methods been proposed? It will be hard for developers to write web apps that can only run in one experimental browser.
- You mentioned that injecting the API under window.ipfs proved to be highly problematic when done on the regular web. Can you explain more about what those problems are? For example, if we put the HTTP host address under window.ipfsHttpAddr, which applications can use directly, do similar problems exist?
- Dat Service for Beaker Browser proposed by @Gozala looks amazing. For the PoC, an easier path should be to define the API used by the application and have the browser itself provide it, so that the implementation is not restricted to JavaScript. Curious to learn about different ways of allowing users to bring their own services/API.

> Scoping each web app to its own MFS root sounds like a great idea. Is there more information on how this was designed and implemented?

It was implemented in userland, as an experiment in the browser extension. Before passing arguments to the IPFS API, JS code modified them to ensure that the app could "see" only its own MFS root, which in reality was a directory in /dapps/<hostname>/.
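To make the idea concrete, the argument rewriting could look something like the sketch below. This is an illustration written for this discussion, not IPFS Companion's actual code: the function name, the hostname check and the error messages are all assumptions. It also only covers a single path argument, which is exactly the kind of gap that makes userland scoping fragile.

```typescript
// Map a path requested by an app onto its private MFS root /dapps/<hostname>.
// Rejects paths that try to climb above that root with "..".
function scopeMfsPath(hostname: string, requestedPath: string): string {
  // Assumption: hostnames contain only letters, digits, dots and hyphens.
  if (!/^[a-z0-9.-]+$/i.test(hostname) || hostname.indexOf("..") !== -1) {
    throw new Error("invalid hostname");
  }
  const root = `/dapps/${hostname}`;
  const segments: string[] = [];
  for (const segment of requestedPath.split("/")) {
    if (segment === "" || segment === ".") continue; // skip empty and no-op parts
    if (segment === "..") {
      if (segments.length === 0) {
        throw new Error("path escapes the app's MFS root");
      }
      segments.pop();
    } else {
      segments.push(segment);
    }
  }
  return segments.length === 0 ? root : `${root}/${segments.join("/")}`;
}
```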

The problem with scoping/sandboxing in userland is that it is effectively a denylist instead of safelist, and this type of approach to security is asking for trouble: a bug or API change can silently compromise this type of sandboxing. It happened before.

Without upstream support for scoped MFS this approach is too risky and should not be a foundation for building stuff. Right now it is better to be honest and just ask user if they grant full admin access to specific commands without trying to sandbox them.

> Since IPFS Companion has stopped injecting the API this way, have alternative methods been proposed? A stable API dedicated for web apps that does not break over time would be very ideal.

@Gozala is looking into this problem space in js-dag-service project. See "why?" here and API design discussion in https://github.com/ipfs/js-dag-service/issues/338.

> You mentioned that injecting the API under window.ipfs proved to be highly problematic when done on the regular web. Can you explain more about what those problems are?

Injecting big JS payload on every page degraded browsing performance (we made some optimizations, but still..)
Introducing new window attribute provides additional bit that can be used for fingerprinting and tracking users
The true deal-breaker was that the injected API was not a stable API to expose on the web. js-ipfs-http-client follows semver; window.ipfs did not. Once in a while the API client has a major release with breaking changes. When ipfs-companion updates its js-ipfs-http-client dependency, it may break apps that relied on window.ipfs behaving in a way that changed in the latest version (for example, when it switched to async iterators, and later when ipfs.add was split into add and addAll)

> For example, if we put the HTTP host address under window.ipfsHttpAddr, which applications can use directly, do similar problems exist?

This is a much safer approach than injecting the JS API under window.ipfs, because it is then up to the website developer to decide which version of js-ipfs-http-client is used. When a breaking change occurs in the upstream JS library, they are in control and can migrate when they want. No app breakage due to unexpected API changes.
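From the app's side, consuming such an address could be as small as the sketch below. `window.ipfsHttpAddr` is only the name floated in this thread, not an existing convention, and the helper name is made up. The fallback is go-ipfs's default API address, and /api/v0/ is the HTTP API's path prefix.

```typescript
// Shape of the (hypothetical) property a p2p browser might set on window.
interface WindowWithIpfsAddr {
  ipfsHttpAddr?: string;
}

// go-ipfs listens for API requests on this address by default.
const DEFAULT_API_ADDR = "http://127.0.0.1:5001";

// Build the URL of an HTTP API command, preferring the browser-provided
// address and falling back to the default local node.
function resolveApiUrl(
  win: WindowWithIpfsAddr,
  command: string,
  fallback: string = DEFAULT_API_ADDR
): string {
  const base = (win.ipfsHttpAddr || fallback).replace(/\/+$/, "");
  return `${base}/api/v0/${command}`;
}
```

In a real app the resolved address would more likely be handed to whichever js-ipfs-http-client version the app bundles, which is what keeps the app in control of API upgrades.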

If we think about this from the perspective of a custom p2p browser, the UX could be quite nice. Instead of putting the true API endpoint there, your p2p browser could act as a proxy to it, exposing its own port that acts as middleware. This approach would have some interesting properties:

p2p browser could display native UI for granting access to a subset of the API (eg. deny access to changing config or reading full MFS)
proper CORS headers could be set by this middleware API proxy to grant access to specific Origin safelisted by user. This would remove the need for user to manually modify their config.
if user has no go-ipfs running, you can spawn it on the first use (ipfsd-ctl tool is handy for daemon orchestration)
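The first two properties boil down to one decision per request: is this Origin allowed to call this command, and if so, which CORS headers go on the response? A sketch of that decision as a pure function follows. The `Grant` shape and the function name are assumptions; a real proxy would also have to forward the request and handle preflight requests.

```typescript
// What the user granted to one web origin, e.g. via a native permission prompt.
interface Grant {
  origin: string;
  commands: string[]; // HTTP API commands such as "add", "cat", "files/read"
}

interface Decision {
  allowed: boolean;
  headers: Record<string, string>; // CORS headers to attach when allowed
}

// Decide whether a request from `origin` to `requestPath` may pass the proxy.
function decide(grants: Grant[], origin: string, requestPath: string): Decision {
  const denied: Decision = { allowed: false, headers: {} };
  const prefix = "/api/v0/";
  if (requestPath.indexOf(prefix) !== 0) return denied;
  // Strip the prefix and any query string to get the command name.
  const command = requestPath.slice(prefix.length).replace(/\?.*$/, "");
  const grant = grants.find((g) => g.origin === origin);
  if (grant === undefined || grant.commands.indexOf(command) === -1) {
    return denied;
  }
  return {
    allowed: true,
    headers: { "Access-Control-Allow-Origin": origin, Vary: "Origin" },
  };
}
```

Because this is an allowlist keyed by Origin, a command the user never granted (for example anything under config) is denied by default, and the node's own CORS configuration never has to be edited by hand.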

> Right now it is better to be honest and just ask user if they grant full admin access to specific commands without trying to sandbox them.

Yes I agree, if we cannot guarantee security yet, we can first make the risk transparent 😜

> The true deal-breaker was that the injected API was not a stable API to expose on the web. js-ipfs-http-client follows semver; window.ipfs did not. Once in a while the API client has a major release with breaking changes.

This is a great point. Exposing an HTTP endpoint is also generalizable, and we can add support for other protocols in a similar fashion if needed, e.g. libp2p, Dat, Ethereum, SSB.

> If we think about this from the perspective of a custom p2p browser, the UX could be quite nice. Instead of putting the true API endpoint there, your p2p browser could act as a proxy to it, exposing its own port that acts as middleware.

Yes indeed. I think the browser should act as an access control layer for the underlying protocols, which can include identity-related protocols and data, and otherwise provide the protocols as-is. The corresponding UX could feel very similar to mobile devices, on which applications are installed and can request access to different types of resources.

hypha-network / hypha-desktop

Redesigning browser for the decentralized web #15