ipfs / notes

IPFS Collaborative Notebook for Research
MIT License

IPFS links on HTTP #326

Open jbenet opened 9 years ago

jbenet commented 9 years ago

IPFS links on HTTP

(( Many people have asked me about this recently, so writing it up to point people directly here )).

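One of the important parts of IPFS adoption will be interfacing cleanly and easily with existing websites on HTTP. The goals here are to:

  • provide IPFS as a data transport for regular browsers
  • enable website developers to use IPFS to stream their content
  • do so with the least user friction (LUF!) possible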

I've been describing a set of layered tools:

  1. HTTP/IPFS Gateways
  2. ipfs.js - full ipfs node on the browser
  3. browser extensions - shared storage
  4. same node as daemon service in the OS
  5. modified browser

(and more ancillary small tools)

1. Anycasted HTTP/IPFS Gateways

The plan is to run a set of servers accessible via ipfs.io or raw.ipfs.io which act as HTTP <--> IPFS gateways. These servers would resolve HTTP URLs like:

http://raw.ipfs.io/<ipfs-path>

The servers would be full IPFS nodes that can resolve <ipfs-path>, retrieve the object, and return it. At least for /ipfs/* paths (not /ipns/* paths), servers could be run by anyone, as the clients can retrieve all hashes and integrity-check the returned blocks.
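The client-side integrity check can be sketched as follows. This is only an illustration, assuming the hash in an /ipfs/ path is a base58-encoded multihash carrying a sha2-256 digest (prefix bytes 0x12 0x20) computed over the serialized block bytes the gateway returns. `b58decode` and `verify_block` are hypothetical helper names, not part of any IPFS library.

```python
import hashlib

# Bitcoin-style base58 alphabet (no 0, O, I, l).
ALPHABET = "123456789ABCDEFGHJKLMNPQRSTUVWXYZabcdefghijkmnopqrstuvwxyz"


def b58decode(text: str) -> bytes:
    """Decode a base58 string into bytes, preserving leading zero bytes."""
    number = 0
    for char in text:
        # str.index raises ValueError on characters outside the alphabet.
        number = number * 58 + ALPHABET.index(char)
    body = number.to_bytes((number.bit_length() + 7) // 8, "big")
    leading_zeros = len(text) - len(text.lstrip("1"))
    return b"\x00" * leading_zeros + body


def verify_block(hash_b58: str, block: bytes) -> bool:
    """Return True if `block` hashes to the sha2-256 multihash `hash_b58`."""
    multihash = b58decode(hash_b58)
    # Assumed layout: 0x12 (sha2-256 code), 0x20 (digest length 32), digest.
    if len(multihash) != 34 or multihash[:2] != b"\x12\x20":
        raise ValueError("unsupported multihash (expected sha2-256)")
    return hashlib.sha256(block).digest() == multihash[2:]
```

Because the check needs only the hash and the returned bytes, a client does not have to trust whoever operates the gateway for /ipfs/* content.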

This would make it possible to start using IPFS directly on the web today, with absolutely no extra code running anywhere but our servers.

Note: actual URLs not clear yet, as we'll probably also expose the (currently developing) ipfs node http api, with paths like:

# return (serialized) raw block
http://ipfs.io/api/<api-version>/block/<ipfs-path>

# return only links table
http://ipfs.io/api/<api-version>/links/<ipfs-path>

# return only data portion
http://ipfs.io/api/<api-version>/data/<ipfs-path>

# return full unixfs file
http://ipfs.io/api/<api-version>/unix/cat/<ipfs-path>

(these are not at all final, just a preview).
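A small helper for building these preview URLs might look like the sketch below. The endpoint names and the ipfs.io host come from the list above; the function name `api_url`, the dictionary keys, and the choice to strip the leading slash of the IPFS path before joining are assumptions made for illustration, and the version string is left for the caller to supply.

```python
API_ROOT = "http://ipfs.io/api"

# Endpoint paths taken from the preview above; they are not final.
ENDPOINTS = {
    "block": "block",   # serialized raw block
    "links": "links",   # links table only
    "data": "data",     # data portion only
    "cat": "unix/cat",  # full unixfs file
}


def api_url(endpoint: str, ipfs_path: str, api_version: str) -> str:
    """Build a preview API URL for `ipfs_path` (e.g. "/ipfs/<hash>/<path>")."""
    if endpoint not in ENDPOINTS:
        raise ValueError(f"unknown endpoint: {endpoint!r}")
    # Assumption: the leading slash of the IPFS path is dropped when joining.
    path = ipfs_path.lstrip("/")
    return f"{API_ROOT}/{api_version}/{ENDPOINTS[endpoint]}/{path}"
```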

2. ipfs.js - full ipfs node on the browser

The second layer is a full JavaScript (node) implementation of IPFS, so that it can run in any web browser. The main p2p transport would be WebRTC (though it could still use websockets, and potentially even fall back to (ugly) polling or layer 1 described above).

In addition to a full node, we can offer a small library that rewrites objects (as they are being rendered) with ipfs links (/ip{f,n}s/<hash>/<path>) as src attributes, to load via IPFS. PeerCDN and other tools have demonstrated this p2p-browsing to work very well.
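The rewriting idea can be sketched outside the browser too. The example below is not the proposed library: it swaps in the simplest possible stand-in, rewriting `src` attributes that hold /ipfs/ or /ipns/ paths so they point at an HTTP gateway (layer 1) instead of loading through an in-page node. The function name `rewrite_ipfs_srcs` and the regex-based approach are assumptions for illustration; a real implementation would work on the DOM.

```python
import re

# Gateway host taken from layer 1 above.
GATEWAY = "http://raw.ipfs.io"

# Matches src="..." or src='...' whose value starts with /ipfs/ or /ipns/.
_SRC = re.compile(r"""(\bsrc\s*=\s*)(["'])(/ip[fn]s/[^"']+)\2""")


def rewrite_ipfs_srcs(html: str, gateway: str = GATEWAY) -> str:
    """Prefix IPFS-style src attributes in `html` with the gateway origin."""
    base = gateway.rstrip("/")

    def repl(match: "re.Match[str]") -> str:
        prefix, quote, path = match.group(1), match.group(2), match.group(3)
        return f"{prefix}{quote}{base}{path}{quote}"

    return _SRC.sub(repl, html)
```

Attributes that do not start with /ipfs/ or /ipns/ are left untouched, so ordinary HTTP assets keep loading as before.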

3. browser extensions - shared storage

This is the same as Layer 2, but packaged as an extension for browsers, so that all tabs can share the same node (and object cache).

4. same node as daemon service in the OS

If the user has a local IPFS service running, browser tools can use it (either by IPC, or using an HTTP interface exposed by the daemon) to share the object cache. This will be particularly useful once lots and lots of content starts to be shared via IPFS. The web will feel significantly faster if 90% of the static objects are loaded locally (or even regionally).

5. modified browser

We'll eventually submit browser patches to chromium and firefox suggesting the adoption of ipfs as a transport (don't hold your breath).

It's also possible to fork C/FF to provide a working browser with IPFS support (I've always wanted to implement a few browser ideas) but that probably won't happen.

mildred commented 9 years ago

The servers would be full IPFS nodes that can resolve <ipfs-path>, retrieve the object, and return it. At least for /ipfs/* paths (not /ipns/* paths), servers could be run by anyone, as the clients can retrieve all hashes and integrity-check the returned blocks.

Why not allow the HTTP server to resolve/modify the ipns namespace? Is there a specific reason?

cryptix commented 9 years ago

I'd like to second the question. The only real reason I can think of is that you need the private key to publish something, and then you could connect to the network directly..

But I could see a case where you don't have unrestricted access (firewall etc) to the network.

How about a fallback where you use another node as proxy to publish something? You could verify your identity by self-signing your request as a JSON Web Token for example.. But maybe I'm missing something..

jbenet commented 9 years ago

Why not allow the HTTP server to resolve/modify the ipns namespace? Is there a specific reason?

This can certainly be done. The reason I made a distinction is that a malicious HTTP gateway could serve old entries and hide new ones. One would need to query "multiple independent HTTP gateways" to get better certainty of freshness (which is doable but complicated to get right, hence my deferring it to the future). In the proper ipfs Routing system this attack is also possible, but since queries are being resolved over a large network, clients can collect responses from multiple peers before settling on the value of an entry (in a DHT, this is similar to the s/kademlia multi-path query result).
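A rough sketch of the "ask several sources, then settle" step is shown below. It assumes each returned entry carries a monotonically increasing sequence number and can be validated (for example by checking a signature) through a caller-supplied function. The `Entry` type, its fields, and `freshest_entry` are hypothetical names for illustration, not the actual IPNS record format.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, Optional


@dataclass(frozen=True)
class Entry:
    source: str  # which gateway or peer returned this entry
    seq: int     # assumed monotonically increasing version counter
    value: str   # the path the name currently points to


def freshest_entry(
    entries: Iterable[Entry],
    is_valid: Callable[[Entry], bool],
    min_sources: int = 2,
) -> Optional[Entry]:
    """Pick the newest valid entry, requiring answers from several sources."""
    valid = [entry for entry in entries if is_valid(entry)]
    # Refuse to decide if too few independent sources gave a valid answer.
    if len({entry.source for entry in valid}) < min_sources:
        return None
    return max(valid, key=lambda entry: entry.seq)
```

A single stale gateway cannot hide a newer entry as long as at least one other queried source returns it, which is the intuition behind querying multiple independent gateways.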

lidel commented 9 years ago

To address some of the use cases in current browsers:

  • provide IPFS as a data transport for regular browsers
  • enable website developers to use IPFS to stream their content
  • do so with the least user friction (LUF!) possible

in https://github.com/lidel/ipfs-firefox-addon/issues/16#issuecomment-91795879 I started a discussion about possible conventions for browser add-ons to automatically reroute some specific HTTP GETs to a local gateway.

traverseda commented 8 years ago

That seems like a very complicated stack. I think IPFS in the browser has a compelling enough use-case that people will install third party software to use it.

This multi-tier approach would require a lot of dev time, and generally increases the complexity of the whole thing. Are my users using the WebRTC version? Is my main content mirror serving on WebRTC? Etc.

Requiring that people install software to use IPFS has some other benefits: IPFS is at its best when more people are willing to serve content. If you were writing a web app, you'd want to set up a primary replica, something that's guaranteed to be up, but you also want your users mirroring content.

It seems much simpler to have a standardized JS library that first checks localhost, then falls back to a public or paid-for service. I'm a bit unclear as to where a WebRTC/node implementation comes into it.
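The "check localhost first, then fall back" logic is short enough to sketch. The version below is written in Python rather than JS purely for illustration. The candidate list is an assumption (a local daemon gateway on 127.0.0.1:8080, then the public ipfs.io gateway), and `tcp_reachable` and `pick_gateway` are hypothetical names.

```python
import socket
from typing import Callable, Optional, Sequence
from urllib.parse import urlsplit

# Assumed preference order: local daemon gateway first, public gateway second.
CANDIDATES = ("http://127.0.0.1:8080", "http://ipfs.io")


def tcp_reachable(base_url: str, timeout: float = 0.5) -> bool:
    """Return True if a TCP connection to the URL's host and port succeeds."""
    parts = urlsplit(base_url)
    port = parts.port or (443 if parts.scheme == "https" else 80)
    try:
        with socket.create_connection((parts.hostname, port), timeout=timeout):
            return True
    except OSError:
        return False


def pick_gateway(
    candidates: Sequence[str] = CANDIDATES,
    probe: Callable[[str], bool] = tcp_reachable,
) -> Optional[str]:
    """Return the first candidate gateway that the probe reports as usable."""
    for base_url in candidates:
        if probe(base_url):
            return base_url
    return None
```

The probe is injectable so the selection order can be exercised without touching the network, and so a stricter check (for example fetching a known object) can be swapped in.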

Fil commented 8 years ago

Maybe RFC 7838 can help us with this.

The Alt-Svc HTTP Header Field

An HTTP(S) origin server can advertise the availability of alternative services to clients by adding an Alt-Svc header field to responses.