ipfs / specs

Technical specifications for the IPFS protocol stack
https://specs.ipfs.tech

Easy-to-implement gateway-less IPFS URLs #192

Open anguslees opened 5 years ago

anguslees commented 5 years ago

(I am new to IPFS contribution and have no idea which repo/forum I should use for this. Please move as appropriate.)

Problem: I want to point all my tools/configs at content stored in IPFS because IPFS is awesome. It is not feasible (now, or perhaps ever) to add a full dhtclient node implementation to every tool, particularly across every programming language.

Non-solution: I can use HTTP gateway URLs, but then my configs are not portable across sites. I want to use my gateway at my site (because caching is important for large content), but that gateway host doesn't make sense for other sites.

Implication: I think we need to start using "real" (gateway-less) IPFS URLs (ipfs:/...) now for portability, and invent a trivial-to-implement "client" that can fetch IPFS content using libraries that are already commonly available in programming languages. Importantly, it should be feasible to propose adding this trivial-to-implement client to every tool that wants to fetch content.

Note that once we have this, we can evolve the actual client implementations over time without changing the high-level IPFS semantics exposed to users.

Proposal: I think the only "trivial" solution available right now is converting IPFS URLs to HTTP(S) gateway URLs and then reusing existing ubiquitous HTTP(S) libraries. I propose we define a semi-standard way to "discover" the local IPFS gateway, so that implementations can do a trivial URL string manipulation to rewrite ipfs:/path as $gateway/path (with some protection against "../" attacks).

Concrete proposal:

  1. Declare a semi-standard IPFS_GATEWAY environment variable to specify the HTTP(S) URL prefix.
  2. Fall back to a _ipfs-gateway._tcp DNS-SD TXT lookup for easy site-wide configuration (controversial).
  3. Fall back to https://ipfs.io/ (controversial?).
  4. Fetch content over regular HTTP(S) from the gateway. In particular, we're trusting the discovery result, gateway implementation, and gateway<->client communication to offload content/checksum verification.
  5. Start proposing patches to various non-IPFS projects that take advantage of the above to add support for ipfs:/ URLs.
  6. Profit. (Actually, continue to iterate on the internals of the above when/if "more native" approaches become available, e.g. long-lived Go programs could embed a full dhtclient node.)
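
Steps 1, 3 and 4 above can be sketched in a few lines of Python using only the standard library. This is a rough illustration under stated assumptions, not an agreed design: the function names `discover_gateway` and `to_gateway_url` are hypothetical, only the single-slash `ipfs:/path` form from the proposal is accepted, the DNS-SD lookup from step 2 is skipped, and any query string or fragment is dropped.

```python
import os
from urllib.parse import quote, unquote, urlsplit

# Step 3 of the proposal: public fallback gateway.
DEFAULT_GATEWAY = "https://ipfs.io"


def discover_gateway(environ=None):
    """Return the gateway URL prefix (hypothetical helper).

    Step 1: honour IPFS_GATEWAY if set; otherwise use the public fallback.
    The DNS-SD lookup (step 2) is deliberately left out of this sketch.
    """
    env = os.environ if environ is None else environ
    return (env.get("IPFS_GATEWAY") or DEFAULT_GATEWAY).rstrip("/")


def to_gateway_url(ipfs_url, environ=None):
    """Rewrite ipfs:/path to $gateway/path, rejecting traversal attempts."""
    parts = urlsplit(ipfs_url)
    if parts.scheme != "ipfs" or parts.netloc or not parts.path.startswith("/"):
        raise ValueError(f"expected an ipfs:/path URL, got {ipfs_url!r}")

    # Decode each segment first so that encoded forms such as %2e%2e or %2F
    # cannot smuggle a "../" past the check.
    segments = [unquote(s) for s in parts.path.split("/")[1:]]
    for segment in segments:
        if segment in (".", "..") or "/" in segment or "\\" in segment:
            raise ValueError(f"unsafe path segment in {ipfs_url!r}")

    # Re-encode every segment before appending it to the gateway prefix.
    path = "/".join(quote(s, safe="") for s in segments)
    return f"{discover_gateway(environ)}/{path}"
```

The resulting string can be handed to whatever HTTP(S) client the tool already uses (step 4). As the proposal notes, content verification is then left entirely to the gateway.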

At the moment I have only considered read-only workloads (this is the vast majority of my personal use-cases), but the above should extend to write as well - assuming the gateway allows uploads.

Thoughts? I have a specific tool I would like to add this to, and was hoping to get some consensus on the approach before starting on code.

Stebalien commented 5 years ago

I think this comes down more to "light" clients than "URLs". As you've noticed, that's a tricky problem.

> IPFS content using libraries that are already commonly available in programming languages.

Unfortunately, the current transports/protocols are mostly too complicated to implement "easily". Even parsing IPFS data structures is complicated. We could try leveraging something like WebAssembly, but that's not widely supported yet.

Really, the end-goal is to get everyone to run a single, shared ipfs daemon on every machine. To get there, we're going to need to make it possible to run a very lightweight client.

Part of this is something we call "delegated routing". That is, instead of using the DHT yourself, you use some centralized service and let it find content for you. This feature is primarily targeted at mobile and web users.

> I think the only "trivial" solution available right now is converting IPFS URLs to HTTP(S) gateway URLs and then reusing existing ubiquitous HTTP(S) libraries

FYI, the "IPFS browser companion" will do this for you. That is, it can handle ipfs://Qm... and convert that to a gateway URL. It even has local gateway detection.

> Concrete proposal:

  1. An IPFS_GATEWAY option sounds reasonable. We currently use an IPFS_API variable in several of our projects to achieve the same goal.
  2. Using DNS-SD has some obvious security problems. Unfortunately, given the complexity of the IPFS data structures, even verifying a response from a gateway can be tricky.
  3. Falling back on ipfs.io sounds fine in most cases. We'd probably want to pin the certificates.
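
To give a feel for what verifying a gateway response involves, here is a sketch of only the simplest possible case: a single raw block addressed by a base32 CIDv1 with a sha2-256 multihash. The function names are hypothetical, and it assumes the client has somehow obtained the exact block bytes. Files added as chunked, dag-pb wrapped UnixFS objects do not fit this case, because their CID is not a hash of the file bytes, which is where the complexity mentioned above comes from.

```python
import base64
import hashlib

RAW_CODEC = 0x55   # multicodec code for "raw"
SHA2_256 = 0x12    # multihash code for sha2-256


def read_varint(buf, pos):
    """Decode one unsigned varint from buf at pos; return (value, next_pos)."""
    value = shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return value, pos
        shift += 7


def verify_raw_block(cid, data):
    """Return True if data hashes to the digest embedded in a raw CIDv1."""
    if not cid.startswith("b"):
        raise ValueError("only base32 (multibase prefix 'b') CIDs are handled")

    # Multibase base32 is lower-case and unpadded; b32decode wants the opposite.
    body = cid[1:].upper()
    body += "=" * (-len(body) % 8)
    raw = base64.b32decode(body)

    # CIDv1 layout: <version><codec><hash function><digest length><digest>
    version, pos = read_varint(raw, 0)
    codec, pos = read_varint(raw, pos)
    hash_fn, pos = read_varint(raw, pos)
    length, pos = read_varint(raw, pos)
    digest = raw[pos:pos + length]

    if (version, codec, hash_fn) != (1, RAW_CODEC, SHA2_256):
        raise ValueError("this sketch only handles raw sha2-256 CIDv1")
    if length != 32 or len(raw) != pos + length:
        raise ValueError("malformed multihash")

    return hashlib.sha256(data).digest() == digest
```

Anything beyond this case (other codecs, other hash functions, multi-block files, CIDv0) needs real parsing of the linked data structures.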

anguslees commented 5 years ago

> Really, the end-goal is to get everyone to run a single, shared ipfs daemon on every machine. To get there, we're going to need to make it possible to run a very lightweight client.

Agreed. I want to be clear that the proposal above is an interim measure - something we can do right now, and yet presents a UX that we can still carry forward into that future end-goal.

I think that future end-goal still looks to the user like:

Maybe we do content checksum verification in-process, or we trust the local ipfs daemon to do that for us, but this will be mostly hidden from the user (though it does affect the protocol used between the two).

So... is this agreement (with the DNS-SD part removed)? Is it reasonable to go around adding the above to various tools/languages now, or do we need further discussion or a more formal proposal?