ipfs / kubo

An IPFS implementation in Go
https://docs.ipfs.tech/how-to/command-line-quick-start/

Standard URI for ipfs and ipns protocols (Discussion) #1678

Closed larsks closed 5 years ago

larsks commented 8 years ago

I would like to add ipfs support to a tool that expects a URL-format specification. Hypothetically, let's say I wanted to add ipfs support to curl. I would need a scheme:data format specification that follows the standard url format.

I asked about this on irc and immediately folks started trying to direct me away from URLs to the multiaddr spec. Setting aside for the moment that I'm not clear what problem multiaddr is trying to solve or why URLs aren't appropriate, some tools will simply require a URL format to operate.

In the absence of any other suggestions, I would like to suggest that we document the following standard forms:

willglynn commented 8 years ago

I strongly agree that IPFS objects should be identifiable by URI, mostly because of uniformity as described by RFC 3986:

  Uniformity provides several benefits.  It allows different types
  of resource identifiers to be used in the same context, even when
  the mechanisms used to access those resources may differ.  It
  allows uniform semantic interpretation of common syntactic
  conventions across different types of resource identifiers.  It
  allows introduction of new types of resource identifiers without
  interfering with the way that existing identifiers are used.  It
  allows the identifiers to be reused in many different contexts,
  thus permitting new applications or protocols to leverage a pre-
  existing, large, and widely used set of resource identifiers.

I don't care if go-ipfs uses URIs internally, or if browsers will support it, or anything like that – there should be a canonical, standard way to refer to IPFS objects using URIs.

The above suggestion seems entirely reasonable to me:

ipfs:QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme

I could also see an argument for IPFS identifiers being URNs per RFC 2141:

urn:ipfs:QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme

IPNS is separate. I could see an argument for it being a separate scheme, as proposed above:

ipns:QmXfrS3pHerg44zzK6QKQj6JDk8H6cMtQS7pdXbohwNQfK/pages/gpg.md

On the other hand, I could see the IPNS public key hash being a naming authority per RFC 3986.

 Many URI schemes include a hierarchical element for a naming
 authority so that governance of the name space defined by the
 remainder of the URI is delegated to that authority (which may, in
 turn, delegate it further).  The generic syntax provides a common
 means for distinguishing an authority based on a registered name or
 server address, along with optional port and user information.

Authorities are preceded by a double slash, so:

ipns://QmXfrS3pHerg44zzK6QKQj6JDk8H6cMtQS7pdXbohwNQfK/pages/gpg.md

…and if IPNS public key hashes are interpreted as an authority, distinct from the global (no-authority) IPFS paths, IPFS and IPNS could be viewed as two halves of one scheme:

ipfs:QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme
ipfs://QmXfrS3pHerg44zzK6QKQj6JDk8H6cMtQS7pdXbohwNQfK/pages/gpg.md
larsks commented 8 years ago

@willglynn, thanks for your comments.

While working with this in practice, I realized that one may want to provide IPFS gateway information as part of the URL. Again, using the hypothetical example of adding IPFS support to something like curl, I need a way to tell the utility which IPFS endpoint to use. If I'm not running one locally, the utility needs to know where to find an API to fetch ipfs:QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme.

Should this information always be specified external to the URL (e.g., configuration options to the tool)? Or does permitting this in the URL make sense? That would give us something like:

ipfs://<host>:<port>/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme

Where <host> and <port> specify an endpoint, and can be omitted, making the typical URL look like:

ipfs:///QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme
willglynn commented 8 years ago

In my view, IPFS gateway information is external to the URI. I think it's analogous to HTTP/FTP/SOCKS proxy information for which tools like curl are configured using separate parameters or environment variables. An IPFS object should always have the same identifier regardless of how it is accessed, and especially in the case of IPFS, I think identity is more fundamental than location.
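A minimal Python sketch of that separation: the identifier stays fixed and the gateway comes from configuration, much as curl reads proxy settings. The `IPFS_GATEWAY` variable name and the `gateway_url` helper are hypothetical (nothing in this thread defines them), and the `http://localhost:8080` default is only an assumed local gateway address.

```python
import os

# Assumed default: a gateway running on the local machine.
DEFAULT_GATEWAY = "http://localhost:8080"


def gateway_url(ipfs_path, env=None):
    """Map a canonical IPFS path to an HTTP URL on the configured gateway.

    The gateway is read from the hypothetical IPFS_GATEWAY environment
    variable; it is never part of the identifier itself.
    """
    env = os.environ if env is None else env
    if not ipfs_path.startswith(("/ipfs/", "/ipns/")):
        raise ValueError("expected a path starting with /ipfs/ or /ipns/")
    gateway = env.get("IPFS_GATEWAY", DEFAULT_GATEWAY).rstrip("/")
    return gateway + ipfs_path


path = "/ipfs/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme"
local = gateway_url(path, env={})
public = gateway_url(path, env={"IPFS_GATEWAY": "http://gateway.ipfs.io/"})
```

The same `path` yields two different locations, while the identifier itself never changes.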

larsks commented 8 years ago

Yeah, that was mostly my inclination as well. Externally configured it is, then.

willglynn commented 8 years ago

The more I think about it, the more I favor treating IPNS names as a URI authority. Consider:

ip[nf]s://multihash/object

Retrieval would be processed as "resolve IPNS multihash, retrieve object".

http://domain.name/object

Retrieval would be processed as "DNS look up domain.name IN A, connect to resulting IP, HTTP GET /object".

ip[nf]s://<domain.name>/object

Retrieval would be processed as "DNS look up domain.name IN TXT, resolve resulting IPFS/IPNS identifier, retrieve object".

lidel commented 8 years ago

I feel this quote belongs here :-)

I want to remind everyone here that we're not actually limited by any rules. It is of course convenient and nice to work productively with everybody else, but there are certain mistakes we should not continue to make.

I'll give you an example of a break from tradition (or rather... a return to even older tradition). it is a strong goal to mend the rift between UNIX and the Web. That is, "ipfs links" should be exactly the same in both the Web, and UNIX. meaning: /ipfs/<hash>/<path>, NOT ipfs://<hash>/<path> (explicitly disobeying the scheme that the W3C insists on). Luckily for us, this is technically feasible, though it does have its bumps to work around. (and easiest done with a TLD :) ). That's fine for us as the upside of mending part of this awful rift is worth a lot.

– @jbenet in https://github.com/lidel/ipfs-firefox-addon/issues/16#issuecomment-92343901

So the canonical way of representing an IPFS address is /ipfs/<hash>/<path> (at least that was the consensus in April, maybe it changed since then? :).

That being said, IMHO it would not hurt to additionally, and silently, support URLs like ipfs://<hash>/<path> for interoperability with legacy software that can't use /ipfs/<hash>/<path> natively.

whyrusleeping commented 8 years ago

@lidel nothing has changed since that quote :)

willglynn commented 8 years ago

Indeed; IPFS can choose whatever syntax it wants for canonical use. I'm suggesting that there be an official, standardized way of encoding IPFS identifiers into URIs, since it is better to interoperate with URI-centric tools than not, and since it's better to have one way of encoding IPFS identifiers into URIs than multiple incompatible methods.

I would favor ipfs:/ipfs/<hash>/<path> over ipfs://. RFC 3986's URI grammar would permit ipfs:object, ipfs:/object, and ipfs://object, but the first two constructs are interpreted as paths, while the third would be interpreted as an authority with no path. Per RFC 2718:

2.1.2 Improper use of "//" following "<scheme>:"

   Contrary to some examples set in past years, the use of double
   slashes as the first component of the <scheme-specific-part> of a URL
   is not simply an artistic indicator that what follows is a URL:
   Double slashes are used ONLY when the syntax of the URL's <scheme-
   specific-part> contains a hierarchical structure as described in RFC
   2396.  In URLs from such schemes, the use of double slashes indicates
   that what follows is the top hierarchical element for a naming
   authority.  (See section 3 of RFC 2396 for more details.)  URL
   schemes which do not contain a conformant hierarchical structure in
   their <scheme-specific-part> should not use double slashes following
   the "<scheme>:" string.

If the IPFS namespace is one big path hierarchy, then mapping IPFS / to URI ipfs:/ seems appropriate, and conversion to/from URIs is just a matter of a five-character prefix.
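As a rough Python sketch of that mapping, assuming paths need no percent-encoding (which may not hold in general); `path_to_uri` and `uri_to_path` are hypothetical helper names:

```python
PREFIX = "ipfs:"  # the five-character prefix


def path_to_uri(path):
    """Turn a canonical path such as /ipfs/<hash>/<path> into a URI."""
    if not path.startswith("/"):
        raise ValueError("IPFS paths are absolute and start with '/'")
    return PREFIX + path


def uri_to_path(uri):
    """Strip the prefix again; the double-slash (authority) form is rejected."""
    if not uri.startswith(PREFIX + "/") or uri.startswith(PREFIX + "//"):
        raise ValueError("expected a URI of the form ipfs:/...")
    return uri[len(PREFIX):]


uri = path_to_uri("/ipfs/QmPXME1oRtoT627YKaDPDQ3PwA8tdP9rWuAAweLzqSwAWT/readme")
```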

lidel commented 8 years ago

Ok, so as long as this discussion is about interoperability layer, I like the idea of going with a single prefix for all IPFS resources (dropping ipns://).

Starting with <hash> IPFS is hierarchical, so perhaps we should go with

Additionally we could agree to default to ipfs resource if the first segment does not match ipfs nor ipns:

What are your thoughts on this?

willglynn commented 8 years ago

A goal from @jbenet's post:

please don't do this. please please please have identifiers exactly how we have them, everywhere. Simply /ipfs/... and /ipns/....

If this is the objective, then the corresponding URIs should be ipfs:/ipfs/… and ipfs:/ipns/…. URI-centric tools would behave properly with both absolute and relative paths of the above canonical (scheme-less) IPFS form. A resource retrieved from the URI ipfs:/ipfs/hash/A referring to an absolute /ipfs/hash/B would be understood as ipfs:/ipfs/hash/B, which is consistent. Such a resource referring to a relative B would be understood to be ipfs:/ipfs/hash/B, which is also consistent.

Notably, these same references – /ipfs/hash/B and B – would work unchanged whether the underlying resource appears at the filesystem's /ipfs/hash/A, at http://gateway.ipfs.io/ipfs/hash/A, at http://localhost:8080/ipfs/hash/A, or in URI-space at ipfs:/ipfs/hash/A.

Using double-slashes (//) would sacrifice this property, since a resource at ipfs://hash/A referring to /ipfs/hash/B would resolve to ipfs://hash/ipfs/hash/B.
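The resolution behaviour described here can be sketched in Python. Python's `urllib.parse.urljoin` only resolves relative references for a built-in list of schemes, so this sketch merges paths by hand; `resolve` is a hypothetical helper covering a simplified subset of RFC 3986 section 5.2 (path-only references, no query or fragment).

```python
import posixpath
from urllib.parse import urlsplit


def resolve(base, ref):
    """Resolve a path-only reference against a base URI (simplified)."""
    b = urlsplit(base)
    if ref.startswith("/"):
        path = ref  # absolute-path reference replaces the base path
    else:
        # relative reference: replace the last segment of the base path
        path = posixpath.join(posixpath.dirname(b.path), ref)
    path = posixpath.normpath(path)  # collapse "." and ".." segments
    authority = "//" + b.netloc if b.netloc else ""
    return f"{b.scheme}:{authority}{path}"


single_abs = resolve("ipfs:/ipfs/hash/A", "/ipfs/hash/B")  # ipfs:/ipfs/hash/B
single_rel = resolve("ipfs:/ipfs/hash/A", "B")             # ipfs:/ipfs/hash/B
gateway = resolve("http://gateway.ipfs.io/ipfs/hash/A", "/ipfs/hash/B")
double = resolve("ipfs://hash/A", "/ipfs/hash/B")          # ipfs://hash/ipfs/hash/B
```

With the single-slash form the absolute and relative references agree; with the double-slash form the hash is parsed as an authority and the absolute reference lands underneath it.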

lidel commented 8 years ago

Makes sense. This way a rule for legacy layer would be super simple: always add ipfs: as a prefix to the canonical name. That is all.

willglynn commented 8 years ago

Great! Sounds like a workable direction.

There are a couple other possible pain points I can think of when mapping IPFS paths <-> URIs. Mostly, URIs are specified to be a particular subset of US-ASCII and certain characters have specific meanings, so unless IPFS conventions happen to be compatible, we'll need to use percent-encoding to bridge the gap. The specifics depend on some gory details:

btrask commented 8 years ago

I have a proposal for a content addressing URI scheme here: https://bentrask.com/notes/content-addressing.html. I traded emails with Juan Benet about it a year ago when I first wrote it. Obviously I couldn't convince him.

RFC 6920 proposes a different but similar scheme.

It would be nice if the URI scheme was common between different projects (IPFS, Camlistore, my own). A single content address could be resolved over various different systems depending on context.

Edit: I compared four different proposals here.

jbenet commented 8 years ago

Sorry, I'm late to the party.

Thanks very much @lidel for representing my viewpoint :)

I'm going to try responding only to things i think are unresolved. ask again if i missed something

ipfs://<hash>/<path>

I think @lidel elucidated my viewpoint excellently, and i do not need to express again that this is not desired because it complicates things, and forces us to add 2+ protocol identifiers.

In my view, IPFS gateway information is external to the URI. I think it's analogous to HTTP/FTP/SOCKS proxy information for which tools like curl are configured using separate parameters or environment variables. An IPFS object should always have the same identifier regardless of how it is accessed, and especially in the case of IPFS, I think identity is more fundamental than location.

Exactly right. URLs on the HTTP gateways ARE NOT IPFS Paths/URIs (they contain one, in a larger HTTP URL).

I would favor ipfs:/ipfs/<hash>/<path> over ipfs://. RFC 3986's URI grammar would permit ipfs:object, ipfs:/object, and ipfs://object, but the first two constructs are interpreted as paths, while the third would be interpreted as an authority with no path. Per RFC 2718:

Hmmmm, i'm not sure. I understand the spec... it may be ok to consider /ipfs and /ipns as naming authorities for the purposes of UX. Most people who ever see/use URLs always see them in http://. I think we should support all :, :/, ://, but actually redirect all to :// as the canonical one.

Makes sense. This way a rule for legacy layer would be super simple: always add ipfs: as a prefix to the canonical name. That is all.

I like this, maybe we make :/ canonical? it just looks so odd. and people will be weirded out by it... (we must support :// at least)

What characters are permitted in IPFS path segments? UNIX is arguably way too permissive, Windows reserves far more characters plus certain entire filenames, and various other systems are somewhere in the middle. (Edit: am I reading this right? Any byte sequence in a string is a legal path, provided it starts with e.g. /ipfs//?) Do IPFS paths use a specific character encoding, and if so, which? UNIX doesn't, while Windows and OSX do; potentially ill-formed UTF-16 (prompting creation of WTF-8) and a particular normalization of UTF-8 respectively. (Edit: Links seem to be UTF-8. Is this enforced? Where and how?)

Yes, the paths are supposed to be UTF-8 strings. We should be enforcing it, though i dont think it's being enforced atm.
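Assuming paths are UTF-8 strings as stated, one possible way to bridge them into URI-safe ASCII is ordinary percent-encoding. This is a sketch using Python's standard `urllib.parse`, not a description of what go-ipfs does; `encode_path` and `decode_path` are hypothetical helpers.

```python
from urllib.parse import quote, unquote


def encode_path(path):
    # quote() encodes str input as UTF-8 by default; "/" is kept literal
    # so the path hierarchy survives.
    return quote(path, safe="/")


def decode_path(encoded):
    # errors="strict" rejects percent-encoded bytes that are not valid UTF-8.
    return unquote(encoded, errors="strict")


encoded = encode_path("/ipfs/hash/na\u00efve file.txt")
# encoded == "/ipfs/hash/na%C3%AFve%20file.txt"
```

Decoding with `errors="strict"` is one place where the UTF-8 requirement could be enforced, since ill-formed input raises `UnicodeDecodeError` instead of being silently replaced.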

Is the concept of a query string (?foo) meaningful to a resource addressable at an IPFS URI? (I think probably no.)

Not at this time, though it may become relevant.

Is the concept of a fragment (#foo) meaningful to a resource addressable at an IPFS URI? (I think probably yes.)

yes.

I have a proposal for a content addressing URI scheme here: https://bentrask.com/notes/content-addressing.html. I traded emails with Juan Benet about it a year ago when I first wrote it. Obviously I couldn't convince him.

Sorry @btrask :/ -- i just disagree :)

Edit: I compared four different proposals here.

you didn't compare the proper IPFS URIs, which are paths:

/ipfs/QmeeQhGoyMQc7eQWERE88kFFq4WbdVRrjHctZhH1hPHNds/006/mdag.waist.png
/ipns/QmfVrBzjaXjWWfC7UhFrZnnFFMYA1ENPCjzxAtREaQz8MS/006/mdag.waist.png
/ipns/ipfs.io/docs/install

These are all valid in IPFS. Soon we should also have:

/dns/ipfs.io/docs/install

The first component is the protocol "scheme".
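To illustrate treating the first component as the protocol, here is a small Python sketch that splits a path and looks up a handler. The `HANDLERS` table and its descriptions are placeholders invented for the example; real resolvers would fetch content rather than return strings.

```python
def split_path(path):
    """Split '/<protocol>/<root>[/<rest>]' into (protocol, root, rest)."""
    if not path.startswith("/"):
        raise ValueError("path must be absolute")
    parts = path[1:].split("/", 2)
    if len(parts) < 2 or not parts[0] or not parts[1]:
        raise ValueError("expected /<protocol>/<root>[/<rest>]")
    rest = parts[2] if len(parts) == 3 else ""
    return parts[0], parts[1], rest


# Placeholder handlers keyed by the first path component.
HANDLERS = {
    "ipfs": "content-addressed object",
    "ipns": "mutable name",
    "dns": "DNS name",
}


def describe(path):
    protocol, root, rest = split_path(path)
    if protocol not in HANDLERS:
        raise ValueError(f"no handler for /{protocol}")
    return f"{HANDLERS[protocol]} {root}, sub-path {rest!r}"


example = split_path("/ipns/ipfs.io/docs/install")
```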


One remaining thing to address: the protocol scheme

I've known for some time now that we're going to need to have + support a protocol scheme identifier, for all the use cases that absolutely require one. Instead of using only ipfs: for everything, I'm planning to use something that's valid for the entire "Unix Web", that is, a suite of protocols that want to work both on the web and on unix and want the "same identifier" niceness.

I think we should use one of:

unixweb:
nixweb:
nix:
x:

As in:

x:/ipfs/QmeeQhGoyMQc7eQWERE88kFFq4WbdVRrjHctZhH1hPHNds/006/mdag.waist.png
x:/ipns/QmfVrBzjaXjWWfC7UhFrZnnFFMYA1ENPCjzxAtREaQz8MS/006/mdag.waist.png
x:/ipns/ipfs.io/docs/install
x:/dns/ipfs.io/docs/install
x:/bitcoin/<bitcoin-txn>
x:/bittorrent/<magnet-hash>

but happy to hear more suggestions. I know it's rude to use a one-letter scheme identifier... but hey... nobody else is using it.

btrask commented 8 years ago

I understand Juan. Sorry for the misleading comparison.

If IPFS addresses are paths, what would you think about simply using file:// URLs?

file:///ipfs/QmeeQhGoyMQc7eQWERE88kFFq4WbdVRrjHctZhH1hPHNds/006/mdag.waist.png
file:///ipns/QmfVrBzjaXjWWfC7UhFrZnnFFMYA1ENPCjzxAtREaQz8MS/006/mdag.waist.png
file:///ipns/ipfs.io/docs/install
lidel commented 8 years ago

From my experience file:// proved to be problematic:

:+1: Yes, it makes IPFS work out-of-the box with legacy software (as long as you have IPFS filesystem mounted via FUSE driver provided by go-ipfs).

:-1: ..but if you don't have root access and/or can't set up FUSE -- bad luck
:-1: if you use MS Windows or other non-unix system -- bad luck
:-1: a lot of confusion due to "File not found" errors.

IMO file:// should be left as a workaround for people who can set up FUSE on local system and we should come up with a new protocol scheme for canonical use.

As a minimalist I really like the x: scheme described by @jbenet :-)

longears commented 8 years ago

The x:/ipfs/... scheme might be similar enough to a Windows file path (e.g. "drive X", which many people have) that it would be misinterpreted by browsers and auto-converted to file:///X:/ipfs/.... Can someone on Windows check the behavior of browsers when you type that into the URL bar or use it as a link href?

Windows is supposed to use backslashes but forward slashes are often accepted and silently corrected to backslashes.

jbenet commented 8 years ago

@gatesvp or someone else using windows, could you please check what happens with x:// above?

mappum commented 8 years ago

Just tested on Windows, x:/ becomes file:///x:/ (I tried it in the URL bar and as a link href in Chrome). However, strings longer than one character (xx:/) are kept as a protocol.
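The same ambiguity can be illustrated with Python's `ntpath` module, which applies Windows path rules on any platform. This is only an analogy, assuming a one-letter scheme is read like a drive letter; it does not reproduce Chrome's actual logic, and `reads_as_drive` is a hypothetical helper.

```python
import ntpath


def reads_as_drive(uri):
    """True if Windows path rules would split off a drive like 'x:'."""
    drive, _rest = ntpath.splitdrive(uri)
    return drive != ""


one_letter = reads_as_drive("x:/ipfs/hash/file")   # scheme collides with drive X
two_letter = reads_as_drive("xx:/ipfs/hash/file")  # no drive is recognised
```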

jbenet commented 8 years ago

@mappum is that the case even if you install a protocol handler for x:// ?

mappum commented 8 years ago

is that the case even if you install a protocol handler for x:// ?

Yes, just tried adding a protocol handler to the registry for x:, the URL was still transformed.

davidar commented 8 years ago

I'm using both ?foo#bar for the ia book reader, so think they should both be supported :)

also :+1: for minimalism

lidel commented 8 years ago

Hm.. xx: is not bad, but how about xn: ? (uniXNamespace)

jbenet commented 8 years ago

may be worth doing:

nix://
nixweb://

wish unix:// wasn't taken by unix sockets.

davidkwast commented 8 years ago

I'd go with nix://

willglynn commented 8 years ago

I like this, maybe we make :/ canonical? it just looks so odd. and people will be weirded out by it... (we must support :// at least)

nix:/ipfs/base58/resource parses as { scheme: "nix", authority: null, path: "/ipfs/base58/resource" }. A single slash denotes an absent authority followed by an absolute path. I think this matches the intent of IPFS (a single global namespace using absolute paths assigned by no central authority), which is why I suggest it as the canonical form.

nix://ipfs/base58/resource parses as { scheme: "nix", authority: "ipfs", path: "/base58/resource" }. This breaks IPFS absolute paths since the first path component has moved into the authority part of the URI. I think that makes this a non-starter.

nix:///ipfs/base58/resource parses as { scheme: "nix", authority: "", path: "/ipfs/base58/resource" }. Triple slashes denote an empty authority followed by an absolute path, which is equivalent enough to the no-authority URI that it's not wrong. :/// should be supported, either in addition to or in lieu of :/. Note also that at least one library conflates these two address forms.
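These three parses can be checked with any generic URI parser. The sketch below uses Python's `urllib.parse.urlsplit` simply because it is readily available; it reports an absent authority and an empty authority identically (as an empty string), so it cannot tell `:/` from `:///`.

```python
from urllib.parse import urlsplit


def parts(uri):
    """Return (scheme, authority, path) as seen by urlsplit."""
    s = urlsplit(uri)
    return (s.scheme, s.netloc, s.path)


single = parts("nix:/ipfs/base58/resource")    # ("nix", "", "/ipfs/base58/resource")
double = parts("nix://ipfs/base58/resource")   # ("nix", "ipfs", "/base58/resource")
triple = parts("nix:///ipfs/base58/resource")  # same tuple as `single`
```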

jbenet commented 8 years ago

Yesterday we had thought of using get://, but that's not good for non-read functionality. writes, etc.

some more:

nix:// nixweb:// unixweb:// uweb:// dweb:// path:// endpoint:// ep:// unixpath:// up:// fp:// web3://  

nix://ipfs/base58/resource parses as { scheme: "nix", authority: "ipfs", path: "/base58/resource" }. This breaks IPFS absolute paths since the first path component has moved into the authority part of the URI. I think that makes this a non-starter.

I dont think it does, the browser tools could undo that change for the user.

my problem with :, :/, and :/// is that it's not what 90% of users will expect to see. regular users have no idea what the hell all of these are for, but they do know http:// and that's what they're going to type. so we have to support it regardless.

jbenet commented 8 years ago

a worry with nix:// is that it means

  • noun 1. nothing.
    • "apart from that, nix"
  • exclamation 1. expressing denial or refusal.
    • "“I owe you some money.” “Nix, nix.”"
  • verb (NORTH AMERICAN) 1. put an end to; cancel.
    • "he nixed the deal just before it was to be signed"

which is not ideal. this is the sort of thing 99% of internet users will be confused by, so it should be as clear as we can make it

willglynn commented 8 years ago

Attaching IPFS / to :// does break references.

Picture a /ipfs/base58/document referring to related-resource, /ipfs/otherbase58/linked-document, and /ipns/domain.name/. If IPFS were mounted to a UNIX filesystem, these resolve as:

These work as expected – i.e. as paths.

If that same document were retrieved from http://gateway.ipfs.io/ipfs/base58/document, those references become:

These work as expected because gateway.ipfs.io is treated as an authority and the IPFS paths are treated as URI paths.

If that same document were retrieved from foo:/ipfs/base58/document or foo:///ipfs/base58/document, those references become:

Again these work because the authority is constant, and the IPFS paths are mapped to URI paths.

If that same document were retrieved from foo://ipfs/base58/document those references become:

Please do not make foo://ipfs/base58/document the canonical IPFS URI format.

Can this be worked around client-side? Yes, but there are many more clients that assume paths are paths than there are IPFS implementations that would assume something different. I don't want to patch wget and curl and Heritrix and Scrapy and every other tool I use that follows links to have special awareness of IPFS paths just because users are used to typing foo://bar into a browser window.

the browser tools could undo that change for the user

If users typing in foo://bar is an important enough use case to suggest browser tools dedicated to fixing it, then those browser tools should redirect foo://bar to foo:/bar or foo:///bar instead, rather than trying to support base URIs of foo://bar.
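A sketch of such a redirect in Python, assuming the single-slash form is the target: whatever was parsed as the authority is moved back to the front of the path. `redirect_target` is a hypothetical helper, not an existing browser or IPFS API.

```python
from urllib.parse import urlsplit


def redirect_target(uri):
    """Rewrite foo://a/b, foo:///a/b or foo:a/b to the form foo:/a/b."""
    s = urlsplit(uri)
    path = s.path
    if s.netloc:
        # foo://ipfs/hash: "ipfs" was parsed as the authority, so put it
        # back as the first path component.
        path = "/" + s.netloc + path
    if not path.startswith("/"):
        path = "/" + path
    out = f"{s.scheme}:{path}"
    if s.query:
        out += "?" + s.query
    if s.fragment:
        out += "#" + s.fragment
    return out


typed = redirect_target("nix://ipfs/base58/document")
```

Once redirected, the document's base URI has no authority, so relative and absolute references resolve as plain paths.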

jbenet commented 8 years ago

Please do not make foo://ipfs/base58/document the canonical IPFS URI format.

we have to make this work. it's not an option. 99% of people on the internet will try it. I believe that we can teach the browser's nix protocol resolver how to make it work. it may be hacky, but it will prevent massive confusion. (try explaining to your grandmother why foo: and foo:/ and foo:/// work but not foo://, which is, coincidentally, everything she may be used to).

I will add that i understand your post well. im saying we have to work around the limitations.

jbenet commented 8 years ago

don't want to patch wget and curl and Heritrix and Scrapy and every other tool I use that follows links to have special awareness of IPFS paths

None of these know what to do with an ipfs path at all. they will have to be modified, or delegate to the OS protocol handler, which is precisely where we can put a workaround.

willglynn commented 8 years ago

I'm in favor of modifying those tools to be aware of IPFS as a URI scheme. I'm against making tools aware that an apparent reference to foo://ipfs/ipns/domain.name needs to be understood as foo://ipns/domain.name – along with every other special case that needs to be introduced if IPFS / is foo:// – and I'm against defining a URI scheme that requires second-guessing.

Are users really going to enter IPFS URIs by hand, base58 hashes and all? If not, what are we actually trying to support? ipfs://ipns/domain.name/?

jbenet commented 8 years ago

people may copy paste "/ipfs/<hash>/..." and add nix:// beforehand.

and yes, nix://ipns/domain.name

Mithgol commented 8 years ago

Suggestion 1: scheme separation. Make a separate URL scheme for IPNS: ipns://domain.name/ (with optional slashes, so that even ipns:domain.name is also supported). It simplifies the form of the following suggestions.

Suggestion 2: the shortest default. Drop the (now) redundant part from ipfs://ipfs/… and thus make ipfs:hash or ipfs:hash/somePath the default URL scheme for IPFS files and directories. (Here “the default” means that this scheme must be used in “Copy this document's URL”, “Copy Image Location” and other similar places.) Also support a double-slashed version ipfs://hash or ipfs://hash/path.

Suggestion 3: the lame browsers' crutch. While not making it default, also support the ipfs://ipfs/ipfs/hash / ipfs://ipfs/ipfs/hash/path form. (Here ipfs appears thrice: as a protocol, as a fake hostname and as the first part of the path.) Pronounce that if an IPFS browser does not use its own (“true IPFS”) converter of relative URLs (in IPFS documents) to absolute IPFS URLs, it must use this (highly redundant) form as the document's address (base URL for its relative URLs). For the same reason, ipfs://ipfs/ipns/domain.name URLs should open ipns:domain.name resources.

Suggestion 4: the IPFS→Web gates. Web sites that display IPFS resources should use http://example.org/ipfs/hash or http://example.org/ipfs/hash/path form.

Expected results. From ipfs:hash/dir/doc the possible relative URLs will lead to:

| URL's medium | base URL | relative URL | resulting absolute URL |
| --- | --- | --- | --- |
| FUSE | /ipfs/hash/dir/doc | aFile | /ipfs/hash/dir/aFile |
| FUSE | /ipfs/hash/dir/doc | /ipfs/otherHash/img | /ipfs/otherHash/img |
| FUSE | /ipfs/hash/dir/doc | /ipns/domain.name | /ipns/domain.name |
| gate | http://ipfs.io/ipfs/hash/dir/doc | aFile | http://ipfs.io/ipfs/hash/dir/aFile |
| gate | http://ipfs.io/ipfs/hash/dir/doc | /ipfs/otherHash/img | http://ipfs.io/ipfs/otherHash/img |
| gate | http://ipfs.io/ipfs/hash/dir/doc | /ipns/domain.name | http://ipfs.io/ipns/domain.name |
| lame browser | ipfs://ipfs/ipfs/hash/dir/doc | aFile | ipfs://ipfs/ipfs/hash/dir/aFile |
| lame browser | ipfs://ipfs/ipfs/hash/dir/doc | /ipfs/otherHash/img | ipfs://ipfs/ipfs/otherHash/img |
| lame browser | ipfs://ipfs/ipfs/hash/dir/doc | /ipns/domain.name | ipfs://ipfs/ipns/domain.name |
| true IPFS browser | ipfs:hash/dir/doc | aFile | ipfs:hash/dir/aFile |
| true IPFS browser | ipfs:hash/dir/doc | /ipfs/otherHash/img | ipfs:otherHash/img |
| true IPFS browser | ipfs:hash/dir/doc | /ipns/domain.name | ipns:domain.name |

Mithgol commented 8 years ago

You may introduce nix: later to get the following:

| URL's medium | base URL | relative URL | resulting absolute URL |
| --- | --- | --- | --- |
| lame browser | nix://nix/ipfs/hash/dir/doc | aFile | nix://nix/ipfs/hash/dir/aFile |
| lame browser | nix://nix/ipfs/hash/dir/doc | /ipfs/otherHash/img | nix://nix/ipfs/otherHash/img |
| lame browser | nix://nix/ipfs/hash/dir/doc | /ipns/domain.name | nix://nix/ipns/domain.name |
| true IPFS browser | nix:ipfs/hash/dir/doc | aFile | nix:ipfs/hash/dir/aFile |
| true IPFS browser | nix:ipfs/hash/dir/doc | /ipfs/otherHash/img | nix:ipfs/otherHash/img |
| true IPFS browser | nix:ipfs/hash/dir/doc | /ipns/domain.name | nix:ipns/domain.name |

Cons:

1) URLs are longer.

2) A catch-all URL handler has to be an umbrella-type handler, i.e. the new URL subtype handlers have to be registered within it (using some new and unknown API) instead of being registered in the browser/OS as top-level URL schemes (using some old and well-known API).

longears commented 8 years ago

Be aware of the OS and package manager already named "Nix" which has been growing in popularity lately. Might not matter. http://nixos.org/

davidar commented 8 years ago

Yesterday we had thought of using get://, but that's not good for non-read functionality. writes, etc.

@jbenet I didn't realise the (legacy) URI format needed to support writes? How would that work? I thought the main purpose was to allow people to view files in current browsers, and writes would require a proper library like ipfs.js, which would presumably use paths rather than URIs?

jbenet commented 8 years ago

@davidar imagine an HTTP POST to an IPNS name, or something. it may be able to work. and this muxing handler is meant to be more general than IPFS, so it may include other protocols with write, etc.

@Mithgol the one thing you're missing in your suggestions is that the whole point is to always include the full canonical ipfs path. Always include all of

/ipfs/<hash>/<path>
/ipns/<name>/<path>

including starting / and protocol identifier (ipfs, ipns). the point is to preserve unix compatibility with the exact same identifiers so that humans can trivially easily copy and paste without edit. (prefix additions being the only allowable crutch)

davidar commented 8 years ago

@jbenet How about at://?

Or to://

Also ref://

Mithgol commented 8 years ago

@jbenet Then, with prefix additions being the only allowable crutch (and unix compatibility being more preferred than shorter URLs), I agree with @davidar that the two-letter URL scheme names are the best. (Because, as @mappum has discovered, schemes can't be any shorter than that.)

However, instead of at:// or to:// (which are good, but too general), I suggest fs: to indicate that a new filesystem type is accessed. (Double-slashed fs:// should also be supported, but not as the default. It should be usual for only a single slash, starting a Unix path, to follow fs:.)

The above table of URLs in browsers would then take the following form:

| URL's medium | base URL | relative URL | resulting absolute URL |
| --- | --- | --- | --- |
| lame browser | fs://fs/ipfs/hash/dir/doc | aFile | fs://fs/ipfs/hash/dir/aFile |
| lame browser | fs://fs/ipfs/hash/dir/doc | /ipfs/otherHash/img | fs://fs/ipfs/otherHash/img |
| lame browser | fs://fs/ipfs/hash/dir/doc | /ipns/domain.name | fs://fs/ipns/domain.name |
| true IPFS browser | fs:/ipfs/hash/dir/doc | aFile | fs:/ipfs/hash/dir/aFile |
| true IPFS browser | fs:/ipfs/hash/dir/doc | /ipfs/otherHash/img | fs:/ipfs/otherHash/img |
| true IPFS browser | fs:/ipfs/hash/dir/doc | /ipns/domain.name | fs:/ipns/domain.name |

The latter URLs become 4 characters longer than in my first suggestion (which was designed for URL shortness):

Well, four characters are not much.

jbenet commented 8 years ago

Great work! i really like all of these:

at:
to:
ref:
fs:

i think they're all great. from a cs standpoint, i like fs:// most, but from watching humans dictate urls, i like at: ref: to:. (at and to, have pronunciation confusion with @ and 2, but it may be ok).

davidar commented 8 years ago

From a cs standpoint, i like fs:// most, but from watching humans dictate urls, i like at: ref: to:

That was my thought too. HTTP is an awful scheme name, something that can actually be pronounced easily (and conjures roughly the right idea in nontechnical people) would be much better.

wscott commented 8 years ago

BTW, to me nix:// looks like it refers to https://nixos.org/, which uses a special /nix filesystem to save packages, and would combine with ipfs very nicely.

On Tue, Sep 15, 2015 at 2:58 AM, David A Roberts notifications@github.com wrote:

From a cs standpoint, i like fs:// most, but from watching humans dictate urls, i like at: ref: to:

That was my thought too. HTTP is an awful scheme name, something that can actually be pronounced easily (and conjures roughly the right idea in nontechnical people) would be much better.

davidar commented 8 years ago

@wscott agreed, nix:// would likely be confusing

larsks commented 8 years ago

I would suggest that any URL scheme other than ipfs: would be extremely confusing and more likely to lead to conflicts down the road. Why are we resisting using the protocol name as the url prefix, which is the bog standard behavior for just about everything else (like http: and ftp: and rtsp: and imap:...)?

BrendanBenshoof commented 8 years ago

If we are going to defy the standard, let's do it elegantly:

I propose

://
jbenet commented 8 years ago

Lars, we have multiple protocols, and we want to support more. Read the full thread for more. 

On Tue, Sep 15, 2015 at 11:08 AM, BrendanBenshoof notifications@github.com wrote:

If we are going to defy the standard, lets do it elegantly: I propose

://

larsks commented 8 years ago

@jbenet I saw your post above on that topic, but I don't see any examples of what you mean. What do you mean by "multiple protocols"? The suggestions I've seen so far (nix:, x:, etc) seem destined to cause confusion.

There are examples out there of existing URL schemes for protocol combinations (like qemu+tcp:// vs qemu+ssh:// for libvirt connections, or git+http:// or git+ssh:// for providing pip with information about a git repository...), but I don't know if any of these would be appropriate since it's not clear what your goal is.

jbenet commented 8 years ago

i dont have time to keep repeating myself. my goal is clear enough: https://github.com/ipfs/go-ipfs/issues/1678#issuecomment-139339562 i don't want to be rude and not answer you, so if i ask to read up, please try once more, or look elsewhere for the same information. if i keep answering the same questions over and over, i can't push forward in other pressing things.

The goal: build protocols which bridge unix and the web, by using only path identifiers.

the w3c-recommended protocol scheme discussed here (<something>:) is a workaround for all protocols which share this goal. (/ipfs, /ipns, /dns, /ip4, /ip6, /tcp, /udp for starters, but i have many more in mind: /bittorrent, /bitcoin, /9p ...).

qemu+tcp:// vs qemu+ssh:// for libvirt connections, or git+http:// or git+ssh://

this is absurd and needs to stop. it doesn't layer well at all. it's a hack around the inadequacies of the scheme identifier, and we're not going to be limited by it. anyway, i don't want to discuss again these inadequacies, i want to finalize the workaround. i will give a talk soon enough about all this and the reasoning behind this break. if you'd like me to bring it sooner, open an issue on https://github.com/ipfs/notes/

see also https://github.com/jbenet/random-ideas/issues/28

willglynn commented 8 years ago

I agree with @larsks; it would be surprising to call the scheme anything besides ipfs if the only meaningful paths are defined in the IPFS spec.

RFC 7595 offers some guidance, including:

3.8.  Scheme Name Considerations
…
Schemes SHOULD NOT use names that are either very general purpose or
associated in the community with some other application or protocol.
Schemes also SHOULD NOT use names that are overly general or
grandiose in scope (e.g., that allude to their "universal" or
"standard" nature).

A scheme name is not a "protocol."  However, like a service name as
defined in Section 5 of [RFC6335], it often identifies a particular
protocol or application.  If a scheme name has a one-to-one
correspondence with a service name, then the names SHOULD be the
same.

4.  Guidelines for Provisional URI Scheme Registration
…
 o  If no permanent, citable specification for the scheme definition
    is included, credible reasons for not providing it SHOULD be
    given.

It sounds like there is a master plan to unify many things into unixweb, and I look forward to reading such a proposal. Until then, I think we should narrow the scope of this issue and hammer out a common way to refer to IPFS objects using URIs.