Problem description

The current MultiAddr spec has no good way of dealing with optional protocol parameters that have well-defined defaults. Depending on the specific protocol in question, different workarounds have been proposed, the predominant theme being recursion:

IPv6 link scopes: /ip6/fe00::32/ip6zone/6/…
TLS Server Name Identification: /tls/sni/example.com/…

This has the obvious problems that:

Each protocol must have a special parser which will then greedily swallow up all following components that it considers relevant
All possible such “attribute protocol items” must be reserved to ensure that their names are not used as “regular” / “top-level” protocol names
- As protocols evolve this may also cause nasty conflicts between newly defined attributes and existing protocol names.
- While attribute names may be shared between different protocols they must still be treated as a separate class from top-level protocol names since they may never appear top-level while still sharing a common namespace with that top-level class.
It is not immediately obvious to a human reader which items are items of the previous top-level protocol and which constitute the start of a new encapsulation layer

There also do not seem to be any obvious advantages to this scheme that would make the above problems appear like reasonable trade-offs.

Another proposal suggested in some places (#63) was using plain greediness: After a given protocol item shows up in the path, all further items are swallowed up and used as single “path parameter”:

HTTP: /http/example.com/api/v1 (here example.com is the hostname and /api/v1 the HTTP path base)
WS and WSS: /wss/example.com/api/v1/tls/ws
Unix domain: /unix/path/to/socket.sock/tls/ws

While HTTP arguably is a terminator protocol (meaning that no other protocol may follow it anyways – this notion needs separate discussion!), Unix domain sockets and WebSockets definitely are not. Hence, it is unclear how a parser should figure out that /tls does not refer to a path component and whether this even is the case (the parser would have to proactively probe the file system for this, which is very much not in line with the vision of MultiAddr being a common description of paths to application endpoints; with WebSockets this is not even reliably possible to start with).

The example with WebSockets in particular demonstrates why this cannot work. A suggested alternative was to wrap the path parameter inside some kind of special set of delimiters (different kinds of braces were suggested):

/wss/(/example.com/api/v1)/tls/ws

While this works, it does not take into account the fact that there is nothing usually required about the given parameter: The hostname can usually be inferred from previous protocol levels (and left empty if unknown) and the path may always be empty.

Also potentially relevant data (such as HTTP basic auth) may be missing from the above. By combining the two approaches discussed above we arrive at something similar to the following:

/wss/(/example.com:4443/api/v1)/user/john/password/doh/cookie/bla=blab/tls/ws

Or the following when excluding all attributes:

/wss/()/tls/ws

Neither of these strikes the author as particularly intelligible.

This proposal will not attempt to resolve the issues with Unix domain sockets.

Proposed solution

Summary:

Allow each protocol to carry an arbitrary number of keyword arguments whose meaning is protocol dependent
Deprecate existing attribute protocol items: ip6zone
- (Are there more actually standardized at the moment?)

Text-representation syntax

Extending the current spec, each protocol name may now optionally be followed by an opening parenthesis character (() indicating the start of the protocol parameter list. This is to be followed by an arbitrary number of key-value parameters, each delimited by the comma character (,) and terminated by a closing parenthesis character ()). After this closing character a forward slash (/) is expected. If the parameter list is skipped, the protocol name should immediately be followed by a forward slash (as is currently the case); an empty parameter list (()) is allowed as well.

Each key-value pair consists of a name, made up only of ASCII lower-case characters, ASCII digits and the ASCII minus sign (-), followed by a single equals sign (=), followed by an arbitrary UTF-8 encoded value. The value may contain any character other than the NUL byte, but requires escaping of the following characters using a single backward slash (\) if they are to appear inside the value field: opening (() and closing parenthesis ()), the comma character (,) and the backward slash (\) itself. Most importantly, the forward slash (/) does not need to be escaped since it carries no special significance inside a protocol parameter list; this allows for easy embedding of paths, like in the following example:

/http(host=example.com,base=/api/v1)
/http(base=/endpoint\(1:2\))

More examples:

/tls(sni=example.com)
/ip6(scope=6)/fe00::32/tcp/80/http
/wss(host=example.com:4443,base=/api/v1,user=john,password=doh,cookie=bla=blab)/tls/ws
- Note: The name host here refers to the HTTP Host header and has nothing to do with where the connection will actually be made.
/wss/tls/ws
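
To make the escaping rules concrete, here is a minimal sketch of how a single text component such as `http(host=example.com,base=/api/v1)` could be parsed. The function name `parse_component` is made up for illustration, applying the key-name character set to protocol names is an assumption, and so is letting a repeated key overwrite the earlier one; the proposal does not specify either. Static parameters after the slash are not handled here.

```python
import re

# Keys are limited to ASCII lower-case letters, digits and "-".
# Assumption: protocol names are checked against the same pattern.
NAME_RE = re.compile(r"[a-z0-9-]+")
ESCAPABLE = set("(),\\")  # the four characters that need a backslash


def parse_component(text):
    """Parse 'name' or 'name(key=value,...)' into (name, params)."""
    name, paren, rest = text.partition("(")
    if not NAME_RE.fullmatch(name):
        raise ValueError(f"invalid protocol name: {name!r}")
    if not paren:
        return name, {}

    fields, current, i = [], [], 0
    while i < len(rest):
        ch = rest[i]
        if ch == "\\":
            # A backslash must be followed by one of the escapable characters.
            if i + 1 >= len(rest) or rest[i + 1] not in ESCAPABLE:
                raise ValueError("invalid escape sequence")
            current.append(rest[i + 1])
            i += 2
            continue
        if ch == "(":
            raise ValueError("unescaped '(' inside parameter list")
        if ch in ",)":
            fields.append("".join(current))
            current = []
            if ch == ")":
                break
        else:
            current.append(ch)
        i += 1
    else:
        raise ValueError("missing closing parenthesis")
    if i != len(rest) - 1:
        raise ValueError("trailing data after ')'")

    params = {}
    if fields != [""]:  # "()" is an allowed, empty parameter list
        for field in fields:
            key, eq, value = field.partition("=")
            if not eq or not NAME_RE.fullmatch(key) or "\0" in value:
                raise ValueError(f"invalid parameter: {field!r}")
            params[key] = value  # assumption: a repeated key overwrites
    return name, params
```

Because only the first `=` separates key from value, `cookie=bla=blab` yields the value `bla=blab` without any escaping.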

Each protocol may still accept zero or one static parameters of known or unknown binary length after the final forward slash. It is expected that the use of optional parameters will be minimal in practice (HTTP-y stuff probably being the prominent exception here, not the rule).

(Precise syntax subject to change/bikeshedding!)

Binary-representation syntax

The general format for the binary syntax is:

<BinaryMultiAddr> := (<ProtocolBinary>(<AttributeBinary>*))+

<ProtocolBinary> is the binary MultiAddr representation of the protocol itself and uses the following format:

<ProtocolBinary> := <ProtocolType>([NIL]|<ProtocolValue>|<ProtocolLength><ProtocolValue>)

The format used for the <ProtocolValue> part of the representation depends on the <ProtocolType>:

[NIL] (No value): Used by all protocols with zero static parameters; no value follows and attributes or further protocols may immediately follow.
<ProtocolValue>: Used by all protocols with one static parameter of known binary length; the value, of a length predefined for each protocol type, immediately follows.
<ProtocolLength><ProtocolValue>: Used by all protocols with one static parameter of variable binary length; the <ProtocolLength> is a UVarInt containing the length of the following protocol value.
The mapping between the text and binary representation of the protocol's value may be implemented by an arbitrary protocol-specific function, as long as it is ensured that such transformation may be performed without loss of information with regards to the protocol described. That is, the following constraints must hold:
- text_value ≃ binary2text(text2binary(text_value))
- binary_value ≃ text2binary(binary2text(binary_value))
- ≃ means “must be equal with regards to the constraints imposed by the protocol” – for instance, DNS names are case-insensitive hence a loss of case may be acceptable as this is not considered relevant “information” in this protocol (XXX: find better wording for this).
Due to this definition it is not possible to parse binary MultiAddrs with unknown protocol values.
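
As a rough illustration of these constraints, the following sketch checks both round trips for a DNS-like value. All function names are hypothetical and the codec is deliberately simplified (it is not the actual multiaddr DNS encoding); it only shows how a lossy but acceptable transformation, here case folding, still satisfies ≃.

```python
def dns_text2binary(text_value: str) -> bytes:
    # Hypothetical codec: fold case, since DNS names are case-insensitive.
    return text_value.lower().encode("ascii")


def dns_binary2text(binary_value: bytes) -> str:
    return binary_value.decode("ascii")


def dns_text_equiv(a: str, b: str) -> bool:
    # The "≃" relation for this protocol: equal up to ASCII case.
    return a.lower() == b.lower()


def check_roundtrip(text_value: str) -> bool:
    """Check both constraints for one text value."""
    binary_value = dns_text2binary(text_value)
    text_ok = dns_text_equiv(text_value, dns_binary2text(binary_value))
    binary_ok = binary_value == dns_text2binary(dns_binary2text(binary_value))
    return text_ok and binary_ok
```

Here `Example.COM` comes back as `example.com`, which differs byte for byte but is equal as far as the protocol is concerned.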

<AttributeBinary> is the binary MultiAddr representation of a single protocol attribute and must follow either a protocol binary representation or another attribute. All attributes share a single format:

<AttributeBinary> := [ATTR_TOKEN]<AttributeKey><AttributeLength><AttributeValue>

In this definition:

[ATTR_TOKEN] is a reserved UVarInt indicating the start of an attribute, whose value must never be used for a <ProtocolValue> (TODO: Decide on a value)
<AttributeKey> is a UVarInt from a table of known attribute names. Attributes in this table are not bound to any specific protocol; the table serves only as a look-up table for keeping the binary representation of attributes small.
<AttributeLength> is a UVarInt determining the length of the following <AttributeValue> in bytes.
<AttributeValue> is the UTF-8 encoded text of the attribute's value in the text representation.
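
A minimal sketch of encoding one <AttributeBinary> under this layout follows. The `ATTR_TOKEN` value and the contents of the `ATTR_KEYS` table are placeholders invented for this example, since the proposal leaves both undecided; only the unsigned varint encoding (7 bits per byte, high bit set on all but the last byte) follows the usual UVarInt convention.

```python
ATTR_TOKEN = 0x7F  # placeholder only; the proposal has not picked a value
ATTR_KEYS = {"sni": 1, "host": 2, "base": 3}  # hypothetical look-up table


def uvarint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def encode_attribute(key: str, value: str) -> bytes:
    """[ATTR_TOKEN]<AttributeKey><AttributeLength><AttributeValue>"""
    data = value.encode("utf-8")
    return (
        uvarint(ATTR_TOKEN)
        + uvarint(ATTR_KEYS[key])  # KeyError for names missing from the table
        + uvarint(len(data))       # length in bytes, not in characters
        + data
    )
```

With these placeholder numbers, `sni=example.com` takes 14 bytes: one each for the token, key and length, plus the 11 bytes of the value.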

TODO: Allow storing unknown attributes in binary, whose names are not in the table?

Other requirements

Unexpected parameters should result in an error when trying to instantiate the given protocol and may result in an error during parsing of the given MultiAddr. For each expected parameter there must be a sensible default value, and parameters whose value corresponds to such a default value should be omitted from the textual and binary representations. All parameters must be optional; for mandatory parameters the current /protoname/param syntax should be used instead.
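
To illustrate these requirements, here is a sketch of producing the textual form of one component while rejecting unexpected parameters and dropping values equal to their defaults. The `HTTP_DEFAULTS` table and both function names are invented for this example, and sorting keys alphabetically is an assumption; the proposal does not define an ordering.

```python
# Hypothetical defaults for an "http" protocol; not part of the proposal.
HTTP_DEFAULTS = {"host": "", "base": "/"}


def escape_value(value: str) -> str:
    """Backslash-escape the four characters that are special in values."""
    return "".join("\\" + ch if ch in "(),\\" else ch for ch in value)


def format_component(name: str, params: dict, defaults: dict) -> str:
    unexpected = set(params) - set(defaults)
    if unexpected:
        raise ValueError(f"unexpected parameters for {name}: {sorted(unexpected)}")
    # Omit parameters that merely restate the default.
    kept = [(k, v) for k, v in sorted(params.items()) if v != defaults[k]]
    if not kept:
        return name
    return name + "(" + ",".join(f"{k}={escape_value(v)}" for k, v in kept) + ")"
```

For instance, `format_component("http", {"host": "example.com", "base": "/"}, HTTP_DEFAULTS)` gives `http(host=example.com)`, because `base` is left at its assumed default.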

EDIT 1: Some language improvements + language-change to always call it an “HTTP path base”, since the path only refers to the path bases used to multiplex different HTTP services of a single hostname and not about referring to actual single files

EDIT 2: Added example for escaping

EDIT 3: Specify binary encoding (but specific to the proposal at hand and for what we already have)

@Stebalien @whyrusleeping @mkg20001 @eyedeekay @mwnx @lgierth: I want feedback! :slightly_smiling_face:

@alexander255 I for my part already created something a bit similar, called forward-addr. https://github.com/Teletunnel/specs/blob/master/SPECS.md and the code https://github.com/Teletunnel/forward-addr Even though the code is mostly unutilized (but functional!), the ideas might be an inspiration to the proposal (especially the key->value for protocols and the subprotocol-handling parts). I recommend checking out the specs and the code (the code might be more interesting, as the mostly unfinished specs do not fully highlight what one could do with this address format, such as easily matching incoming traffic by a given set of parameters, and how it's done).

@mkg20001: Not bad!

especially the key->value for protocols

Looking at your linked specs, it appears that the transliteration in terms of textual representation for MultiAddr would be something like this (ignoring sub-protocols for now):

/tls/.sni/example.com
/ip6/fe00::32/.scope/6/tcp/80/http
/wss/.host/example.com:4443/.base/"/api/v1"/.user/john/.password/doh/.cookie/bla=blab/tls/ws
/wss/tls/ws

Comparing that to the previous recursive approach:

Retains the “very much like directory traversal”-style appearance
Fixes the issue of attributes and protocols sharing a namespace although they work fundamentally differently
Still makes it somewhat hard for human readers to figure out what attribute belongs to what

Compared to the proposed syntax: Equivalent in terms of expressive power and parsing properties, just a different syntax. (I like mine more, but that's subjective! Then again, this one has a more path-like structure, which I don't really consider a plus for attributes; others might, though.)

easily matching incoming traffic by a given set of parameters

Maybe I'm missing something, but isn't this part (while cool for the job you envisioned) pretty irrelevant here, since MultiAddr is about establishing connections, not filtering them? Even when using MultiAddr to bind to a port or path on a web server this wouldn't be useful?

subprotocol-handling parts

The only example I could find about this was /http/.path/"/myservice"/_ws/stream where WebSocket will be a sub-protocol for HTTP. Somehow this isn't very convincing to me since, in terms of dialing, I still need to establish a separate connection with special properties (upgrade request) for that “sub-protocol” that has only limited resemblance with other HTTP connections. A connection to …/http/.path/"/myservice" will not be HTTP (but WebSockets) for the client, and a client attempting to use it otherwise would very likely receive errors. It's more like the first two messages of a WebSockets handshake happen to look like HTTP for compatibility reasons but are otherwise unrelated. Looking at the HTTP path parameter itself, thinking about it as the path that will end up being sent over HTTP is misleading: it is actually just a path base parameter that specifies the location (inside HTTP) where the given target application is mounted. The entire space below that path is the dialing endpoint, not just that single path itself. With WS, however, we're actually referring to a single path instead. (I have also updated the language in my proposal when referring to HTTP to make this clearer.)

Maybe I'm missing something, but isn't this part (while cool for the job you envisioned) pretty irrelevant here

Yes, it is. It was created for a separate project (https://github.com/Teletunnel/Teletunnel-Core) which was a proposed improvement to https://telebit.cloud 's config format. But I saw your idea and immediately noticed it had many similarities with that spec.

It's more like the first two messages of a WebSockets handshake happen to look like HTTP for compat reason but are otherwise unrelated.

My reasoning was that the Upgrade header, which should be part of the HTTP and not WebSocket-specific spec (just guessing, though), can be used to tell the client/server that a different connection protocol is being used, starting from that message. And...

A connection to …/http/.path/"/myservice" will not be HTTP (but WebSockets) for the client,

...the message still includes the path. Which I needed to match WebSocket connections for specific paths. Otherwise I would have to write two modules for HTTP, one just for WebSocket upgrade message header parsing so that matching wss://some-host/some-path would work. Unsure if multiaddr needs this.

Oh and I noticed you may have misinterpreted the spec a bit: /wss/.host/example.com:4443/.base/"/api/v1"/.user/john/.password/doh/.cookie/bla=blab/tls/ws would actually be /tcp/.port/443/tls/.sni/example.com/http/.user/john/.password/doh/.cookie/bla=blab/.path/"/api/v1"/_ws. That's where _ws gets useful

Many thanks for writing this up so eloquently, @alexander255. Let me just rant a little bit and dump some rambling thoughts.

Mapping your proposal to challenges we needed to solve:

Knowing where a component starts and where it stops => in your proposal all components are of length 1 (no params) or 2 (with params). The params are clearly demarcated by an opening and a closing char: brackets in the textual form (TBD in the binary one). We'd have to find a way to escape them though.
Attaching multiple parameters/dimensions to components => your proposal addresses this by providing inline maps.
Modelling richer semantics and structures. More below.

One thing that I find unclean about the current multiaddr is that it mixes locators (IP addresses and ports), with network protocols (tcp, udp), with higher level protocols (onion, quic), with application protocols (http), with libp2p facilities (p2p-circuit), with identity assertions (peer ID).

As a result, for a given multiaddr to work you need these components to be assembled in specific recipes which might not be obvious from the get-go (e.g. /p2p-circuit/QmRelay/QmTarget). While the components are self-describing in themselves, the overall meaning of the multiaddr is not self-descriptive.

Actually, I'd argue that the "slurping" model of protocol handlers (eagerly parsing the tail) makes the entire meaning of the multiaddr dependent not only on code, but on the version of the code that a node is running. If we want multiaddrs to become a standard, we need a higher degree of formality. Each component might expose a schema, and compositionality may emerge from those schemas.

So returning to the topic of semantics, as a community we ought to deeply reflect on all the possible things a multiaddr can represent: location, identity assertions, protocol layering, routing, tunneling, protocol selection, etc. Then we need to think about the combinatorics behind all of these elements.

At the end of this exercise, we might come up with an entirely different structural model to reason about and encode multiaddrs.

Many thanks for writing this up so eloquently, @alexander255. Let me just rant a little bit and dump some rambling thoughts.

Thanks! :slightly_smiling_face:

Mapping your proposal to challenges we needed to solve:

* Knowing where a component starts and where it stops => in your proposal all components are of length 1 (no params) or 2 (with params). The params are clearly demarcated by an opening and a closing char: brackets in the textual form (TBD in the binary one). We'd have to find a way to escape them though.

The proposal includes a strategy for escaping the 4 sensitive characters:

The value may contain any character other than the NUL byte, but requires escaping of the following characters using a single backward slash (`\`) if they are to appear inside the value field: opening (`(`) and closing parenthesis (`)`), the comma character (`,`) and the backward slash (`\`) itself.

I've added an example of this to the proposal:

/http(base=/endpoint\(1:2\))

The proposal does not currently include anything about the possible mandatory argument: The question how to embed paths for Unix Domain Sockets is not addressed by this proposal, but I'd imagine similar rules could be applied there.

Of course this proposal still doesn't make it possible to decapsulate protocols from the end of the stack without parsing all their preceding values.

One thing that I find unclean about the current multiaddr is that it mixes locators (IP addresses and ports), with network protocols (tcp, udp), with higher level protocols (onion, quic), with application protocols (http), with libp2p facilities (p2p-circuit), with identity assertions (peer ID).

As a result, for a given multiaddr to work you need these components to be assembled in specific recipes which might not be obvious from the get-go (e.g. /p2p-circuit/QmRelay/QmTarget). While the components are self-describing in themselves, the overall meaning of the multiaddr is not self-descriptive.

I agree with you on the last 3: Maybe MultiAddr should be limited to stream/dgram protocols only?

Including HTTP is problematic, for instance, because “establishing an HTTP connection” will not give me an “HTTP Connection”, but a TCP (or TLS) connection that I expect to be able to send HTTP messages over. Similar things apply to the libp2p facilities, which also are more assertions of the kind: After establishing a stream using the previous layers, expect to be able to send messages of type X. Of course, adding these kinds of assertions to the end of the protocol stack means they don't need to be sent out of band, which is useful if the application supports more than one endpoint protocol.

I guess it depends on what you want the addresses to represent. Does /p2p-circuit work over arbitrary stream transports? I mean I could in theory do /udp/80/http but the result is likely not what I'd want, so I don't really see something like “HTTP requires a reliable transport” as an issue. If I assemble them “wrong” I still end up with a result, but I may not be happy with the side-effects of my decisions (and call it SSDP).

Actually, I'd argue that the "slurping" model of protocol handlers (eagerly parsing the tail) makes the entire meaning of the multiaddr dependent not only on code, but on the version of the code that a node is running. If we want multiaddrs to become a standard, we need a higher degree of formality.

If by “slurping” you’re referring to what I called “recursion” in the problem section, then you have my fullest agreement. Otherwise please elaborate!

So returning to the topic of semantics, as a community we ought to deeply reflect on all the possible things a multiaddr can represent: location, identity assertions, protocol layering, routing, tunneling, protocol selection, etc. Then we need to think about the combinatorics behind all of these elements.

At the end of this exercise, we might come up with an entirely different structural model to reason about and encode multiaddrs.

A grand statement! How do we do that without all talk dwindling into nothing? :wink: (No offence, but your statement does invite such kind of “resolution” making me genuinely concerned.)

@mkg20001:

Oh and I noticed you may have misinterpreted the spec a bit: /wss/.host/example.com:4443/.base/"/api/v1"/.user/john/.password/doh/.cookie/bla=blab/tls/ws would actually be /tcp/.port/443/tls/.sni/example.com/http/.user/john/.password/doh/.cookie/bla=blab/.path/"/api/v1"/_ws. That's where _ws gets useful

The problem with that is that there actually isn't any path parameter in the sense you are using, though: The path parameter for HTTP is the base path where some HTTP content is “mounted”, not the path of any individual resource. So, if anything, the path would be more like

/tcp/.port/4443/tls/.sni/example.com/http/.user/john/.password/doh/.cookie/bla=blab/.base/"/api/v1"/_ws/.path/"/socket"

unless "/api/v1" really were your WebSocket path in addition to being your HTTP base path. All in all, I'd think that nesting WebSocket like this is more confusing than enlightening, and even if the protocols are obviously related, they are more of the “happens to look similar” kind of related (as is evidenced by the fact that there is no WebSockets over HTTP/2, for instance). (Regarding HTTP's base paths: See also my explanation of endpoint vs content on the other issue.)

Updated the proposal to include a description of a compact representation of the protocol's attributes, as well as a description of the current binary format in general.

Looking at the related issues before this comment, it appears that this proposal would solve real problems that people (not just me) have with the current state of affairs in Multiaddr. Any way to move this discussion forward?

Awesome ideas @ntninja! I just wanted to give my thoughts and ideas on this as well. Please correct me if I missed anything; I just discovered this thread recently.

Components: Just as in traditional URLs (and your proposal, @ntninja), a / introduces a new address component. A component always inherits the context of its parent since each action depends on its predecessor.

For example ip(1.2.3.4)/udp(500) instructs the reader to first create an IP context to ip 1.2.3.4 and then inside that context open a UDP connection to port 500.
Parameters: Since components can basically be thought of as function invocations, it is only natural to include parameters. This is already captured very well by your proposal: since /ip(1.2.3.4) already looks a lot like a function invocation in most programming languages, this should be a no-brainer for developers to understand. I'd just suggest a special default parameter style. Since a lot of protocols only need one parameter, we can keep it short by writing:
```
/<name>(<parameter value>)
```
And only using
```
/<name>(<key>=<value>,<key>=<value>,...)
```
for situations where one needs to clarify which parameter is which. Protocol specifications would then state whether the protocol has a default parameter and, if so, which one.

I know this makes it reliant on the version of the protocol, but if we're honest, this would not change by naming the parameter. Different versions of the same codec might still require different parameters.

Advantages

Address structure is clear at a glance: Since each / indicates a new step in the connection, it is very clear where components start and where they end. It is also clear which parts of the address are protocol steps and which are parameters. This shows the structure of the address a lot better, and it is shorter in most situations than the solution proposed by @mkg20001.
Address can easily be shortened for different display situations: Consider the following address: /dns(example.com)/tcp(80)/http(/index.html). From this address it is very clear which parts are important and which are not. A browser, for example, might choose to omit all references to the protocols and only display example.com/index.html, or on a smartphone even just example.com.

A couple examples:

/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/example.com/index.html
=>
/ip4(1.2.3.4)/tcp(443)/tls/sni(example.com)/http(host=example.com, path=/index.html)

note: I don't think the host parameter is (or should be) necessary since the hostname to connect to has already been established by a previous step ( the sni component) and is therefore superfluous information.

/tls/.sni/example.com => /tls/sni(example.com)

/ip6/fe00::32/.scope/6/tcp/80/http => /ip6(address=fe00::32, scope=6)/tcp(80)/http

/wss/.host/example.com:4443/.base/"/api/v1"/.user/john/.password/doh/.cookie/bla=blab/tls/ws 
=>
/wss(host=/example.com:4443, base="/api/v1", user=john, password=doh, cookie="bla=blab")/tls/ws

I don't know how this should be handled, but I'd say that addresses should be mostly whitespace-insensitive. This just makes them easier to read, and whitespace doesn't change the meaning anyway (except for names and stuff, I know ;) ).

Some comments:

While I like your proposal of adding protocol arguments in the way many programming languages do (/ip(1.2.3.4)), it does gloss over one important detail: It's not backward-compatible with existing addressing, so this would have to be some kind of Multiaddr v2.

In my proposal I tried to make sure that it would be as backward-compatible as possible with existing addressing data. (But it's necessarily not forward-compatible, in that new addresses following the proposed scheme would not work on older implementations.) I, personally, wouldn't be against it and there could be some kind of conversion scheme implemented, but it's probably better to keep these things somewhat separate.

/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/example.com/index.html
=>
/ip4(1.2.3.4)/tcp(443)/tls/sni(example.com)/http(host=example.com, path=/index.html)

I think your example is somewhat off here. It would be more like

/ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/example.com/index.html
=>
/ip4(1.2.3.4)/tcp(443)/tls(sni=example.com)/http(host=example.com, path=/index.html)

I do dislike the fact that parameters can be given as both positional and keyword arguments in your proposal. IMHO, it should be either one or the other. For instance:

/ip6/fe00::32/.scope/6/tcp/80/http => /ip6(fe00::32, scope=6)/tcp(80)/http

This should make the string representation non-ambiguous, except for ordering of keyword arguments and spacing (if we allow that). In particular, it means that the canonical string representation can always be constructed even if not all parts of the address are known. And the SNI value would be optional and default to “something sensible”: in this case either the preceding IP address or the following hostname.

(Of course, none of this actually matters as Protocol Labs just ignores pretty much everything MultiFormats unless they need a change for themselves.)

For the brave soul that picks this up at some point, food for thought: alternative, concise notation based on Matrix URIs described in https://www.w3.org/DesignIssues/MatrixURIs.html

/ip4/127.0.0.1/tcp/8080/tls;sni=example.com/
/ip4/127.0.0.1/tcp/8080/http;hostname=example.com;base=/api/v1;user=john;password=doh;cookie=bla/
/ip4/127.0.0.1/tcp/8080/tls;sni=example.com;/http;hostname=something-else.com

multiformats / multiaddr

Proposal: Add keyword arguments to protocols #87