multiformats / multiaddr

Composable and future-proof network addresses
https://multiformats.io/multiaddr
MIT License

multiaddr

Introduction

Multiaddr aims to make network addresses future-proof, composable, and efficient.

Current addressing schemes have a number of problems.

  1. They hinder protocol migrations and interoperability between protocols.
  2. They don't compose well. There are plenty of X-over-Y constructions, but only a few of them can be addressed with a classic URI/URL or host:port scheme.
  3. They don't multiplex: they address ports, not processes.
  4. They're implicit, in that they presume out-of-band values and context.
  5. They don't have efficient machine-readable representations.

Multiaddr solves these problems by modelling network addresses as arbitrary encapsulations of protocols.

Multiaddr was originally thought up by @jbenet.

Interpreting multiaddrs

Multiaddrs are parsed from left to right, but they should be interpreted from right to left. Each component of a multiaddr wraps all the components to its left in its context. For example, the multiaddr /dns4/example.com/tcp/1234/tls/ws/tls (ignore the double encryption for now) is interpreted by taking the first tls component from the right and treating it as the libp2p security protocol to use for the connection. The rest of the multiaddr is then passed to the websocket transport to create the websocket connection. The websocket transport sees /dns4/example.com/tcp/1234/tls/ws/ and interprets the tls in this context to mean that this will be a secure websocket connection. It also takes the host to dial and the TCP port from the rest of the multiaddr.
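The parse-left, interpret-right rule can be sketched in a few lines of Python. This is a minimal illustration, not the behavior of any particular multiaddr library: the TAKES_VALUE table and the parse and interpret helpers are hypothetical names, and the table covers only the protocols used in this README's examples.

```python
# Hypothetical protocol table for this sketch: True means the protocol
# name is followed by one value segment. A real implementation would
# derive this from protocols.csv.
TAKES_VALUE = {
    "ip4": True, "dns4": True, "tcp": True, "p2p": True,
    "tls": False, "ws": False, "p2p-circuit": False,
}


def parse(addr):
    """Parse a multiaddr string left to right into (protocol, value) pairs."""
    if not addr.startswith("/"):
        raise ValueError("a multiaddr must start with '/'")
    # Dropping empty tokens tolerates the leading and any trailing slash.
    tokens = [t for t in addr.split("/") if t]
    components = []
    i = 0
    while i < len(tokens):
        name = tokens[i]
        if name not in TAKES_VALUE:
            raise ValueError(f"unknown protocol: {name}")
        if TAKES_VALUE[name]:
            if i + 1 >= len(tokens):
                raise ValueError(f"missing value for {name}")
            components.append((name, tokens[i + 1]))
            i += 2
        else:
            components.append((name, None))
            i += 1
    return components


def interpret(addr):
    """Yield (component, components to its left), walking right to left."""
    components = parse(addr)
    while components:
        yield components[-1], components[:-1]
        components = components[:-1]


steps = list(interpret("/dns4/example.com/tcp/1234/tls/ws/tls"))
outer, rest = steps[0]
# outer is ("tls", None): the security protocol for the connection.
# rest ends with ("ws", None): what the websocket transport receives.
```

Each step hands the rightmost component the list of everything to its left, which mirrors how the websocket transport in the example sees the inner tls component in its own context.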

Components to the right can also provide parameters to components to the left, since they are in charge of interpreting the rest of the multiaddr. For example, in /ip4/1.2.3.4/tcp/1234/tls/p2p/QmFoo, the p2p component carries the peer id and passes it to the next component, in this case the tls security protocol, as the expected peer id for this connection. Another example is /ip4/.../p2p/QmR/p2p-circuit/p2p/QmA. Here p2p/QmA is passed to p2p-circuit, and the p2p-circuit component then knows to use the rest of the multiaddr as the information needed to connect to the relay node.
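A rough sketch of this parameter passing, working on already-parsed (protocol, value) pairs. The helper names take_peer_id and split_at_circuit are hypothetical, and the relayed address uses an illustrative IP and port (192.0.2.1, 4001) that do not come from this README.

```python
def take_peer_id(components):
    """Pop a trailing p2p component; return (peer_id, remaining components)."""
    if not components or components[-1][0] != "p2p":
        raise ValueError("address does not end in a p2p component")
    return components[-1][1], components[:-1]


def split_at_circuit(components):
    """Split at the rightmost p2p-circuit into (relay part, target part)."""
    for i in range(len(components) - 1, -1, -1):
        if components[i][0] == "p2p-circuit":
            return components[:i], components[i + 1:]
    raise ValueError("no p2p-circuit component found")


# /ip4/1.2.3.4/tcp/1234/tls/p2p/QmFoo
direct = [("ip4", "1.2.3.4"), ("tcp", "1234"), ("tls", None), ("p2p", "QmFoo")]
peer_id, rest = take_peer_id(direct)
# peer_id is handed to the component now at the right end of rest (tls)
# as the peer id it should expect.

# Illustrative relayed address (IP and port are assumptions of this sketch):
# /ip4/192.0.2.1/tcp/4001/p2p/QmR/p2p-circuit/p2p/QmA
relayed = [("ip4", "192.0.2.1"), ("tcp", "4001"), ("p2p", "QmR"),
           ("p2p-circuit", None), ("p2p", "QmA")]
relay, target = split_at_circuit(relayed)
# relay describes how to reach the relay node; target names the peer
# to reach through it.
```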

This enables nesting and arbitrary parameters. A component can parse arbitrary data with some encoding and pass it as a parameter to the next component of the multiaddr. For example, we could reference a specific HTTP path by composing path and percentencode components along with an http component. This would look like /dns4/example.com/http/GET/path/percentencode/somepath%2ftosomething. The percentencode component parses the data and passes it as a parameter to path, which passes it as a named parameter (path=somepath/tosomething) to a GET request. A user who does not like percentencode for their use case may prefer lenprefixencode, in which case the multiaddr would instead look like /dns4/example.com/http/GET/path/lenprefixencode/20_somepath/tosomething. This would work the same way and require no changes to the path or GET components. It's important to note that the binary representation of the data in percentencode and lenprefixencode would be the same. The only difference is how it appears in the human-readable representation.
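To make the last point concrete, here is a small sketch in which both human-readable forms decode to the same bytes. percentencode and lenprefixencode are the illustrative components from the paragraph above, and this sketch assumes that the number before the underscore is the payload length in bytes; the function names are hypothetical.

```python
from urllib.parse import unquote_to_bytes


def decode_percent(text):
    """Decode a percent-encoded value into raw bytes."""
    return unquote_to_bytes(text)


def decode_lenprefix(text):
    """Decode '<length>_<payload>' into raw bytes, checking the length."""
    length_str, sep, payload = text.partition("_")
    if not sep or not (length_str.isascii() and length_str.isdigit()):
        raise ValueError("expected '<length>_<payload>'")
    data = payload.encode("utf-8")
    if len(data) != int(length_str):
        raise ValueError("length prefix does not match payload")
    return data


a = decode_percent("somepath%2ftosomething")
b = decode_lenprefix("20_somepath/tosomething")
# Both yield b"somepath/tosomething", so the path component receives
# the same parameter either way.
```

Note that the length-prefixed value contains a slash, so a parser for that form would need to use the prefix to decide how many characters to consume rather than splitting on slashes.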

Use cases

Encapsulation based on context

Although multiaddrs are self-describing, it's possible to further encapsulate them based on context. For example, in a web browser it's obvious that, given a hostname, HTTP should be spoken. The specifics of this HTTP connection are not important (except maybe the use of TLS) and will be derived from the browser's capabilities and configuration.

  1. example.com/index.html
  2. /http/example.com/index.html
  3. /tls/sni/example.com/http/example.com/index.html
  4. /dns4/example.com/tcp/443/tls/sni/example.com/http/example.com/index.html
  5. /ip4/1.2.3.4/tcp/443/tls/sni/example.com/http/example.com/index.html

The resulting layers of encapsulation reflect exactly how the bidirectional stream between client and server is constructed.
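The list above can be reproduced by wrapping the address one layer at a time. This is only a string-level sketch; the encapsulate helper is a hypothetical name, and the addresses are the ones from the list.

```python
def encapsulate(lower, upper):
    """Place the lower-layer address in front of the upper-layer one."""
    return lower.rstrip("/") + "/" + upper.lstrip("/")


resource = "example.com/index.html"
http = encapsulate("/http", resource)
tls = encapsulate("/tls/sni/example.com", http)
by_name = encapsulate("/dns4/example.com/tcp/443", tls)
# Resolving the name swaps the dns4 component for an ip4 one;
# everything above the TCP layer stays unchanged.
by_ip = encapsulate("/ip4/1.2.3.4/tcp/443", tls)
```

Each call adds one lower layer on the left, so the final string reads in the order in which the stream is built: network address, transport, TLS, then HTTP.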

Now you can imagine how, depending on the browser's configuration, the multiaddr might look different. For example, you could use HTTP proxying or SOCKS proxying, or use domain fronting to evade censorship. This kind of proxying is of course possible without multiaddr, but only with multiaddr do we have a way of consistently addressing these networking constructions.

Specification

Multiaddr and all other multiformats use unsigned varints (uvarint). Read more about them in multiformats/unsigned-varint.
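For readers unfamiliar with the format, a minimal sketch of uvarint encoding and decoding follows: each byte carries 7 bits of the number, least significant group first, and the high bit signals that more bytes follow. The function names are hypothetical, and the sketch does not enforce the size limit or minimal-encoding rules described in multiformats/unsigned-varint.

```python
def encode_uvarint(n):
    """Encode a non-negative integer as an unsigned varint."""
    if n < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = n & 0x7F          # lowest 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte, high bit clear
            return bytes(out)


def decode_uvarint(buf, offset=0):
    """Decode a uvarint from buf; return (value, offset after it)."""
    result = 0
    shift = 0
    for i in range(offset, len(buf)):
        b = buf[i]
        result |= (b & 0x7F) << shift
        if not (b & 0x80):
            return result, i + 1
        shift += 7
    raise ValueError("truncated uvarint")


encoded = encode_uvarint(300)       # two bytes: 0xac 0x02
value, end = decode_uvarint(encoded)
```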

Encoding

TODO: specify the encoding (byte-array to string) procedure

Decoding

TODO: specify the decoding (string to byte-array) procedure

Protocols

See protocols.csv for a list of protocol codes and names, and protocols/ for specifications of the currently supported protocols.
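As an illustration of how the protocol codes are used, the sketch below builds a binary form for an ip4-over-tcp address. The procedure is not specified in this document, so the layout here is an assumption based on common implementations: each component is its uvarint protocol code followed by its value bytes, with ip4 as 4 bytes and tcp as a 2-byte big-endian port. The codes 4 (ip4) and 6 (tcp) are taken to be the ones listed in protocols.csv, and the function names are hypothetical.

```python
import ipaddress
import struct

# Assumed to match protocols.csv; only two protocols are covered here.
CODES = {"ip4": 4, "tcp": 6}


def uvarint(n):
    """Encode a non-negative integer as an unsigned varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)
        else:
            out.append(byte)
            return bytes(out)


def encode_ip4_tcp(ip, port):
    """Encode /ip4/<ip>/tcp/<port> under the layout assumed above."""
    return (
        uvarint(CODES["ip4"]) + ipaddress.IPv4Address(ip).packed
        + uvarint(CODES["tcp"]) + struct.pack(">H", port)
    )


binary = encode_ip4_tcp("127.0.0.1", 1234)
# Eight bytes in total, compared with 23 characters for the string
# form /ip4/127.0.0.1/tcp/1234.
```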

TODO: most of these are way underspecified

Implementations

TODO: reconsider these alpha/beta/stable labels

Contribute

Contributions welcome. Please check out the issues.

Check out our contributing document for more information on how we work, and about contributing in general. Please be aware that all interactions related to multiformats are subject to the IPFS Code of Conduct.

Small note: If editing the README, please conform to the standard-readme specification.

License

This repository is only for documents. All of these are licensed under the CC-BY-SA 3.0 license, © 2016 Protocol Labs Inc. Any code is under the MIT license, © 2016 Protocol Labs Inc.