Meta Capabilities

zsfelfoldi commented 5 years ago

Abstract

This document proposes a standardized handshake and message/metadata format for peer-to-peer protocols. Implementing this standard in Ethereum-related protocols would make it easy to share mechanisms like flow control or micropayment incentivization. It could also replace devp2p protocol multiplexing, which separates protocols completely, and instead handle different protocols as not necessarily disjoint sets of supported messages or message/metadata combinations.

Motivation

The following ideas have led me to the current proposal:

- Abstract out the flow control mechanism from LES and add it as an optional and replaceable "hint layer" to messages. This would make it easier to experiment with different versions of the mechanism and to implement simple clients, since the whole thing becomes an optional performance enhancement rather than something mandatory to implement.
- Add request IDs and (some version of) the LES flow control mechanism to ETH/64. This would make sense because syncing (especially fast sync) relies heavily on pulling data in one direction and therefore faces similar challenges. Adding the current strict and brittle version of the flow control to the main Ethereum wire protocol would probably be dangerous, so an optional performance-optimizing hint mechanism would suit ETH/64 better too.
- Unify LES and ETH/64. Since these protocols share most of their messages and we are thinking about adding LES-specific things to ETH anyway, it sounds sensible to have a single protocol. On the other hand, these two modes of operation may require different prioritization and load balancing methods. Also, ETH is more mission critical while LES is still being heavily developed. Having a common protocol framework would allow sharing some messages, some mechanisms and some code in the implementations, but also allow different strategies (both in operation and in development).
- Serve Ethereum chain/state data from Swarm. Splitting chain data and state snapshots into smaller parts and storing them in the Swarm topology would theoretically make sense and could help with some use cases. Supporting the proposed protocol framework could also make it easier to realize interoperation between the two protocols.
- Micropayment incentivization. This feature will need some extra metainformation, which could be added to LES in some specific form, but it would be better to make it as flexible as possible. It should be easy to experiment with different payment methods and pricing policies because markets are not designed on a drawing board. Also, other protocols could share the same mechanisms.
- Moving towards Serenity and sharding. Serenity clients will also serve different roles and require either different protocols or (ideally) overlapping subsets of the same protocol which can still be developed more or less separately. These protocols could probably also benefit from the previously mentioned mechanisms. Sharing a common protocol framework (with a replaceable serialization format and underlying transport layer if necessary) could allow Serenity to benefit from existing infrastructure parts and make the transition easier.

Specification

Message format

Instead of the usual messageCode, messageData format, we allow optional metadata in a messageCode, messageData, [[metaCode, metaData], ...] format.

messageCode: a non-negative integer indicating the message type and format. Mapping between message codes and protocol messages is decided during handshake.
messageData: a single serialized object with a format defined in the relevant protocol specification.
a list of metadata fields:
- metaCode: a non-negative integer indicating the metadata type and format. Mapping between meta codes and types of metadata is decided during handshake.
- metaData: a single serialized object with a format defined in the relevant metadata specification.
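
As a rough illustration of the layout described above, the following sketch models a message as a nested list. It assumes plain Python lists stand in for the wire serialization (the proposal does not fix one) and that a missing metadata list means "no metadata". The names `Message`, `to_wire` and `from_wire` are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass, field
from typing import Any, List, Tuple


@dataclass
class Message:
    code: int                      # messageCode, assigned during handshake
    data: Any                      # messageData, protocol-specific payload
    meta: List[Tuple[int, Any]] = field(default_factory=list)  # (metaCode, metaData) pairs


def to_wire(msg: Message) -> list:
    """Lay the message out as [messageCode, messageData, [[metaCode, metaData], ...]]."""
    if msg.code < 0 or any(code < 0 for code, _ in msg.meta):
        raise ValueError("message and meta codes must be non-negative")
    return [msg.code, msg.data, [[code, value] for code, value in msg.meta]]


def from_wire(wire: list) -> Message:
    """Inverse of to_wire; assumption: a missing metadata list is treated as empty."""
    code, data = wire[0], wire[1]
    meta = [(c, v) for c, v in wire[2]] if len(wire) > 2 else []
    return Message(code, data, meta)


# A request with messageCode 2 carrying one metadata field (metaCode 0, value 7).
wire = to_wire(Message(2, ["0xabc", 10], [(0, 7)]))
print(wire)  # [2, ['0xabc', 10], [[0, 7]]]
```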

The only predefined message format is the announcement message with messageCode zero. This message contains a set of string to serialized object mappings and can be used for communicating available messages and metadata, their mapping, and any kind of protocol-specific parameters. Protocol handlers and other agents should "listen" on certain string prefixes and it is up to them to decide whether an announcement was meaningful and useful. Sending an excessive amount of meaningless or unnecessary announcements can result in disconnection.

The set of announcements is encoded in a tree format:

messageData = [name/prefix, [subTree1, ...], value]

where subtrees are encoded similarly.
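
One possible reading of the tree encoding is sketched below. It assumes that a node's full name is its parent's prefix directly concatenated with its own name, that an inner node without a value only serves as a shared prefix, and that a leaf always counts as an announcement even when its value is empty. None of these rules is stated in the proposal, and `flatten` is a hypothetical helper.

```python
from typing import Any, Dict


def flatten(node: list, prefix: str = "") -> Dict[str, Any]:
    """Expand [name, [subTree, ...], value] into a flat {full_name: value} dict."""
    name, subtrees, value = node
    full_name = prefix + name      # assumption: names are joined by plain concatenation
    result: Dict[str, Any] = {}
    # Assumption: leaves are always announcements; inner nodes only if they carry a value.
    if value is not None or not subtrees:
        result[full_name] = value
    for sub in subtrees:
        result.update(flatten(sub, full_name))
    return result


tree = ["map/", [
    ["meta/", [["reqId", [], 0], ["flowcontrol", [], 1]], None],
    ["message/les/", [["headers", [], [3, 0, 1]]], None],
], None]

for key, value in flatten(tree).items():
    print(key, "->", value)
```

Grouping announcements under a shared prefix this way keeps long lists compact, since the common part of each string is sent only once.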

All other messages are protocol specific. Metadata can be attached to any protocol message if it is understood and accepted by the recipient. Protocol message specifications and metadata specifications do not need to be aware of each other; it is up to the two actors on each end of the line to decide which combinations of message and metadata types are meaningful and useful for them.

Handshake process

Supported message types are identified by strings in the namespace under the "message/" prefix. Similarly, supported metadata fields are identified by strings with the "meta/" prefix. During protocol negotiation, messageCode and metaCode values are mapped to those message and metadata types that one peer is willing to send and the other is willing to receive.

Instead of listing all capabilities of all supported protocols, the two parties first signal their intent to establish communication through a protocol, optionally also stating their intended role. For example:

Alice to Bob: "connect/les/client"
Bob to Alice: "connect/les/server"

After they agree to speak LES, they list the message and metadata formats they are capable of sending, including the types of metadata they can attach to each of the messages (see the "Extension examples" section below). Protocol parameters can also be exchanged here. (Note that only a few messages and parameters of LES are shown here.)

Alice to Bob:
- "meta/reqId"
- "message/les/getHeaders" -> ["reqId"]
- "message/les/getBlocks" -> ["reqId"]
- "message/les/getProofs" -> ["reqId"]
- "param/les/genesisHash" -> genesisHash
Bob to Alice:
- "meta/reqId"
- "meta/flowcontrol"
- "message/les/headers" -> ["reqId", "flowcontrol"]
- "message/les/blocks" -> ["reqId", "flowcontrol"]
- "message/les/proofs" -> ["reqId", "flowcontrol"]
- "message/overload/stop"
- "message/overload/resume" -> ["flowcontrol"]
- "param/les/genesisHash" -> genesisHash
- "param/flowcontrol/bufferLimit" -> bufferLimit
- "param/flowcontrol/minRecharge" -> minRecharge

Finally, they list the message and metadata types they are willing to receive and their code mappings:

Alice to Bob:
- "map/meta/reqId" -> 0
- "map/meta/flowcontrol" -> 1
- "map/message/les/headers" -> [3, 0, 1]
- "map/message/les/blocks" -> [5, 0, 1]
- "map/message/les/proofs" -> [16, 0, 1]
- "map/message/overload/stop" -> [22]
- "map/message/overload/resume" -> [23, 1]
Bob to Alice:
- "map/meta/reqId" -> 0
- "map/message/les/getHeaders" -> [2, 0]
- "map/message/les/getBlocks" -> [4, 0]
- "map/message/les/getProofs" -> [15, 0]

Note: the first integer in the message mapping list is the desired messageCode for the message itself, the subsequent integers are the metaCode identifiers of the metadata types to be attached to the given message type.
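
The mapping step can be sketched as follows: the receiver intersects what the peer offered to send with what it is willing to accept and assigns its own codes. This is only an illustration under assumptions. The function `build_mappings` and its dictionary-based inputs are hypothetical, and the proposal does not say how unsupported messages or metadata should be handled; here they are silently left unmapped.

```python
from typing import Dict, List


def build_mappings(
    offered: Dict[str, List[str]],   # peer's message name -> metadata names it can attach
    message_codes: Dict[str, int],   # messages we are willing to receive, with our codes
    meta_codes: Dict[str, int],      # metadata we understand, with our codes
) -> Dict[str, object]:
    """Return the "map/..." announcements for everything both sides agree on."""
    mappings: Dict[str, object] = {}
    used_meta = set()
    for name, metas in offered.items():
        if name not in message_codes:
            continue                 # assumption: unsupported messages are simply not mapped
        accepted = [m for m in metas if m in meta_codes]
        used_meta.update(accepted)
        # First integer is the messageCode, the rest are the metaCodes to attach.
        mappings["map/message/" + name] = [message_codes[name]] + [meta_codes[m] for m in accepted]
    for m in sorted(used_meta, key=meta_codes.get):
        mappings["map/meta/" + m] = meta_codes[m]
    return mappings


# Bob's offer and Alice's code choices, taken from the example above.
bob_offers = {
    "les/headers": ["reqId", "flowcontrol"],
    "les/blocks": ["reqId", "flowcontrol"],
    "les/proofs": ["reqId", "flowcontrol"],
    "overload/stop": [],
    "overload/resume": ["flowcontrol"],
}
alice_messages = {"les/headers": 3, "les/blocks": 5, "les/proofs": 16,
                  "overload/stop": 22, "overload/resume": 23}
alice_meta = {"reqId": 0, "flowcontrol": 1}

for key, value in build_mappings(bob_offers, alice_messages, alice_meta).items():
    print(key, "->", value)
```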

Mapping messages allows the other side to send those messages and can therefore be considered the end of the handshake process. Still, further announcements may be sent to update parameters (or even do further mappings) if the protocol specification and implementation allow that. In order to avoid an excessive burden on all implementers of a protocol, such updates and late mappings should only be allowed by the protocol spec if they are needed for some use case. If the handshake is restricted to the above format and there is a fixed set of supported and required messages and metadata, then migrating an existing protocol to this standard should be easy.

Extension examples

An extension can include additional messages and metadata formats. Extensions may be applied to multiple protocols and they can be either optional or a requirement for connecting with certain clients. A few examples of LES mechanisms implemented as extensions:

Request ID

ReqID is a numeric field in request/reply type messages where the reply message simply mirrors the value received in the request. It was first added to LES as a fixed part of the message format, which is an option for other protocols too, but it can also be implemented as a metadata extension containing a single numeric value:

"meta/reqId": reqId

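A minimal sketch of how a client might use a mirrored reqId to pair replies with outstanding requests is shown here. The `RequestTracker` class and the `serve` stand-in are hypothetical; the proposal only specifies that the reply mirrors the value from the request.

```python
import itertools
from typing import Any, Callable, Dict, Tuple


class RequestTracker:
    """Client-side bookkeeping that pairs replies with requests via a mirrored reqId."""

    def __init__(self) -> None:
        self._next_id = itertools.count(1)
        self._pending: Dict[int, Callable[[Any], None]] = {}

    def new_request(self, on_reply: Callable[[Any], None]) -> int:
        """Allocate a fresh reqId and remember who is waiting for the reply."""
        req_id = next(self._next_id)
        self._pending[req_id] = on_reply
        return req_id

    def handle_reply(self, req_id: int, payload: Any) -> bool:
        """Dispatch a reply; returns False for an unknown or already answered reqId."""
        callback = self._pending.pop(req_id, None)
        if callback is None:
            return False
        callback(payload)
        return True


def serve(req_id: int, request: Any) -> Tuple[int, str]:
    # The server only mirrors the reqId; the payload here is a stand-in.
    return req_id, f"reply to {request}"


tracker = RequestTracker()
received = []
rid = tracker.new_request(received.append)
tracker.handle_reply(*serve(rid, "getHeaders"))
print(received)  # ['reply to getHeaders']
```
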
LES-style client side flow control

LES flow control provides a feedback mechanism for clients to avoid server overload and ensure quick responses.

"param/flowcontrol/messageCost/les/headers": [baseCost, perItemCost]
"param/flowcontrol/bufferLimit": bufferLimit
"param/flowcontrol/minRecharge": minRecharge
"meta/flowcontrol": bufferValue

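The parameters above could feed a client-side buffer estimate along the following lines. This is a simplified sketch, not the LES algorithm itself: it assumes a request costs baseCost + perItemCost * items, that the buffer recharges linearly at minRecharge units per second up to bufferLimit, and that the bufferValue in a reply simply replaces the local estimate (requests still in flight are ignored). The class and method names are hypothetical.

```python
from typing import Dict, Tuple


class ClientFlowControl:
    def __init__(self, buffer_limit: float, min_recharge: float,
                 costs: Dict[str, Tuple[float, float]], now: float = 0.0) -> None:
        self.buffer_limit = buffer_limit
        self.min_recharge = min_recharge      # assumed unit: buffer units per second
        self.costs = costs                    # message name -> (baseCost, perItemCost)
        self.buffer = buffer_limit            # assumption: the buffer starts full
        self.last_update = now

    def _recharge(self, now: float) -> None:
        elapsed = now - self.last_update
        self.buffer = min(self.buffer_limit, self.buffer + elapsed * self.min_recharge)
        self.last_update = now

    def cost(self, message: str, items: int) -> float:
        base, per_item = self.costs[message]
        return base + per_item * items

    def can_send(self, message: str, items: int, now: float) -> bool:
        """True if the estimated buffer covers the request at time `now`."""
        self._recharge(now)
        return self.cost(message, items) <= self.buffer

    def sent(self, message: str, items: int, now: float) -> None:
        self._recharge(now)
        self.buffer -= self.cost(message, items)

    def on_reply(self, buffer_value: float, now: float) -> None:
        # The server-reported "meta/flowcontrol" value overrides the local estimate.
        self.buffer = min(self.buffer_limit, buffer_value)
        self.last_update = now


fc = ClientFlowControl(1000, 100, {"les/headers": (50, 10)})
fc.sent("les/headers", 20, now=0.0)               # costs 50 + 10 * 20 = 250
print(fc.buffer)                                  # 750
print(fc.can_send("les/headers", 80, now=1.0))    # True: 850 available after 1 s
```
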
Overload protection

This is something soon to be implemented in LES. Currently the flow control system instantly drops a connection when the buffer is exhausted. This is a very strict policy, but it still cannot always avoid transient server faults due to external circumstances, in which case the situation can again only be remedied by dropping clients. Instead of instant disconnection, LES will support "freezing" the connection for a few seconds and use the flow control feedback as a "hint layer" intended to avoid freezing or to make its occurrence sufficiently rare. It will also allow very simple client implementations to skip flow control entirely, a step in the direction of the proposed modular protocol framework.

"message/overload/stop": reqId (meaning: please stop sending messages now and do not expect a reply for your message reqId or any subsequent ones)
"message/overload/resume" (meaning: now you can start sending messages again)

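On the client side, the stop/resume pair could be handled with a small gate like the one sketched below. It assumes requests are answered in the order they were sent, so "reqId or any subsequent ones" means the named request plus everything sent after it. `OverloadGate` and its methods are hypothetical names.

```python
from typing import List


class OverloadGate:
    """Tracks whether the server has frozen the connection and which requests were dropped."""

    def __init__(self) -> None:
        self.frozen = False
        self.in_flight: List[int] = []   # reqIds in the order they were sent

    def try_send(self, req_id: int) -> bool:
        """Record a request as sent, or refuse while the connection is frozen."""
        if self.frozen:
            return False
        self.in_flight.append(req_id)
        return True

    def on_reply(self, req_id: int) -> None:
        if req_id in self.in_flight:
            self.in_flight.remove(req_id)

    def on_stop(self, req_id: int) -> List[int]:
        """Handle "message/overload/stop": freeze and return the reqIds that get no reply."""
        self.frozen = True
        if req_id not in self.in_flight:
            return []
        index = self.in_flight.index(req_id)
        dropped = self.in_flight[index:]
        self.in_flight = self.in_flight[:index]
        return dropped

    def on_resume(self) -> None:
        """Handle "message/overload/resume": sending is allowed again."""
        self.frozen = False


gate = OverloadGate()
for rid in (1, 2, 3):
    gate.try_send(rid)
gate.on_reply(1)
print(gate.on_stop(2))    # [2, 3]
print(gate.try_send(4))   # False
gate.on_resume()
print(gate.try_send(4))   # True
```

The dropped reqIds returned by `on_stop` can be retried after the resume message or redirected to another server.
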
FrankSzendzielarz commented 5 years ago

Uff! Again, this is a dense read and I think at these early stages of conceptualization it may be more important to communicate the purpose and idea, rather than go straight to formality. Really, one of the main purposes/goals/ambitions of these documents is to help attract stakeholders into the conversation! Formal specs may be counterproductive.

I did have to re-read this several times to "get it" and in the end it's difficult to comment on as a whole because it addresses several different things, some of the methods I think being great ideas, some not. It might be helpful to somehow split out each "requirement" (a motivation in the list of motivations) and explain how each one is addressed and why. (Just a thought)

I will have to spend more time absorbing this, but so far, for what it is worth, this is my feedback:

1) The handshake seems to suggest that peers list the messages they are willing to exchange, but this seems to assume Request-Reply semantics in all cases (by 'semantics' I mean protocol semantics: an explanation of what each message means in terms of how the participant should respond). Conversational message exchange patterns may become more complex and may involve more participants? (We know this to be the case with Discovery, for example, where a peer may call FindNeighbors expecting a Neighbors response, but receive a WhoAreYou response instead if the peer is unknown, and is then expected to provide an IAm message before getting the Neighbors response, e.g. FindNeighbors-> ... <-WhoAreYou ... ->IAm ... <-Neighbors)

2) For the flow control and rate limiting message set, I am just about to start extending the server rate limiting feedback issue with some concrete ideas. After that's done it will be easier for us to unify these ideas and converge on something more concrete.

3) I think we need some sort of list of meta-goals placed somewhere. Aside from 'make it easy for light clients to find and consume light services', we also need something along the lines of 'avoid unintentional centralization by making the protocols self-policing' (to avoid blacklists/whitelists). For example, what if a server advertises a rate limiting policy or daily quota policy, but then behaves completely differently? I.e. what if the hints it transmits are bait? It's the same with the current LES flow control set-up. There's no way the client can know that what the server says is correct. The misbehavior on the server's side could be because the implementation is flawed, or perhaps it's a honey trap to destroy a light network. Is there a way to avoid these scenarios?

fjl commented 5 years ago

Re. "there is no way the client can know that what the server says is correct"

That's not true. For LES, the server either provides service or it doesn't. If the server doesn't provide according to flow control parameters the client can switch servers. This is exactly why client-side flow control is needed: to verify and anticipate server behavior.

FrankSzendzielarz commented 5 years ago

"That's not true" What I mean is that the server can say "I will offer you a rate limit of X and a quota of Y" but then renege on that promise. This allows a situation where, for whatever reason, a large number of servers improperly implement the flow control and encourage light clients to maintain whitelists of servers that don't force repeated reconnects.

"This is exactly why client-side flow control is needed: to verify and anticipate server behavior." This is not true. A server may provide messages along the lines of "your quota is about to be exceeded" or "server temporarily unavailable" because of capacity allocated to higher priority clients (e.g. paid) or because of temporary server degradation, maintenance or whatever. The client does not strictly need the information on server-side [token-bucket] status, but I do propose that this information could be made available, in addition to these other warnings and hints, to make it easier for clients to decide what to do. What clients should not be encouraged to do is trust these hints on flow control, as there is no way a client can know that what the server promises is trustworthy.

ethereum / devp2p

Meta Capabilities #70
