polkadot-fellows / xcm-format

Polkadot Cross Consensus-system Message format.
Apache License 2.0

Polkadot Cross-Consensus Message (XCM) Format

Version 4. Authors: Gavin Wood.

This document details the message format for Polkadot-based message passing between chains. It describes the formal data format, any environmental data which may be additionally required and the corresponding meaning of the datagrams.

1 Background

There are several kinds of Consensus Systems for which it would be advantageous to facilitate communication. This includes messages between smart-contracts and their environment, messages between sovereign blockchains over bridges, and between shards governed by the same consensus. Unfortunately, each tends to have its own message-passing means and standards, or have no standards at all.

XCM aims to abstract the typical message intentions across these systems and provide a basic framework for forward-compatible, extensible and practical communication datagrams facilitating typical interactions between disparate datasystems within the world of global consensus.

Concepts from the IPFS project, particularly the idea of self-describing formats, are used throughout and two new self-describing formats are introduced for specifying assets (based around Asset) and consensus-system locations (based around Location).

Polkadot has three main transport systems for passing messages between chains all of which will use this format: XCMP (sometimes known as HRMP) together with the two kinds of VMP: UMP and DMP.

1.1 XCM Communication Model

XCM is designed around four 'A's:

The fact that XCM gives these Absolute guarantees allows it to be practically Asymmetric whereas other non-Absolute protocols would find this difficult.

Being Agnostic means that XCM is not simply for messages between parachain(s) and/or the Relay-chain, but rather that XCM is suitable for messages between disparate chains connected through one or more bridge(s) and even for messages between smart-contracts. Using XCM, all of the above may communicate with, or through, each other.

E.g. It is entirely conceivable that, using XCM, a smart contract, hosted on a Polkadot parachain, may transfer a non-fungible asset it owns through Polkadot to an Ethereum-mainnet bridge located on another parachain, into an account controlled on the Ethereum mainnet by registering the transfer of ownership on a third, specialized Substrate NFA chain hosted on a Kusama parachain via a Polkadot-Kusama bridge.

1.2 The XCVM

The XCM format in large part draws upon a highly domain-specific virtual machine, called the Cross-Consensus Virtual Machine or XCVM. XCM messages correspond directly to version-aware XCVM programmes, and the XCVM instruction set represents the repertoire of actions from which an XCM message may be composed.

The XCVM is a register-based machine, none of whose registers are general purpose. The XCVM instruction format, set of machine registers and definitions of interaction therefore comprise the bulk of the XCM message format and most of this document's text is taken up with expressing those definitions.

1.3 Vocabulary

1.4 Document Structure

The format is defined in five main parts. The top-level datagram formats are specified in section 2. The XCVM is defined in sections 3, 4 and 5. The Location and Asset formats are defined in sections 6 and 7. Example messages are specified in section 8.

1.5 Encoding

All data is SCALE encoded. Elementary types are expressed in Rust-language format, for example:

When a bulleted list of types, possibly named, is given, it implies a simple concatenation of individual typed values. For example, a 64-bit unsigned "index" followed by a list of bytes of "data" could be written as:

- index: u64
- data: Vec<u8>
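As a rough illustration of such a concatenation, the sketch below hand-encodes the "index"/"data" pair using only the Rust standard library. The helper names `compact` and `encode_record` are invented for this example; a real implementation would normally rely on a SCALE codec library rather than hand-rolled code.

```rust
/// SCALE "compact" encoding of an unsigned integer, used here for the
/// length prefix of a byte list. Hypothetical helper for illustration.
fn compact(n: u64) -> Vec<u8> {
    if n < (1 << 6) {
        // Single-byte mode: value in the upper six bits, mode bits 0b00.
        vec![(n as u8) << 2]
    } else if n < (1 << 14) {
        // Two-byte mode: mode bits 0b01, little-endian.
        (((n as u16) << 2) | 0b01).to_le_bytes().to_vec()
    } else if n < (1 << 30) {
        // Four-byte mode: mode bits 0b10, little-endian.
        (((n as u32) << 2) | 0b10).to_le_bytes().to_vec()
    } else {
        // Big-integer mode: the header byte holds (byte count - 4) and mode
        // bits 0b11, followed by the minimal little-endian bytes.
        let len = 8 - (n.leading_zeros() / 8) as usize;
        let mut out = vec![(((len - 4) as u8) << 2) | 0b11];
        out.extend_from_slice(&n.to_le_bytes()[..len]);
        out
    }
}

/// Concatenation of `index: u64` and `data: Vec<u8>`.
fn encode_record(index: u64, data: &[u8]) -> Vec<u8> {
    // A fixed-width integer is written as little-endian bytes.
    let mut out = index.to_le_bytes().to_vec();
    // A byte list is written as its compact length followed by its items.
    out.extend(compact(data.len() as u64));
    out.extend_from_slice(data);
    out
}
```

With these helpers, `encode_record(1, &[0xaa, 0xbb])` yields the eight little-endian bytes of the index, then the length prefix `0x08`, then the two data bytes.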

2 Basic Top-level Format

We name the top-level XCM datatype VersionedXcm. This is defined thus:

This document defines only XCM messages whose version identifier is 2. Messages whose version is lower than this may be defined in earlier versions of this document.

Given that the version of XCM defined at present is 2, we can concretize the message format as:

Thus a message is simply the byte 2 to identify the present version, together with a SCALE-encoded series of XCVM instructions.
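The layout just described can be sketched as below. This is an illustration only: instructions are passed in already encoded, the function names are invented, and the example assumes that ClearOrigin is encoded as the single byte 10 (its 0-indexed place among the instructions of section 5, with no operands).

```rust
/// SCALE compact length prefix. Only the single-byte mode (fewer than 64
/// items) is handled, which is enough for this sketch.
fn compact_len(n: usize) -> u8 {
    assert!(n < 64, "sketch only covers short instruction lists");
    (n as u8) << 2
}

/// Concatenate the version byte with a SCALE-encoded list of instructions.
/// Each element of `instructions` is one instruction in encoded form.
fn encode_versioned_xcm(version: u8, instructions: &[Vec<u8>]) -> Vec<u8> {
    let mut out = vec![version, compact_len(instructions.len())];
    for instruction in instructions {
        out.extend_from_slice(instruction);
    }
    out
}
```

Under those assumptions, a message consisting only of ClearOrigin would be three bytes long: the version byte, the length prefix for a one-element list, and the instruction tag.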

The effect of any given XCM message is defined as the actions taken by the XCVM when it is initialized properly given the message (which is known in XCVM as the Original Programme) and the location from which the message originated (which is known in XCVM as the Original Origin). The specifics of initialization are defined in the next section.

3 The XCVM Registers

The XCVM contains several machine registers, which cannot generally be set at will, but rather begin with specific values and may only be mutated under certain circumstances and/or obeying certain rules.

The registers are named:

3.1 Programme

Of type Vec<Instruction>, initialized to the value of the Original Programme.

Expresses the currently executing programme of instructions for the XCVM. It may be replaced after either the final instruction is executed or an error occurs.

3.2 Programme Counter

Of type u32, initialized to 0.

Expresses the index of the currently executing instruction within the Programme Register. Gets incremented by 1 after each instruction is successfully executed and reset to 0 when the Programme Register is replaced.

3.3 Error

Of type Option<(u32, Error)>, initialized to None.

Expresses information on the last known error which happened during programme execution. Set only when a programme encounters an error. May be cleared at will. The two internal fields express the value of the Programme Counter when the error occurred and the type of error which happened.

3.4 Error Handler

Of type Vec<Instruction>, initialized to the empty list.

Expresses any code which should run in case of error. When a programme encounters an error, this register is cleared and its contents used to replace the Programme Register.

3.5 Appendix

Of type Vec<Instruction>, initialized to the empty list.

Expresses any code which should run following the current programme. When a programme reaches the end successfully, or after an error where the Error Handler is empty, this register is cleared and its contents used to replace the Programme Register.

3.6 Origin

Of type Option<Location>, initialized so its inner value is the Original Origin.

Expresses the location with whose authority the current programme is running. May be reset to None at will (implying no authority), and may also be set to a strictly interior location at will (implying a strict subset of authority).
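The two permitted mutations can be sketched as follows. The types are simplified stand-ins (only three junction kinds are modelled) and the `OriginRegister` wrapper with its method names is hypothetical. The point is that the register offers no way to set an arbitrary new origin: it can only be cleared or extended inwards.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Junction {
    Parachain(u32),
    PalletInstance(u8),
    GeneralIndex(u128),
}

#[derive(Clone, Debug, PartialEq)]
struct Location {
    parents: u8,
    interior: Vec<Junction>,
}

/// Hypothetical wrapper around the Origin Register.
struct OriginRegister(Option<Location>);

impl OriginRegister {
    /// Reset to None, giving up all authority.
    fn clear(&mut self) {
        self.0 = None;
    }

    /// Move to a location interior to the current origin by appending
    /// junctions. Returns false if there is no origin to descend from.
    fn descend(&mut self, junctions: Vec<Junction>) -> bool {
        match &mut self.0 {
            Some(location) => {
                location.interior.extend(junctions);
                true
            }
            None => false,
        }
    }
}
```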

3.7 Holding Register

Of type Assets, initialized to the empty set (i.e. no assets).

Expresses a number of assets that exist under the control of the programme but have no on-chain representation. Can be thought of as a non-persistent register for "unspent" assets.

3.8 Surplus Weight

Of type u64, initialized to 0.

Expresses the amount of weight by which an estimation of the Original Programme must have been an overestimation. This includes any weight of instructions which were never dispatched due to an error occurring in an instruction prior, as well as error handlers which did not take effect owing to a successful conclusion and instructions whose weight becomes known after a necessarily conservative estimate.

3.9 Refunded Weight

Of type u64, initialized to 0.

Expresses the portion of Surplus Weight which has been refunded. Not used on XCM platforms which do not require payment for execution.

3.10 Transact Status

Of type MaybeErrorCode, initialized to MaybeErrorCode::Success.

Expresses the result of a Transact instruction after the encoded call within has been dispatched.

3.11 Topic

Of type Option<[u8; 32]>, initialized to None.

Expresses an arbitrary topic of an XCM. This value can be set to anything, and is used as part of XcmContext.

4 Basic XCVM Operation

The XCVM operates as a fetch-dispatch loop common in state machines. The steps of the loop are:

The difference from a basic fetch/dispatch loop is the addition of the Error Handler and Appendix Registers. Notably:
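The interplay of the Programme, Programme Counter, Error, Error Handler and Appendix Registers, as described in sections 3.1 to 3.5, can be sketched roughly as follows. This is a toy model and not the reference executor: it supports only five instructions, uses a string as a stand-in for the origin Location, and does not model weight or the special treatment of errors raised inside an error handler.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Error {
    Trap(u64),
}

#[derive(Clone, Debug, PartialEq)]
enum Instruction {
    ClearOrigin,
    ClearError,
    Trap(u64),
    SetErrorHandler(Vec<Instruction>),
    SetAppendix(Vec<Instruction>),
}

struct Xcvm {
    programme: Vec<Instruction>,
    programme_counter: u32,
    error: Option<(u32, Error)>,
    error_handler: Vec<Instruction>,
    appendix: Vec<Instruction>,
    origin: Option<&'static str>, // stand-in for Option<Location>
}

impl Xcvm {
    fn new(original_programme: Vec<Instruction>, original_origin: &'static str) -> Self {
        Xcvm {
            programme: original_programme,
            programme_counter: 0,
            error: None,
            error_handler: Vec::new(),
            appendix: Vec::new(),
            origin: Some(original_origin),
        }
    }

    fn execute(&mut self, instruction: Instruction) -> Result<(), Error> {
        match instruction {
            Instruction::ClearOrigin => self.origin = None,
            Instruction::ClearError => self.error = None,
            Instruction::Trap(code) => return Err(Error::Trap(code)),
            Instruction::SetErrorHandler(code) => self.error_handler = code,
            Instruction::SetAppendix(code) => self.appendix = code,
        }
        Ok(())
    }

    fn run(&mut self) {
        loop {
            match self.programme.get(self.programme_counter as usize).cloned() {
                Some(instruction) => match self.execute(instruction) {
                    Ok(()) => self.programme_counter += 1,
                    Err(error) => {
                        // Record the error, then switch to the Error Handler,
                        // clearing that register in the process.
                        self.error = Some((self.programme_counter, error));
                        self.programme = std::mem::take(&mut self.error_handler);
                        self.programme_counter = 0;
                    }
                },
                None => {
                    // End of programme (or an empty Error Handler): halt if
                    // there is no Appendix, otherwise run it and clear it.
                    if self.appendix.is_empty() {
                        break;
                    }
                    self.programme = std::mem::take(&mut self.appendix);
                    self.programme_counter = 0;
                }
            }
        }
    }
}
```

In this model, a programme of `SetAppendix([ClearOrigin])`, `Trap(7)`, `ClearError` stops at the trap, finds an empty Error Handler, and then runs the Appendix. The trailing `ClearError` never executes, so the Error Register still holds the trap when the machine halts.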

5 The XCVM Instruction Set

The XCVM instruction type (Instruction) is represented as a tagged union (enum in Rust language) of each of the individual instructions including their operands, if any. Since this is SCALE encoded, an instruction is encoded as its 0-indexed place within the instruction list, concatenated with the SCALE encoding of its operands.

The instructions, in order, are:

Notes on terminology

WithdrawAsset

Withdraw asset(s) (assets) from the ownership of origin and place them into the Holding Register.

Operands:

Kind: Command.

Errors:

ReserveAssetDeposited

Asset(s) (assets) have been received into the ownership of this system on the origin system and equivalent derivatives should be placed into the Holding Register.

Operands:

Kind: Trusted Indication.

Safety: origin must be trusted to have received and be storing assets such that they may later be withdrawn should this system send a corresponding message.

Errors:

ReceiveTeleportedAsset

Asset(s) (assets) have been destroyed on the origin system and equivalent assets should be created and placed into the Holding Register.

Operands:

Kind: Trusted Indication.

Safety: origin must be trusted to have irrevocably destroyed the corresponding assets prior as a consequence of sending this message.

Errors:

QueryResponse

Respond with information that the local system is expecting.

Operands:

Kind: Information.

Safety: Since this is information only, there are no immediate concerns. However, it should be remembered that even if the Origin behaves reasonably, it can always be asked to make a response to a third-party chain who may or may not be expecting the response. Therefore the querier should be checked to match the expected value.

Errors:

Response

The Response type is used to express information content in the QueryResponse XCM instruction. It can represent one of several different data types and is therefore encoded as the SCALE-encoded tagged union:

TransferAsset

Withdraw asset(s) (assets) from the ownership of origin and place equivalent assets under the ownership of beneficiary.

Operands:

Kind: Command.

Errors:

TransferReserveAsset

Withdraw asset(s) (assets) from the ownership of origin and place equivalent assets under the ownership of dest within this consensus system (i.e. its sovereign account).

Send an onward XCM message to destination of ReserveAssetDeposited with the given xcm.

Operands:

Kind: Command.

Errors:

Transact

Apply the encoded transaction call, whose dispatch-origin should be origin as expressed by the kind of origin origin_kind.

The Transact Status Register is set according to the result of dispatching the call.

Operands:

Kind: Command.

Errors:

Weight: Weight estimation may utilise max_weight which may lead to an increase in Surplus Weight Register at run-time.

HrmpNewChannelOpenRequest

A message to notify about a new incoming HRMP channel. This message is meant to be sent by the Relay-chain to a para.

Operands:

Safety: The message should originate directly from the Relay-chain.

Kind: System Notification

HrmpChannelAccepted

A message to notify that a previously sent open channel request has been accepted by the recipient. That means that the channel will be opened during the next Relay-chain session change. This message is meant to be sent by the Relay-chain to a para.

Operands:

Safety: The message should originate directly from the Relay-chain.

Kind: System Notification

Errors:

HrmpChannelClosing

A message to notify that the other party in an open channel decided to close it. In particular, initiator is going to close the channel opened from sender to the recipient. The close will be enacted at the next Relay-chain session change. This message is meant to be sent by the Relay-chain to a para.

Operands:

Safety: The message should originate directly from the Relay-chain.

Kind: System Notification

Errors:

ClearOrigin

Clear the Origin Register.

This may be used by the XCM author to ensure that later instructions cannot command the authority of the Original Origin (e.g. if they are being relayed from an untrusted source, as is often the case with ReserveAssetDeposited).

Kind: Command.

Errors: Infallible.

DescendOrigin

Mutate the origin to some interior location.

Operands:

Kind: Command

Errors:

ReportError

Immediately report the contents of the Error Register to the given destination via XCM.

A QueryResponse message of type ExecutionOutcome is sent to destination with the given query_id and the outcome of the XCM.

Operands:

Kind: Command

Errors:

QueryResponseInfo

Information regarding the composition of a query response. QueryResponseInfo:

DepositAsset

Remove the asset(s) (assets) from the Holding Register and place equivalent assets under the ownership of beneficiary within this consensus system.

Operands:

Kind: Command

Errors:

DepositReserveAsset

Remove the asset(s) (assets) from the Holding Register and place equivalent assets under the ownership of dest within this consensus system (i.e. deposit them into its sovereign account).

Send an onward XCM message to dest of ReserveAssetDeposited with the given effects.

Operands:

Kind: Command

Errors:

ExchangeAsset

Remove the asset(s) (want) from the Holding Register and replace them with alternative assets.

Operands:

Kind: Command

Errors:

InitiateReserveWithdraw

Remove the asset(s) (assets) from holding and send a WithdrawAsset XCM message to a reserve location.

Operands:

Kind: Command

Errors:

InitiateTeleport

Remove the asset(s) (assets) from the Holding Register and send an XCM message beginning with ReceiveTeleportedAsset to a destination location.

NOTE: The destination location MUST respect this origin as a valid teleportation origin for all assets. If it does not, then the assets may be lost.

Operands:

Kind: Command

Errors:

ReportHolding

Report to a given destination the contents of the Holding Register. A QueryResponse message of type Assets is sent to the described destination.

Operands:

Kind: Command

Errors:

BuyExecution

Pay for the execution of some XCM xcm and orders with up to weight picoseconds of execution time, paying for this with up to fees from the Holding Register.

Operands:

Kind: Command

Errors:

RefundSurplus

Refund any surplus weight previously bought with BuyExecution.

Kind: Command

Errors: Infallible.

SetErrorHandler

Set the Error Handler Register. This is code that should be called in the case of an error happening.

An error occurring within execution of this code will NOT result in the error register being set, nor will an error handler be called due to it. The error handler and appendix may each still be set.

The apparent weight of this instruction is inclusive of the inner Xcm; the executing weight however includes only the difference between the previous handler and the new handler, which can reasonably be negative, which would result in a surplus.

Operands:

Kind: Command

Errors:

SetAppendix

Set the Appendix Register. This is code that should be called after code execution (including the error handler if any) is finished. This will be called regardless of whether an error occurred.

Any error occurring due to execution of this code will result in the error register being set, and the error handler (if set) firing.

The apparent weight of this instruction is inclusive of the inner Xcm; the executing weight however includes only the difference between the previous appendix and the new appendix, which can reasonably be negative, which would result in a surplus.

Operands:

Kind: Command

Errors:

ClearError

Clear the Error Register.

Kind: Command

Errors: Infallible.

ClaimAsset

Create some assets which are being held on behalf of the origin.

Operands:

Kind: Command

Errors:

Trap

Always throws an error of type Trap.

Operands:

Kind: Command

Errors: Always.

SubscribeVersion

Ask the destination system to respond with the most recent version of XCM that they support in a QueryResponse instruction. Any changes to this should also elicit similar responses when they happen.

Operands:

Kind: Command

Errors:

UnsubscribeVersion

Cancel the effect of a previous SubscribeVersion instruction.

Kind: Command

Errors:

BurnAsset

Reduce Holding by up to the given assets.

Holding is reduced by as much as possible up to the assets in the parameter. It is not an error if the Holding does not contain the assets (to make this an error, use ExpectAsset prior).
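The "as much as possible" behaviour, and its contrast with ExpectAsset, can be sketched as below. This is a simplified model that covers fungible assets only and identifies asset classes by a plain integer; the `Holding` alias and both function names are hypothetical.

```rust
use std::collections::BTreeMap;

/// Simplified Holding Register: fungible amounts keyed by a stand-in class id.
type Holding = BTreeMap<u32, u128>;

/// Reduce Holding by up to the given amounts. Never fails: missing or
/// insufficient assets are simply burned as far as they go.
fn burn_asset(holding: &mut Holding, assets: &[(u32, u128)]) {
    for (id, amount) in assets {
        let remaining = holding.get(id).map(|held| held.saturating_sub(*amount));
        match remaining {
            Some(0) => {
                holding.remove(id);
            }
            Some(left) => {
                holding.insert(*id, left);
            }
            None => {} // nothing held of this class, nothing to do
        }
    }
}

/// True if Holding contains at least the given amounts.
fn expect_asset(holding: &Holding, assets: &[(u32, u128)]) -> bool {
    assets
        .iter()
        .all(|(id, amount)| holding.get(id).map_or(false, |held| held >= amount))
}
```

Calling `expect_asset` before `burn_asset` with the same assets turns a shortfall into a detectable failure instead of a silent partial burn.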

Operands:

Kind: Command

Errors: Infallible

ExpectAsset

Throw an error if Holding does not contain at least the given assets.

Operands:

Kind: Command

Errors:

ExpectOrigin

Ensure that the Origin Register equals some given value and throw an error if not.

Operands:

Kind: Command

Errors:

ExpectError

Ensure that the Error Register equals some given value and throw an error if not.

Operands:

Kind: Command

Errors:

ExpectTransactStatus

Ensure that the Transact Status Register equals some given value and throw an error if not.

Operands:

Kind: Command

Errors:

QueryPallet

Queries the existence of a particular pallet type.

Operands:

Sends a QueryResponse to Origin whose data field is PalletsInfo, containing the information of all pallets on the local chain whose name is equal to name. This is empty in the case that the local chain is not based on Substrate Frame.

Kind: Command

Errors:

ExpectPallet

Ensure that a particular pallet with a particular version exists.

Operands:

Kind: Command

Errors:

ReportTransactStatus

Send a QueryResponse message containing the value of the Transact Status Register to some destination.

Operands:

Kind: Command

Errors:

ClearTransactStatus

Set the Transact Status Register to its default, cleared, value.

Kind: Command

Errors: Infallible.

UniversalOrigin

Set the Origin Register to be some child of the Universal Ancestor.

Safety: Should only be usable if the Origin is trusted to represent the Universal Ancestor child in general. In general, no Origin should be able to represent the Universal Ancestor child which is the root of the local consensus system since it would by extension allow it to act as any location within the local consensus.

Operands:

Kind: Command

Errors:

ExportMessage

Send a message on to a non-local consensus system.

This will tend to utilize some extra-consensus mechanism, the obvious one being a bridge. A fee may be charged; this may be determined based on the contents of xcm. It will be taken from the Holding Register.

Operands:

As an example, to export a message for execution on Statemine (parachain #1000 in the Kusama network), you would call with network: NetworkId::Kusama and destination: X1(Parachain(1000)). Alternatively, to export a message for execution on Polkadot, you would call with network: NetworkId::Polkadot and destination: Here.

Kind: Command

Errors:

LockAsset

Lock the locally held asset and prevent further transfer or withdrawal.

This restriction may be removed by the UnlockAsset instruction being called with an Origin of unlocker and a target equal to the current Origin.

If the locking is successful, then a NoteUnlockable instruction is sent to unlocker.

Operands:

Kind: Command

Errors:

UnlockAsset

Remove the lock over asset on this chain and (if nothing else is preventing it) allow the asset to be transferred.

Operands:

Kind: Command

Errors:

NoteUnlockable

Asset (asset) has been locked on the origin system and may not be transferred. It may only be unlocked with the receipt of the UnlockAsset instruction from this chain.

Operands:

Safety: origin must be trusted to have locked the corresponding asset prior as a consequence of sending this message.

Kind: Trusted Indication

Errors:

RequestUnlock

Send an UnlockAsset instruction to the locker for the given asset. This may fail if the local system is making use of the fact that the asset is locked or, of course, if there is no record that the asset actually is locked.

Operands:

Kind: Command

Errors:

SetFeesMode

Sets the Fees Mode Register.

Kind: Command.

Errors: Infallible

SetTopic

Set the Topic Register.

Kind: Command

Errors: Infallible

ClearTopic

Clear the Topic Register.

Kind: Command

Errors: Infallible

AliasOrigin

Alter the current Origin to another given origin.

Operands:

Errors:

UnpaidExecution

A directive to indicate that the origin expects free execution of the message.

At execution time, this instruction just does a check on the Origin Register. However, at the barrier stage, messages starting with this instruction can be disregarded if the origin is not acceptable for free execution, or the weight_limit is Limited and insufficient.

Operands:

Kind: Indication

Errors:

6 Universal Asset Identifiers

Note on versioning: This describes the Asset (and associates) as used in the XCM version of this document, and its version is strictly implied by the XCM it is used within. If it is necessary to form an Asset value that is used outside of an XCM (where its version cannot be inferred) then the version-aware VersionedAsset should be used instead, exactly analogous to how Xcm relates to VersionedXcm.

Description

An Asset is a general identifier for an asset. It may represent both fungible and non-fungible assets, and in the case of a fungible asset, it represents some defined amount of the asset.

Since an Asset value may only be used to represent a single asset, there is an Assets type which represents a set of different assets. Sometimes it is necessary to express a pattern over the universe of assets; for this purpose there is WildAsset, which allows for "wildcard" matching. Finally, there is often occasion to provide a "selector", which might be a general pattern or a set of concrete assets, and for this there is AssetFilter.

Fungible assets are identified by a class together with an amount of the asset, the number of units of the asset class that the asset value represents. Non-fungible assets are necessarily unique, so need no amount, but this is replaced with an identifier for the asset instance, allowing for multiple unique assets within the same overall class.

Asset classes may be identified in one of two ways: either an abstract identifier or a concrete identifier. A single asset may be referenced from multiple asset identifiers, though it will tend to have only a single canonical concrete identifier.

Abstract identifiers

Abstract identifiers are absolute identifiers that represent a notional asset which can exist within multiple consensus systems. These tend to be simpler to deal with since their broad meaning is unchanged regardless of the consensus system in which they are interpreted.

However, in the attempt to provide uniformity across consensus systems, they may conflate different instantiations of some notional asset (e.g. the reserve asset and a local reserve-backed derivative of it) under the same name, leading to confusion. It also implies that one notional asset is accounted for locally in only one way. This may not be the case, e.g. where there are multiple bridge instances each providing a bridged "BTC" token yet none being fungible with the others.

Since they are meant to be absolute and universal, a global registry is needed to ensure that name collisions do not occur.

An abstract identifier is represented as a simple variable-size byte string. As of writing, no global registry exists and no proposals have been put forth for asset labeling.

Concrete identifiers

Concrete identifiers are relative identifiers that specifically identify a single asset through its location in a consensus system relative to the context interpreting. Use of a Location ensures that similar but non-fungible variants of the same underlying asset can be properly distinguished, and obviates the need for any kind of central registry.

The limitation is that the asset identifier cannot be trivially copied between consensus systems and must instead be "re-anchored" whenever being moved to a new consensus system, using the two systems' relative paths. This is specifically because Location values are fundamentally relative identifiers.

Throughout XCM, messages are authored such that when interpreted from the receiver's point of view they will have the desired meaning/effect. This means that relative paths should always be constructed to be read from the point of view of the receiving system, and so may have a completely different meaning in the authoring system.
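One way to picture re-anchoring is the sketch below. It resolves the identifier to an absolute path using the authoring system's own position, then re-expresses that path relative to the receiving system. The types are simplified stand-ins, the function name and signature are hypothetical, and the sketch assumes both systems are described by junction paths from a shared root (here the Relay-chain is treated as the empty path).

```rust
#[derive(Clone, Debug, PartialEq)]
enum Junction {
    Parachain(u32),
    PalletInstance(u8),
    GeneralIndex(u128),
}

#[derive(Clone, Debug, PartialEq)]
struct Location {
    parents: u8,
    interior: Vec<Junction>,
}

/// Re-express `id`, written from the point of view of `source`, so that it
/// reads correctly from the point of view of `target`. Returns None if `id`
/// ascends above the shared root.
fn reanchor(id: &Location, source: &[Junction], target: &[Junction]) -> Option<Location> {
    // 1. Resolve to an absolute path: ascend from the source, then descend.
    let keep = source.len().checked_sub(id.parents as usize)?;
    let mut absolute = source[..keep].to_vec();
    absolute.extend(id.interior.iter().cloned());

    // 2. Find how much of that path the target already shares.
    let common = absolute
        .iter()
        .zip(target)
        .take_while(|(a, b)| a == b)
        .count();

    // 3. Ascend out of the unshared part of the target, then descend.
    let ascend = target.len() - common;
    if ascend > u8::MAX as usize {
        return None;
    }
    Some(Location {
        parents: ascend as u8,
        interior: absolute[common..].to_vec(),
    })
}
```

For example, an asset that parachain 1000 refers to as PalletInstance(50)/GeneralIndex(7) would be written ../Parachain(1000)/PalletInstance(50)/GeneralIndex(7) when sent to parachain 2000.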

Concrete identifiers are the generally preferred way of identifying an asset since they are entirely unambiguous.

A concrete identifier is represented by a Location. If a system has an unambiguous primary asset (such as Bitcoin with BTC or Ethereum with ETH), then it will conventionally be identified as the chain itself. Alternative and more specific ways of referring to an asset within a system include:

Format

An Asset value is represented by the SCALE-encoded pair of fields:

If multiple ids need to be expressed in a value, then the Assets type should be used. This is encoded exactly as a Vec<Asset>, but with some additional requirements:

(The ordering provides an efficient means of guaranteeing that each asset is indeed unique within the set.)
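The uniqueness argument can be illustrated with a small check. This sketch uses Rust's derived ordering on simplified stand-in types (a string in place of the Location-based AssetId, an integer in place of AssetInstance) rather than the XCM Standard Ordering itself, and the function name is hypothetical. It verifies only that the items are strictly ascending, which rules out repeated entries in a single pass, and does not check any other requirement on Assets.

```rust
#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
enum Fungibility {
    Fungible(u128),
    NonFungible(u128), // stand-in for an AssetInstance
}

#[derive(Clone, Debug, PartialEq, Eq, PartialOrd, Ord)]
struct Asset {
    id: String, // stand-in for an AssetId
    fun: Fungibility,
}

/// True if every item is strictly greater than the one before it, which
/// implies both that the list is sorted and that no item is repeated.
fn is_sorted_and_unique(assets: &[Asset]) -> bool {
    assets.windows(2).all(|pair| pair[0] < pair[1])
}
```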

A WildAsset value is represented by the SCALE-encoded tagged union with four variants:

Note: different instances of non-fungibles are counted as individual assets.

An AssetFilter value is represented by the SCALE-encoded tagged union with two variants:

Standard Ordering

XCM Standard Ordering is based on the Rust-language ordering and is defined:

AssetId

A general identifier for an asset class. This is given by a Location.

AssetInstance

A general identifier for an instance of a non-fungible asset within its class.

Given by the SCALE tagged union of:

WildFungibility

A general identifier for an asset class. This is a SCALE-encoded tagged union (enum in Rust terms) with two variants:

7 Universal Consensus Location Identifiers

This describes the Location (and associates) as used in the XCM version of this document, and its version is strictly implied by the XCM it is used within. If it is necessary to form a Location value that is used outside of an XCM (where its version cannot be inferred) then the version-aware VersionedLocation should be used instead, exactly analogous to how Xcm relates to VersionedXcm.

7.1 Description

A relative path between state-bearing consensus systems.

Location aims to be sufficiently abstract in meaning and general in nature that it is able to identify arbitrary logical "locations" within the universe of consensus.

A location in a consensus system is defined as an isolatable state machine held within global consensus. The location in question need not have a sophisticated consensus algorithm of its own; a single account within Ethereum, for example, could be considered a location.

A very much non-exhaustive list of types of location includes:

A Location is a relative identifier, meaning that it can only be used to define the relative path between two locations, and cannot generally be used to refer to a location universally. Much like a relative file-system path will first begin with any "../" components used to ascend into the containing directory, followed by the directory names into which to descend, a Location has two main parts to it: the number of times to ascend into the outer consensus from the local and then an interior location within that outer consensus.

A Location is thus encoded as the pair of values:

Interior Locations & Junctions

There is a second type InteriorLocation which always identifies a consensus system interior to the local consensus system. Being strictly interior implies a relationship of subordination: for a consensus system A to be interior to that of B would mean that a state change of A implies a state change in B. As an example, a smart contract location within the Ethereum blockchain would be considered an interior location of the Ethereum blockchain itself.

An InteriorLocation is comprised of a number of junctions, in order, each specifying a location further internal to the previous. An InteriorLocation value with no junctions simply refers to the local consensus system.

An InteriorLocation is thus encoded simply as a Vec<Junction>. A Junction meanwhile is encoded as the tagged union of:

NetworkId

A global identifier of an account-bearing consensus system.

Encoded as the tagged union of:

BodyId

An identifier of a pluralistic body.

Encoded as the tagged union of:

BodyPart

A part of a pluralistic body.

Encoded as the tagged union of:

Written format

Note: Locations will tend to be written using junction names delimited by slashes, evoking the syntax of other logical path systems such as URIs and file systems. E.g. a Location value expressed as ../PalletInstance(3)/GeneralIndex(42) would be a Location with one parent and two Junctions: PalletInstance{index: 3} and GeneralIndex{index: 42}.
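The written form can be produced mechanically, as in the sketch below. The types are simplified stand-ins covering three junction kinds, the function name `written` is hypothetical, and rendering an empty location as "Here" is an assumption made for this example rather than something this section specifies.

```rust
#[derive(Clone, Debug, PartialEq)]
enum Junction {
    Parachain(u32),
    PalletInstance(u8),
    GeneralIndex(u128),
}

#[derive(Clone, Debug, PartialEq)]
struct Location {
    parents: u8,
    interior: Vec<Junction>,
}

/// Render a Location in the slash-delimited written format.
fn written(location: &Location) -> String {
    // One ".." per parent, then one component per junction.
    let mut parts: Vec<String> = vec!["..".to_string(); location.parents as usize];
    for junction in &location.interior {
        parts.push(match junction {
            Junction::Parachain(id) => format!("Parachain({})", id),
            Junction::PalletInstance(index) => format!("PalletInstance({})", index),
            Junction::GeneralIndex(index) => format!("GeneralIndex({})", index),
        });
    }
    if parts.is_empty() {
        // Assumption: no parents and no junctions is shown as "Here".
        "Here".to_string()
    } else {
        parts.join("/")
    }
}
```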

8 The types of error in XCM

Within XCM it is necessary to communicate some problem encountered while executing some message. The type Error allows for this to be expressed, and is encoded as the SCALE tagged union of: