CAIP-2 (Chain ID): General format definition #2

webmaster128 commented 4 years ago

This is a refinement of #1 and especially https://github.com/ChainAgnostic/CAIPs/pull/1#issuecomment-534692931 and kudos go to everyone involved.

The goals of the general chain ID format are:

1. Uniqueness within the entire blockchain ecosystem
2. To some degree human readable and helps for basic debugging
3. Restricted in a way that it can be stored on chain
4. Character set basic enough to display in hardware wallets as part of a transaction content
5. Maybe: Can be used unescaped in URL paths
6. Maybe: Can be used as filename in a case-sensitive UNIX file system (Linux/git).
7. ~Maybe: Can be used as filename in a case-insensitive UNIX file system (macOS).~
8. ~Maybe: Can be used as filename in a Windows file system (that one is going to be fun)~.

5.-8. are open questions to me and I'd love to hear feedback on those. Especially the requirement for a case-insensitive format seems to limit the implementation of reference.

chain_id     = a case-sensitive string in the form: interface + ":" + reference
interface    = a case-sensitive string in the form: [-a-z]{3,16}
reference    = a case-sensitive string in the form: [-a-zA-Z0-9]{3,47}

where interface identifies a document describing how reference is composed.
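
As a rough illustration, the grammar above can be checked with a regular expression. This Python sketch is not part of the proposal: the names `CHAIN_ID_RE` and `parse_chain_id` are made up for the example, and the length limits are simply the ones stated above.

```python
import re
from typing import Optional, Tuple

# Grammar as stated in this proposal; the names used here are hypothetical.
CHAIN_ID_RE = re.compile(r"([-a-z]{3,16}):([-a-zA-Z0-9]{3,47})")


def parse_chain_id(chain_id: str) -> Optional[Tuple[str, str]]:
    """Return (interface, reference), or None if chain_id does not match."""
    match = CHAIN_ID_RE.fullmatch(chain_id)
    if match is None:
        return None
    return match.group(1), match.group(2)


print(parse_chain_id("ethereum:eip155-1"))            # ('ethereum', 'eip155-1')
print(parse_chain_id("cosmos:Binance-Chain-Tigris"))  # ('cosmos', 'Binance-Chain-Tigris')
print(parse_chain_id("Ethereum:eip155-1"))            # None (upper case in interface)
```

`fullmatch` is used so that neither trailing characters nor a trailing newline slip through.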

The interface typically corresponds to a class of blockchains and allows delegation of the reference format to an ecosystem-specific standard. Example interfaces are "ethereum" for all the blockchains that can be identified by EIP-155, "bitcoin" for Bitcoin and closely related projects such as Litecoin, or "cosmos" for projects that build on top of Tendermint.

reference should strive for uniqueness within the interface. For user-chosen chain IDs (e.g. EIP-155 or Tendermint chain ID), uniqueness cannot be guaranteed and it is the responsibility of registries and communities to resolve collisions.

Examples

# Ethereum mainnet
ethereum:eip155-1

# Bitcoin mainnet (see https://github.com/bitcoin/bips/blob/master/bip-0122.mediawiki#definition-of-chain-id)
bitcoin:bip122-000000000019d6689c085ae165831e93

# Litecoin
bitcoin:bip122-12a765e31ffd4059bada1e25190f6e98

# Feathercoin (Litecoin fork)
bitcoin:bip122-fdbe99b90c90bae7505796461471d89a

# Cosmos Hub (Tendermint + Cosmos SDK)
cosmos:cosmoshub-2
cosmos:cosmoshub-3

# Binance chain (Tendermint + Cosmos SDK; see https://dataseed5.defibit.io/genesis)
cosmos:Binance-Chain-Tigris

# IOV Mainnet (Tendermint + weave)
cosmos:iov-mainnet

# Lisk Mainnet (LIP-0009; see https://github.com/LiskHQ/lips/blob/master/proposals/lip-0009.md)
lisk:lip9-9ee11e9df416b18b

# Random max length (16+1+47 = 64 chars/bytes)
blockchain1-hash:xip3343-8c3444cf8970a9e41a706fab93e7a6c4-xxxyyy

Open questions

[x] Is there a very good reason for a case-insensitive format?
[x] Is 100 ASCII chars a reasonable max length?
[ ] What about whitelisting _ as an alternative separator in reference?

Thoughts?

ligi commented 4 years ago

thanks for pushing it forward. Regarding the questions here my thoughts:

1. I think we should stay case-sensitive - I do not see a good reason for case-insensitivity
2. sounds good - but I would rather take a power of 2 - so either 64 or 128
3. don't see a good reason for adding another char - would signal to stay with - or replace - with _

ligi commented 4 years ago

also in your examples I see a lot of duplication - not sure why you add another prefix (like ethereum or bitcoin) - feels very redundant

I thought more like:

eip155-1 bip122-XXX ...

why would we need the extra redundancy of another prefix?

webmaster128 commented 4 years ago

I think we should stay case-sensitive - I do not see a good reason for case-insensitivity

After seeing Binance-Chain-Tigris using upper case characters, I think case-sensitivity is important. This makes using chain IDs in filenames on Windows and Mac impossible or at least unreliable, since Binance-Chain-Tigris.json and binance-chain-tigris.json refer to different chains but cause a collision in case-insensitive file systems. But hey, we have Linux for proper file systems ;)
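
A small Python sketch of the filename problem described above. The helper name `case_insensitive_collisions` is made up for this example, and `str.casefold()` only approximates how real case-insensitive file systems compare names.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def case_insensitive_collisions(names: Iterable[str]) -> Dict[str, List[str]]:
    """Group names that a case-insensitive file system would treat as equal."""
    groups: Dict[str, List[str]] = defaultdict(list)
    for name in names:
        # casefold() is an approximation of file system behaviour
        groups[name.casefold()].append(name)
    return {key: group for key, group in groups.items() if len(group) > 1}


files = ["Binance-Chain-Tigris.json", "binance-chain-tigris.json", "cosmoshub-3.json"]
print(case_insensitive_collisions(files))
# {'binance-chain-tigris.json': ['Binance-Chain-Tigris.json', 'binance-chain-tigris.json']}
```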

but I would rather take a power of 2 - so either 64 or 128

I was thinking about that too, but did not find a good reason other than the nerdy taste of it. 64 is too short if we want full 32-byte hashes in hex (64 characters on their own) plus some decoration (which we do). 128 would work.

don't see a good reason for adding another char - would signal to stay with - or replace - with _

If a cosmos chain decides to use _, we can use it 1:1 without needing to hash it, which improves readability in those cases.

why would we need the extra redundancy of another prefix?

I think of this as namespacing. The interface is responsible for ensuring uniqueness only within its namespace. Without this property, a cosmos chain can call itself eip155-1, creating a collision between a Cosmos chain and an Ethereum chain.
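
To make the namespacing argument concrete, here is a minimal Python sketch. The second entry is a hypothetical Cosmos chain that picked "eip155-1" as its chain ID; no such chain is claimed to exist.

```python
from typing import Dict, List, Tuple

# Hypothetical chains as (interface, reference). The second one is a made-up
# Cosmos chain whose user-chosen ID happens to be "eip155-1".
chains: List[Tuple[str, str]] = [
    ("ethereum", "eip155-1"),
    ("cosmos", "eip155-1"),
]

# Keyed by reference only: the second entry overwrites the first.
by_reference: Dict[str, str] = {ref: iface for iface, ref in chains}

# Keyed by the full chain ID: both entries survive.
by_chain_id: Dict[str, str] = {f"{iface}:{ref}": iface for iface, ref in chains}

print(len(by_reference))  # 1
print(len(by_chain_id))   # 2
```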

ligi commented 4 years ago

After seeing Binance-Chain-Tigris using upper case characters, I think case-sensitivity is important. This makes using chain IDs in filenames on Windows and Mac impossible or at least unreliable, since Binance-Chain-Tigris.json and binance-chain-tigris.json refer to different chains but cause a collision in case-insensitive file systems. But hey, we have Linux for proper file systems ;)

yea

I was thinking about that too, but did not find a good reason other than the nerdy taste of it. 64 is too short if we want full 32 bytes hashes in hex plus some decoration (which we do). 128 would work.

100 just feels unnatural for a spec - just because we have 10 fingers .. let's go with 128 then

If a cosmos chain decides to use _, we can use it 1:1 without needing to hash it, which improves readability in those cases.

hm - but this would open the door for many more chars then. Would rather like to restrict the number of chars - but just my gut feeling and signal

I think of this as namespacing. The interface is responsible for ensuring uniqueness only within its namespace. Without this property, a cosmos chain can call itself eip155-1, creating a collision between a Cosmos chain and an Ethereum chain.

I do not see the problem. The prefix is defined in the spec - so eip155-1 means it is an ethereum chain that is chain 1 in the eip155 spec - I do not see the collision there ..

webmaster128 commented 4 years ago

Updated for explicit case-sensitivity
Updated max length to 128 (27+1+100)
Will think more about widening the character set a little bit, trying to find some more example chains
The collision will get clearer once I have created the Cosmos interface. I'll get back to you.

webmaster128 commented 4 years ago

I do not see the problem. The prefix is defined in the spec - so eip155-1 means it is an ethereum chain that is chain 1 in the eip155 spec - I do not see the collision there ..

When you look at https://github.com/ChainAgnostic/CAIPs/issues/5, you see that the content of the reference comes from the genesis directly and is user-defined. So a Cosmos chain can call itself eip155-1, which collides with the Ethereum spec.

We could potentially merge the interface and the prefix into one thing, e.g.

ethereum<delimiter>1
cosmos<delimiter>ethereum-1

but then the ethereum interface will always need to use EIP-155 and the bitcoin interface will always need to use BIP122. This might actually be a good idea. The good thing about having the interfaces as namespaces is that this can now be discussed inside of https://github.com/ChainAgnostic/CAIPs/issues/3 and https://github.com/ChainAgnostic/CAIPs/issues/4.

webmaster128 commented 4 years ago

I was thinking about the max length again and am not happy about the long IDs anymore. They are hard to use on any developer's screen and even harder to read on a Ledger device. Also the requirement of storing 128 bytes on chain seems to be too much.

The primary reason for the current length limitation is 32-byte hex hashes, e.g.

bitcoin:bip122-12a765e31ffd4059bada1e25190f6e98c99d9714d334efa41a195a7e7e04bfe2
lisk:lip9-9ee11e9df416b18bf69dbd1a920442e08c6ca319e69926bc843a561782ca17ee

But in such cases it is not necessary to use the whole hash. We can easily get collision resistance with an 8-byte prefix of the hash, e.g.

bitcoin:bip122-12a765e31ffd4059
lisk:lip9-9ee11e9df416b18b
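
The shortening can be sketched in a few lines of Python. The function name `short_reference` and its parameters are assumptions made for this example, not something defined by BIP122 or LIP-0009.

```python
def short_reference(spec: str, hash_hex: str, prefix_bytes: int = 8) -> str:
    """Build "<spec>-<first prefix_bytes bytes of the hash, in hex>"."""
    raw = bytes.fromhex(hash_hex)  # raises ValueError on malformed hex
    if len(raw) < prefix_bytes:
        raise ValueError("hash is shorter than the requested prefix")
    return f"{spec}-{raw[:prefix_bytes].hex()}"


full = "12a765e31ffd4059bada1e25190f6e98c99d9714d334efa41a195a7e7e04bfe2"
print("bitcoin:" + short_reference("bip122", full))
# bitcoin:bip122-12a765e31ffd4059
```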

With an 8-byte prefix, the probability of a collision for 6100 chains (in one interface) is about 10⁻¹², which should be fine for the foreseeable future.
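
The estimate can be reproduced with the usual birthday-bound approximation, assuming the hash prefixes behave like uniformly random 64-bit values. The function name `collision_probability` is made up for this sketch.

```python
import math


def collision_probability(n_chains: int, prefix_bytes: int) -> float:
    """Birthday-bound approximation: 1 - exp(-n(n-1) / (2 * 2**bits))."""
    space = 2 ** (8 * prefix_bytes)
    # expm1 keeps precision for the very small exponents involved here
    return -math.expm1(-n_chains * (n_chains - 1) / (2 * space))


# On the order of 1e-12 for 6100 chains and an 8-byte prefix
print(collision_probability(6100, 8))
```

For comparison, the same call with a 4-byte prefix gives a probability in the range of a few tenths of a percent, so 8 bytes looks like a reasonable lower bound.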

If we use max length 64 instead, I suggest a 16+1+47 split, e.g.

# Random max length (16+1+47 = 64 chars/bytes)
blockchain1-hash:xip3343-8c3444cf8970a9e41a706fab93e7a6c4-xxxyyy

webmaster128 commented 4 years ago

Done in #9
