Update polkadot CAIP-2 and CAIP-10 for XCM v3

bumblefudge commented 1 year ago

Thanks to @ntn-x2 for pointing out the new GenesisHash function in XCM v3, and for patiently explaining to me the problems with the existing CAIP-2 spec!

I merged the existing draft to allow @ntn-x2 to take the pen on an overhaul PR, which as I currently understand it needs the following issues addressed:

[ ] when more detailed xcm v3 docs go live, update links in readme.md
[ ] are there any chains that cannot be addressed by genesis hash?
[ ] should there be a section warning people about migrations? does losing a parachain auction or changing id and restarting as mentioned by joe create a new CAIP-2? if there is any way to find out from a node about a previous NetworkID/CAIP-2 it might be worth mentioning here since the target audience may never have thought about these corner-cases
[ ] Given that 32 char CAIP-2s are allowed, would it make sense to use the whole genesis hash instead of truncating it? it feels like translating to and from absolute-referenced CAIP-2s to relative-referenced Multilocations would necessitate being able to define NetworkID as ByGenesis, so having the whole one might be needed.

ntn-x2 commented 1 year ago

I will look into this ASAP. In the meantime, a few comments below:

> are there any chains that cannot be addressed by genesis hash?

As far as I know, the genesis hash of a Polkadot network is unique, as there cannot be forks. Hence, using the genesis hash as the identifier of the network would be enough.

> Given that 32 char CAIP-2s are allowed, would it make sense to use the whole genesis hash instead of truncating it?

The whole hash is 32 bytes long, which is 64 HEX characters. How can we fit that into CAIP-2 without truncating? We would have to update the CAIP-2 spec to allow up to 64 characters, which I think would also make sense considering that 32 characters, in HEX, can only identify 16 bytes, while most hashes have 32 byte outputs.
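To make the size mismatch concrete, here is a minimal sketch that checks candidate identifiers against the CAIP-2 length limits. The two regular expressions are an assumption based on my reading of the CAIP-2 text and should be checked against the current spec; the sample hash is the value commonly published as the Polkadot relay chain genesis hash and is used for illustration only.

```python
import re

# Assumed CAIP-2 grammar: chain_id = namespace ":" reference.
# Verify both patterns against the current CAIP-2 spec before relying on them.
NAMESPACE_RE = re.compile(r"[-a-z0-9]{3,8}")
REFERENCE_RE = re.compile(r"[-_a-zA-Z0-9]{1,32}")


def is_valid_caip2(chain_id: str) -> bool:
    """Return True if chain_id fits the assumed CAIP-2 grammar."""
    namespace, sep, reference = chain_id.partition(":")
    return (
        sep == ":"
        and NAMESPACE_RE.fullmatch(namespace) is not None
        and REFERENCE_RE.fullmatch(reference) is not None
    )


# Illustrative 32-byte genesis hash, written as 64 hex characters.
FULL_HASH_HEX = "91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3"

print(len(bytes.fromhex(FULL_HASH_HEX)))                 # 32 bytes
print(is_valid_caip2("polkadot:" + FULL_HASH_HEX))       # False: 64 characters is too long
print(is_valid_caip2("polkadot:" + FULL_HASH_HEX[:32]))  # True: 32 hex characters = 16 bytes
```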

should there be a section warning people about migrations? does losing a parachain auction or changing id and restarting as mentioned by joe create a new CAIP-2?

I will come back to this when I have given it some more thought, but if a CAIP is used as an identifier, I do not see why a chain that has been restarted (i.e., it has a new genesis hash), would have to be linked to a previous chain. Also for security implications, it would be a better idea to consider those chains as completely separate, in my opinion, also because at any time anyone could start the old chain again, and now there are two chains running.

bumblefudge commented 1 year ago

> The whole hash is 32 bytes long, which is 64 HEX characters. How can we fit that into CAIP-2 without truncating? We would have to update the CAIP-2 spec to allow up to 64 characters, which I think would also make sense considering that 32 characters, in HEX, can only identify 16 bytes, while most hashes have 32 byte outputs.

Oh duh, yeah, not even base64 can save us here, foiled again. At least keeping half the hash reduces the probability of collisions considerably, but you're in good company, most CAIP-2 profiles are a subset of the hash. I suppose there are ways of passing the second half of the hash out of band, in a query param, etc, although in most cases it's more to confirm information the dapp already has, rather than for the wallet to provide it to them.
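A quick sketch of both points, for reference: base64 still overshoots the 32-character limit, and a dapp that already holds the full hash can confirm it against a truncated CAIP-2. The helper name `confirms_full_hash` is hypothetical, and the sample hash is illustrative (the value commonly published for the Polkadot relay chain).

```python
import base64

# Illustrative 32-byte genesis hash as 64 hex characters.
FULL_HASH_HEX = "91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3"


def confirms_full_hash(caip2: str, full_hash_hex: str) -> bool:
    """Hypothetical helper: True if the CAIP-2 reference is the first
    16 bytes (32 hex characters) of the full genesis hash."""
    namespace, _, reference = caip2.partition(":")
    full = full_hash_hex[2:] if full_hash_hex.startswith("0x") else full_hash_hex
    return (
        namespace == "polkadot"
        and len(reference) == 32
        and full.lower().startswith(reference.lower())
    )


raw = bytes.fromhex(FULL_HASH_HEX)
padded = base64.urlsafe_b64encode(raw)
print(len(padded))               # 44 characters with padding
print(len(padded.rstrip(b"=")))  # 43 without padding, still more than 32

print(confirms_full_hash("polkadot:91b171bb158e2d3848fa23a9f1c25182", "0x" + FULL_HASH_HEX))
```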

> > should there be a section warning people about migrations? does losing a parachain auction or changing id and restarting as mentioned by joe create a new CAIP-2?

> I will come back to this when I have given it some more thought, but if a CAIP is used as an identifier, I do not see why a chain that has been restarted (i.e., it has a new genesis hash), would have to be linked to a previous chain. Also for security implications, it would be a better idea to consider those chains as completely separate, in my opinion, also because at any time anyone could start the old chain again, and now there are two chains running.

OK, I was hoping you'd say that. This doesn't mean the section on migrations isn't needed, but it DOES make it much shorter and easier to write: just remind people that when chains migrate, they get new CAIP-2s, and nothing in the CAIP-2 can warn them the chain has prehistory at another CAIP-2. If it were helpful to give people a way to warn recipients of a CAIP-2 that it's a restarted chain, it might be worth encoding that info as a query parameter, e.g.:

polkadot:b0a8d493285c2df73290dfb7e61f870f?previousInstance=37e1f8125397a98630013a4dff89b54c

None of that is urgent, though; a one-sentence section is good enough for now. Furthermore, we don't currently have any way of defining query parameters, but that could always be added in a future PR once a new CAIP defines the notion of a query parameter :D
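Purely as a sketch of what a consumer might do if such a parameter were ever specified: the `previousInstance` name and the `?` separator are hypothetical and not defined by any CAIP today.

```python
from urllib.parse import parse_qs


def split_chain_id(value: str):
    """Split a CAIP-2 string from a HYPOTHETICAL trailing query string.

    Returns (chain_id, params). No CAIP currently defines query
    parameters, so this is an illustration only.
    """
    chain_id, _, query = value.partition("?")
    # parse_qs returns lists of values; keep the first value per key.
    params = {key: values[0] for key, values in parse_qs(query).items()}
    return chain_id, params


chain_id, params = split_chain_id(
    "polkadot:b0a8d493285c2df73290dfb7e61f870f"
    "?previousInstance=37e1f8125397a98630013a4dff89b54c"
)
print(chain_id)
print(params.get("previousInstance"))
```

A recipient that ignores the query string still gets a usable CAIP-2 from the first element, which is the main appeal of carrying the hint out of band.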

If you find the chainspec of a chain that has been migrated, let me know. Maybe there's a back-pointer to the genesisHash and/or NetworkID written into that JSON file, so that you can fetch it from a node running the new chain rather than passing it if needed?

ntn-x2 commented 1 year ago

> At least keeping half the hash reduces the probability of collisions considerably, but you're in good company, most CAIP-2 profiles are a subset of the hash.

Yes, I saw that elsewhere. My suggestion was just meant to keep some level of compatibility with a MultiLocation NetworkId::ByGenesis([u8; 32]). But I guess we can just say to truncate to the first 16 bytes, all good.
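The truncation rule could look roughly like this. It is a minimal sketch: the function name is hypothetical and the sample hash is illustrative (the value commonly published for the Polkadot relay chain).

```python
def caip2_from_genesis(genesis_hash: bytes) -> str:
    """Build a polkadot-namespace CAIP-2 from a 32-byte genesis hash
    by keeping only the first 16 bytes (32 lowercase hex characters)."""
    if len(genesis_hash) != 32:
        raise ValueError("expected a 32-byte genesis hash")
    return "polkadot:" + genesis_hash[:16].hex()


genesis = bytes.fromhex(
    "91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3"
)
print(caip2_from_genesis(genesis))  # polkadot:91b171bb158e2d3848fa23a9f1c25182
```

Note that the conversion is one-way: the discarded 16 bytes cannot be recovered from the CAIP-2 alone, so anything that needs the full 32-byte value has to obtain it from elsewhere.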

> This doesn't mean the section on migrations isn't needed, but it DOES make it much shorter and easier to write: just remind people that when chains migrate, they get new CAIP-2s, and nothing in the CAIP-2 can warn them the chain has prehistory at another CAIP-2.

Again, I think the term migration itself is wrong here. A migration, at least in the blockchain world, should not cover a chain restarting from block 0 with a new genesis hash. That's not the proper definition of a blockchain migration.

> If you find the chainspec of a chain that has been migrated, let me know. Maybe there's a back-pointer to the genesisHash and/or NetworkID written into that JSON file, so that you can fetch it from a node running the new chain rather than passing it if needed?

Curiously enough, our chain, KILT, has migrated. We were a parachain of Kusama, and then we migrated to being a parachain of Polkadot. All of that happened while maintaining the following invariants:

No alterations to the history of the blockchain
Same genesis hash

That, in my opinion, is a proper definition of migration. Other migrations include a solo chain becoming a parachain, which again would not result in block number going back to 0, or a parachain becoming a solo chain, with the same results. Anything else is a chain restart, and worth being identified by a different CAIP.
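The distinction can be sketched as a simple comparison of genesis hashes. The two hashes below are made-up placeholders, not real chain data, and the function names are hypothetical.

```python
def caip2(genesis_hex: str) -> str:
    """CAIP-2 built from the first 16 bytes (32 hex chars) of a genesis hash."""
    return "polkadot:" + genesis_hex[:32]


def classify_change(genesis_before: str, genesis_after: str) -> str:
    """Same genesis hash: the chain migrated and keeps its CAIP-2.
    Different genesis hash: the chain restarted and gets a new CAIP-2."""
    return "migration" if genesis_before == genesis_after else "restart"


# Placeholder hashes for illustration only.
OLD_GENESIS = "11" * 32
NEW_GENESIS = "22" * 32

print(classify_change(OLD_GENESIS, OLD_GENESIS))  # migration
print(classify_change(OLD_GENESIS, NEW_GENESIS))  # restart
print(caip2(OLD_GENESIS) == caip2(NEW_GENESIS))   # False
```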

As for the queries, I agree their usage would probably have to be re-evaluated once we have a clear definition of query parameters in CAIP.

I think I have a better understanding of the expected outcome of this ticket now 😊

bumblefudge commented 1 year ago

Awesome, we're making good time here!

So a migration would change the NetworkId that is returned by ByGenesis, right? So NetworkId is mutable over time, but querying today's NetworkId for a very old block will still work as if it had never migrated, right? If so, then the migration warning would just need to explain that, and differentiate it from the "restart" case where the CAIP-2 changes.

ntn-x2 commented 1 year ago

ByGenesis takes the genesis hash as a parameter, so it is the caller that decides what the hash is. There is no service that returns the genesis hash for a given network; it lives entirely within the runtime of a given blockchain. I would not focus too much on the MultiLocation details, since they are mostly Polkadot-specific.

What we are trying to agree on here is how to convert some of the MultiLocation instances to a CAIP, so basically we are deciding which genesis hash value to pass to the NetworkId 😂

What is out there is an RPC endpoint, as mentioned in the CAIP, that returns the genesis hash of the blockchain that a given full node is connected to. And here we are saying that this is enough to identify a network, as long as that network has never been "restarted".
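For illustration, a rough sketch of that flow, assuming the endpoint meant here is the Substrate JSON-RPC method `chain_getBlockHash` called for block 0. No network call is made; the response body is a hand-written sample, and the helper names are hypothetical.

```python
import json


def genesis_hash_request(request_id: int = 1) -> dict:
    """JSON-RPC payload asking a node for the hash of block 0.
    Assumes the node exposes `chain_getBlockHash`."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "chain_getBlockHash",
        "params": [0],
    }


def caip2_from_response(response_text: str) -> str:
    """Derive a truncated CAIP-2 from the node's JSON-RPC response."""
    result = json.loads(response_text)["result"]
    hex_hash = result[2:] if result.startswith("0x") else result
    if len(bytes.fromhex(hex_hash)) != 32:
        raise ValueError("expected a 32-byte genesis hash")
    return "polkadot:" + hex_hash[:32].lower()


# Hand-written sample response; a real one would come from a full node.
SAMPLE_RESPONSE = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": "0x91b171bb158e2d3848fa23a9f1c25182fb8e20313b2c1eb49219da7a70ce90c3",
})

print(json.dumps(genesis_hash_request()))
print(caip2_from_response(SAMPLE_RESPONSE))
```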

bumblefudge commented 1 year ago

Got it. I think what's unique here is that in most blockchain namespaces, chainID is static and immutable, and confirmable by the node; here, networkID (and multiLocation base) are mutable, so the warning is less about how to handle migrations, and more about not assuming a networkID works like a chainID :D

ChainAgnostic / namespaces

Update polkadot CAIP-2 and CAIP-10 for XCM v3 #56