Closed EmmanuellNorbertTulbure closed 1 year ago
Taking a look.
maybe this looks interesting XcmQueryHandler
https://github.com/paritytech/polkadot/pull/6900
Looks like there are 2 error management mechanisms in XCM:
More details here: https://polkadot.network/blog/xcm-part-three-execution-and-error-management
Trying to understand more in-depth how they work, how we can use them and at which chains.
From what I understand, the Asset Trap
works like this: if an XCM message fails while there are some assets in the holding register, those assets will be saved and will be kept on chain.
After that they can be reclaimed using the ClaimAsset
instruction. The ClaimAsset
instruction removes them from the on-chain storage and puts them back in the holding register enabling the caller to perform other XCM operations on them (for example to move them into his account).
Trying to simulate this scenario over a bridge and see exactly how it works.
maybe this looks interesting
XcmQueryHandler
paritytech/polkadot#6900
Thanks ! This seems related to Error reporting
, which we might need as well. Will investigate it also. We might need a combination of Error reporting
and Asset trap
.
Managed to test this locally on rockmine2.
RuntimeCall::PolkadotXcm(pallet_xcm::Call::<Runtime>::execute {
message: Box::new(xcm::VersionedXcm::V3(Xcm(vec![
Instruction::<RuntimeCall>::WithdrawAsset(
vec![MultiAsset {
id: AssetId::Concrete(MultiLocation {
parents: 1,
interior: Junctions::Here,
}),
fun: Fungibility::Fungible(1000000000000),
}]
.into(),
),
]))),
max_weight: Weight::from_parts(10000000, 0),
});
This withdraws 1 KSM from Alice's account in rockmine2. Since we don't do anything with this amount, it is present in the holding register after executing the xcm instruction and the asset trap mechanism will save it on-chain in the rockmine2 storage
RuntimeCall::PolkadotXcm(pallet_xcm::Call::<Runtime>::execute {
message: Box::new(xcm::VersionedXcm::V3(Xcm(vec![
Instruction::<RuntimeCall>::ClaimAsset {
assets: vec![MultiAsset {
id: AssetId::Concrete(MultiLocation {
parents: 1,
interior: Junctions::Here,
}),
fun: Fungibility::Fungible(1000000000000),
}]
.into(),
ticket: MultiLocation { parents: 0, interior: Junctions::Here },
},
Instruction::<RuntimeCall>::DepositAsset {
assets: MultiAssetFilter::Wild(WildMultiAsset::All),
beneficiary: MultiLocation {
parents: 0,
interior: Junctions::X1(Junction::AccountId32 {
network: None,
id: hex_literal::hex!(
"d43593c715fdd31c61141abd04a99fd6822c8558854ccde39a5684e7a56da27d"
)
.into(),
}),
},
},
]))),
max_weight: Weight::from_parts(10000000, 0),
});
This claims the 1 KSM that was trapped and transfers it back to Alice's account on rockmine2.
However there are 2 things to note:
The trapped assets are stored in a hashed form on-chain. So we need to claim the entire asset. If 1000000000000 units were trapped we can't make a partial claim for 1000 for example, because it won't be recognized. So we need to know the exact amount that was trapped and this can be affected by fees that are payed on the way. We need to check if and how we can find out the exact trapped amount in case of a failed xcm transaction over the bridge.
By mistake I happened to execute a ClaimAsset
instruction followed by a DepositAsset
to a wrong account. I got an xcm_error: FailedToTransactAsset("AccountIdConversionFailed")
, but I think I lost the asset. At least it didn't seem to be trapped again and I couldn't find a way to claim it. I will check this further.
Other things still left to be done:
There is a logged issue related to losing assets when DepositAsset
fails: https://github.com/paritytech/polkadot-sdk/issues/912
The problem is that the assets are removed from the holding register and in case of an error they are not put back. And it doesn't happen only for DepositAsset
.
A solution would be to use a custom AssetTransactor
that accounts for these lost assets.
Another thing to pay attention to is this condition here.
There is a "barrier" condition that is checked before processing the XCM message. If this check fails, the XCVM won't be even started, post_process()
will not be called and the assets will be lost (they won't be trapped).
We need to pay attention how to craft the messages in order to avoid triggering this barrier.
maybe this looks interesting
XcmQueryHandler
paritytech/polkadot#6900Thanks ! This seems related to
Error reporting
, which we might need as well. Will investigate it also. We might need a combination ofError reporting
andAsset trap
.
Started to look into Error reporting
. From what I understand both ReportError
and QueryResponse
can only report the status. The assets will still remain trapped on the chain where the execution error occured. And we wouldn't be able to do a combination of ClaimAssets
+ ReportError
/QueryResponse
in order to report back to the source chain and unlock the assets automatically, because the Asset Trap
is executed after the error handler or the appendix. And probably we wouldn't know how many assets to claim anyway.
Another interesting thing is that the AssetTrap
is configurable. Thinking if we could use that.
Trying to summarize the entire flow, to have a better overall picture. This is how it looks like at the present time. It might change. Some items still need research. I will update this comment as the research progresses.
Let's take for example an asset transfer from Rockmine -> BridgeHub Rococo -> BridgeHubWococo -> Wockmint
Rockmine
:The user calls an extrinsic. This extrinsic reserves the assets, prepares the XCM message and sends the XCM message to BridgeHub Rococo
. We don't execute any XCM instruction here. The message will be executed as it reaches the next parachains. The message contains a ReserveAssetDeposited
instruction that will be executed on Wockmint
, which will put the assets in the holding register and a DepositAsset
instruction that will move them in the user's account on Wockmint
.
Possible failures that would involve the assets: BridgeXcmSender::validate()
and BridgeXcmSender::deliver()
are called after reserving the assets. Maybe some other small methods. Anyway, nothing related to the actual XCM message. The message will be executed as it reaches the next parachains.
Mitigations: I think we can call BridgeXcmSender::validate()
before reserving the assets, and maybe some other calls. But BridgeXcmSender::deliver()
needs to be executed after, no matter what. Anyway, we should handle the errors in the code and unlock the assets.
BridgeHubRococo
:Here the XCM message is received and we execute the ExportMessage
instruction, which should send the message over the bridge to BridgeHubWococo
, encoded as a blob.
Possible failures that would involve the assets: Here, the ReserveAssetDeposited
is not called yet, so any failure would lead to losing the assets. They won't even be caught by the Asset Trap
, because they are not in the holding register. Failures could involve the execution of ExportMessage
, or any bridge failures.
Mitigations: TODO, STILL RESEARCHING
. Here we might be able to send a QueryResponse
back to Rockmine
if the XCM fails.
BridgeHubWococo
:Here the XCM message is received as an encoded blob, the bridge logic decodes it and sends it to Wockmint
. No XCM instruction is executed.
Possible failures that would involve the assets: Here, no XCM instruction is executed, so any failure would lead to losing the assets. They won't even be caught by the Asset Trap
, because they are not in the holding register. Mainly there can be problems decoding the blob, or sending the message to Wockmint
, but I think the chances to get an error here are slim.
Mitigations: TODO, STILL RESEARCHING
Wockmint
:Here we finally execute the last part of the XCM message: ReserveAssetDeposited
+ DepositAsset
.
Possible failures that would involve the assets: Failures can be related to XCM message problems or the configuration of the pallet_xcm
here. Maybe fees withdrawal, bad origin, etc. If any error occurs after ReserveAssetDeposited
, the assets will be caught by the Asset Trap
and an Event will be stored related to that. The event will contain the exact amount of assets that were traped.
Mitigations: Rockmint's account
(or whatever the owner of the message was) will be able to execute a ClaimAsset
instruction and reclaim the assets. Also a QueryResponse
instruction could be sent back to Rockmine
. The question is if we could automate the asset recovery here.
TODO, STILL RESEARCHING
Apart from all these we have to be mindful of the XCM barrier on each runtime where XCM instructions are executed: https://github.com/paritytech/parity-bridges-common/issues/2443
As per the guidance from the XCM team:
Considering that the bridge message queue satisfies all the XCM design assumptions (tracked in paritytech/polkadot#2038), I can think of a single issue here: halting the bridge. If the bridge is halted, the in-flight messages might get lost. Thinking of solutions.
@serban300 document the conclusion
The conclusion is the one above: https://github.com/paritytech/parity-bridges-common/issues/2443#issuecomment-1692857792
Closing the issue
To recover from bridge errors.