svyatonik closed this issue 10 months ago
Just thoughts:

1) so we need to implement logical XCM channels over a single physical one. Does it violate the "XCM messages are guaranteed to be delivered and interpreted accurately, in order" guarantee? I mean - if we pause Kusama -> Polkadot messages, but keep delivering Kusama -> Ethereum messages, is it fine? In other words, does this "in order" guarantee apply to a logical channel, or to a physical channel;
2) ideally, we'll want to support those logical channels directly in HRMP/DMP/UMP. I.e. in addition to the `Resume` and `Suspend` signals, we'll add `Resume(<some-id>)` and `Suspend(<some-id>)` and use them when required;
3) the other option is to send XCM messages back to the sending chain. But it has its own drawbacks, so I'd prefer to avoid that.
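The per-logical-channel signals from (2) could be sketched roughly like this. This is a minimal model, not the real HRMP/UMP/DMP signal set (which today only has plain `Suspend`/`Resume`); the `ChannelId` type and the id-carrying signal variants are hypothetical:

```rust
// Hypothetical extension of the channel signals: plain Suspend/Resume
// affects the whole physical channel, while the id-carrying variants only
// pause one logical sub-channel (e.g. one bridge) multiplexed over it.
use std::collections::HashSet;

type ChannelId = u32; // hypothetical logical-channel identifier

#[derive(Debug, Clone, Copy, PartialEq)]
enum ChannelSignal {
    Suspend,                   // existing: pause the whole physical channel
    Resume,                    // existing: resume the whole physical channel
    SuspendLogical(ChannelId), // proposed: pause one logical channel only
    ResumeLogical(ChannelId),  // proposed: resume one logical channel only
}

#[derive(Default)]
struct ChannelState {
    physically_suspended: bool,
    suspended_logical: HashSet<ChannelId>,
}

impl ChannelState {
    fn on_signal(&mut self, signal: ChannelSignal) {
        match signal {
            ChannelSignal::Suspend => self.physically_suspended = true,
            ChannelSignal::Resume => self.physically_suspended = false,
            ChannelSignal::SuspendLogical(id) => {
                self.suspended_logical.insert(id);
            }
            ChannelSignal::ResumeLogical(id) => {
                self.suspended_logical.remove(&id);
            }
        }
    }

    /// May a message for the given logical channel be sent right now?
    fn can_send(&self, id: ChannelId) -> bool {
        !self.physically_suspended && !self.suspended_logical.contains(&id)
    }
}

fn main() {
    const TO_POLKADOT: ChannelId = 1;
    const TO_ETHEREUM: ChannelId = 2;

    let mut state = ChannelState::default();
    // Suspending the with-Polkadot logical channel leaves the
    // with-Ethereum one operational; ordering is still preserved
    // inside each logical channel separately.
    state.on_signal(ChannelSignal::SuspendLogical(TO_POLKADOT));
    assert!(!state.can_send(TO_POLKADOT));
    assert!(state.can_send(TO_ETHEREUM));
}
```

Note that with this model the "in order" guarantee from (1) would hold per logical channel, not across them.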
well, yes, actually with HRMP, which uses recipient/sender as `ParaId`, we don't have a possibility to differentiate, and I don't know about the coming `pallet_message_queue`, e.g. for handling `MessageDispatch` we added `BridgeInstance` (which is the `pallet_bridge_message::Pallet` instance id) here

our actual `Bridges: ExporterFor` configuration uses the bridge location as a parachain: `MultiLocation { parents: 1, X1(Parachain(1002)) }`

so it seems that we should also add the target pallet instance for sending, like: `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(53)) }`, e.g. 53 is the `pallet_bridge_messages` instance for Kusama->Polkadot bridging here

so e.g. for Ethereum we will have a different pallet: `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(123)) }`

but I don't know yet how `pallet_message_queue` works and whether it works with `MultiLocation` instead of HRMP recipient/sender as `ParaId`?
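The idea of telling bridges apart by pallet instance could be sketched like this. These are simplified stand-ins for the real `xcm` crate's `Junction`/`MultiLocation` types, and the instance ids 53/123 are just the examples from above:

```rust
// Simplified stand-ins for xcm's Junction/MultiLocation, illustrating why
// routing by (Parachain, PalletInstance) can distinguish bridges that
// routing by ParaId alone cannot.
#[derive(Debug, Clone, Copy, PartialEq)]
enum Junction {
    Parachain(u32),
    PalletInstance(u8),
}

#[derive(Debug, Clone, PartialEq)]
struct Location {
    parents: u8,
    interior: Vec<Junction>,
}

/// Pick the messages-pallet instance on the bridge hub for a bridge location.
/// With only `Parachain(1002)` both bridges map to the same ParaId and cannot
/// be told apart; the extra `PalletInstance` junction fixes that.
fn bridge_pallet_instance(bridge: &Location) -> Option<u8> {
    match bridge.interior.as_slice() {
        [Junction::Parachain(1002), Junction::PalletInstance(i)] => Some(*i),
        _ => None,
    }
}

fn main() {
    let with_polkadot = Location {
        parents: 1,
        interior: vec![Junction::Parachain(1002), Junction::PalletInstance(53)],
    };
    let with_ethereum = Location {
        parents: 1,
        interior: vec![Junction::Parachain(1002), Junction::PalletInstance(123)],
    };
    assert_eq!(bridge_pallet_instance(&with_polkadot), Some(53));
    assert_eq!(bridge_pallet_instance(&with_ethereum), Some(123));
}
```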
@bkontur The problem is not in splitting messages stream into several substreams when they reach the Source.BH. It is that in the current dynamic fees implementation, any bridge (aka lane if we omit the Eth bridge) may suspend the channel with the Source.AH (sending chain), effectively halting all other bridges that are using the same XCMP channel:
```
                                             /---> (1) Pallet for EthereumBridge
                                            /
Source.AH <--- XCMP channel ---> Source.BH -----> (2) Lane with Polkadot.AssetHub
                                            \
                                             \---> (3) Lane with Polkadot.Collectives
```
So in the picture above, e.g. if the relayer working with `Polkadot.Collectives` is not working, or if `Polkadot.Collectives` has some other troubles, eventually the "lane with Polkadot.Collectives" will suspend the "XCMP channel" with Source.AH, and messages to other bridges won't even be delivered to the Source.BH until the "lane with Polkadot.Collectives" returns to the operational mode. It doesn't matter who the recipient of those messages is - they are simply not physically delivered to the Source.BH.
I've been thinking we may live with that, because logically we may look at the channel between Source.AH and Target.AH as a single channel (with-Polkadot bridge), which splits at the end. So we can't actually do anything here. I even left the big comment here: https://github.com/paritytech/parity-bridges-common/blob/master/modules/xcm-bridge-hub-router/src/lib.rs#L134-L163
But if we talk about multiple remote consensus systems, this analogy doesn't work anymore. Why should the with-Polkadot bridge affect the with-Ethereum bridge? So we probably need to introduce some multiplexing protocol in v2 instead of physically suspending the channel between the sending chain and its sibling BH.
> But if we talk about multiple remote consensus systems, this analogy doesn't work anymore. Why the with-Polkadot bridge should affect the with-Ethereum bridge. So we probably need to introduce some multiplexing protocol in v2 instead of physically suspending the channel between sending chain and its sibling BH.
I guess that's what I was talking about, but probably in a wrong way :D
I was thinking about something like this:

- if the with-Polkadot bridge has a congestion problem (whatever lane to Polkadot), then suspend all `pallet_message_queue`s related to the Polkadot bridging pallet specified by `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(53)) }`
- if the with-Ethereum bridge has a congestion problem, then suspend all `pallet_message_queue`s related to the Ethereum bridging pallet specified by `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(123)) }`

but this assumes that several independent queues exist between AssetHub and BridgeHub:

- `pallet_message_queue` for BridgeHub + with-Polkadot bridge
- `pallet_message_queue` for BridgeHub + with-Ethereum bridge

instead of one XCMP channel like now

maybe I have the wrong expectation that the new `pallet_message_queue` will replace the existing XCMP channel (HRMP channel between two paraIds), e.g. by opening dedicated channels:

- `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(53)) }`
- `MultiLocation { parents: 1, X2(Parachain(1002), PalletInstance(123)) }`

I really need to check the `pallet_message_queue` PR
Yes, the single HRMP channel is a problem
Ok, I'm proposing the following solutions.

For v2:

1) keep the backpressure mechanism for the Target.BH -> Target.AH queue. So we still reject the delivery transaction if Target.BH -> Target.AH is overloaded;
2) when we see many messages in the Source.BH -> Target.BH queue, we send an XCM message to the Source.AH instead of physically suspending the channel Source.AH -> Source.BH;
3) the XCM would be `Transact { call: XcmBridgeHubRouter::report_congestion(destination_location, is_congested: true) }`. This call will be part of the `pallet-xcm-bridge-hub-router` at Source.AH and it will only be callable by the bridge hub;
4) we change the `pallet-xcm-bridge-hub-router` to maintain multiple fee factors, one for every route. So e.g. Source.AH -> Target.AH and Source.AH -> Target.Collectives will have different fees for the same message;
5) when the Source.BH -> Target.BH queue becomes uncongested, we send a similar message to the Source.AH: `Transact { call: XcmBridgeHubRouter::report_congestion(destination_location, is_congested: false) }`;
6) if we have a congestion report (via `report_congestion`), then instead of sending messages over the HRMP channel immediately, we keep them in the local storage of the `pallet-xcm-bridge-hub-router`. Once we receive the `report_congestion(is_congested: false)` call, we send all messages from the local storage to the Source.BH, i.e. return to the normal operational mode;
7) if we keep receiving messages from Source.AH at Source.BH after we have called `report_congestion`, we close the bridge.

For v1, let's do 1-5. We can trust that the asset hub respects our rules.

Also - can we (easily) enable free execution of just some single call (`report_congestion`) at the asset hub? I don't want to add more accounts that need to be topped up (in this case: the sovereign account of the bridge hub at the asset hub). For v1 at least it should be fine, because they are system parachains. For v2 we'll probably charge the SA of the sending chain at the BH when sending this message (or else: close the bridge).
Overall the strategy sounds very good. Just 1 question:
> 7) if we keep receiving messages from Source.AH at Source.BH after we have called `report_congestion`, we are closing the bridge.

I don't think the `Transact { call: XcmBridgeHubRouter::report_congestion(destination_location, is_congested: true) }` will reach the Source.AH immediately. It takes at least 1 more block for the HRMP message to be executed by the Source. If there is some congestion or other problems, maybe it takes more. Can we know when the Source has received the `Transact`?
yes, that's exactly what I wanted to say - we just need the SA of BH on AH for `Transact { call: XcmBridgeHubRouter::report_congestion(`, but yes, we could also allow `UnpaidExecution` for v1 maybe

I think 1-5 look ok, and yes, I had prepared the same question for 7.

> Can we know when the Source has received the `Transact`?

XCM is fire-and-forget, but we could use `ReportStatus` or some XCM query feature
> yes, exactly that I wanted to say, that we just need SA of BH on AH for `Transact { call: XcmBridgeHubRouter::report_congestion(`, but yes we could also allow `UnpaidExecution` for v1 maybe
@bkontur Ok, cool, thank you. Do you know if it is possible/easy to configure XCM in a way that only such an XCM program will be executed for free? Or do we need some new barrier for that?
> I don't think the `Transact { call: XcmBridgeHubRouter::report_congestion(destination_location, is_congested: true) }` will reach the Source.AH immediately. It takes at least 1 more block for the hrmp message to be executed by the Source. If there is some congestion or other problems, maybe it takes more. Can we know when the Source has received the `Transact`?
>
> and yes, I prepared also the same question for 7.
@bkontur @serban300 I was thinking of some window before closing the bridge. We'll think about the details later
> @bkontur Ok, cool, thank you. Do you know if it is possible/easy to configure XCM in a way that only such XCM program will be executed for free? Or we need some new barrier for that?
afaik, we need a new barrier. I can do that - on my way to asset transfer I created at least two hacky Barriers :D, so no problem :D :D Or we can allow `UnpaidExecution` for all BridgeHub XCM calls, but a temporary hacky barrier seems ok for v1
> afaik, we need new barrier, I can do that, on my way to asset transfer I created at least two hacky Barriers :D, so no problem :D :D or we can allow UnpaidExecution for all BridgeHub xcm calls, but temporray hacky barrier seem ok for v1
No, it's fine - I'll make it myself. I thought that maybe there's already some, and you're our XCM magician who knows about it
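The kind of single-call barrier being discussed could look roughly like this. This is a simplified model, not the real `xcm-executor` `ShouldExecute` trait; the `Origin`, `Instruction` and `Call` enums are stand-ins for the actual XCM types, and in a real runtime the check would inspect the decoded `Transact` call bytes and the message origin:

```rust
// Model of a barrier that lets exactly one call (report_congestion, sent by
// the sibling bridge hub) execute unpaid, while everything else still pays.
#[derive(Debug, Clone, PartialEq)]
enum Origin {
    SiblingBridgeHub,
    Other,
}

#[derive(Debug, Clone, PartialEq)]
enum Instruction {
    UnpaidExecution,
    Transact { call: Call },
    // ... other instructions elided
}

#[derive(Debug, Clone, PartialEq)]
enum Call {
    ReportCongestion { is_congested: bool },
    Other,
}

/// Returns true if the message may run without paying for execution:
/// it must come from the sibling bridge hub, request unpaid execution,
/// and contain nothing but the single report_congestion transact.
fn allow_free_report_congestion(origin: &Origin, message: &[Instruction]) -> bool {
    if *origin != Origin::SiblingBridgeHub {
        return false;
    }
    matches!(
        message,
        [
            Instruction::UnpaidExecution,
            Instruction::Transact { call: Call::ReportCongestion { .. } }
        ]
    )
}

fn main() {
    let ok = vec![
        Instruction::UnpaidExecution,
        Instruction::Transact { call: Call::ReportCongestion { is_congested: true } },
    ];
    assert!(allow_free_report_congestion(&Origin::SiblingBridgeHub, &ok));
    // The same message from any other origin is rejected...
    assert!(!allow_free_report_congestion(&Origin::Other, &ok));
    // ...and any other call is rejected even from the bridge hub.
    let bad = vec![Instruction::UnpaidExecution, Instruction::Transact { call: Call::Other }];
    assert!(!allow_free_report_congestion(&Origin::SiblingBridgeHub, &bad));
}
```

Matching on the exact instruction shape (rather than just the origin) is what keeps this narrower than allowing `UnpaidExecution` for all BridgeHub calls.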
For v1 it is resolved by https://github.com/paritytech/parity-bridges-common/pull/2318. For v2 we'll have https://github.com/paritytech/parity-bridges-common/issues/2391. So I'm closing the issue.
paritytech/parity-bridges-common#2312 has backported the `pallet-xcm-bridge-hub-router` 1:1 from the v1 version (#2294). But @bkontur has raised a valid concern here: https://github.com/paritytech/parity-bridges-common/pull/2313#discussion_r1281424022. E.g. if the same chain is bridging with two different consensus systems, is it fine to block both bridges if at least one lane in one of the bridges is blocked? With a single remote consensus, it is maybe fine, because logically they're using a single channel across the whole pipeline - it is a bridge-specific detail that messages are delivered over multiple lanes. But if we are bridging with another consensus system, then both bridges will stall until we resolve a single-lane issue. That doesn't look good to me. OTOH for the sending chain, the channel for all outbound messages is the same - it is the channel to the sibling/child bridge hub. So maybe it is fine?

This needs some more thinking, so I'm filing a separate issue.