Closed fewensa closed 2 years ago
Finality relay reads finalized headers from the source chain and submits them to the target chain. It does that if `source_chain.best_finalized_header.number > target_chain.best_finalized_source_header.number`. For connecting to the source/target chain it uses the `FinalitySource` and `FinalityTarget` clients respectively.
Normally, when you start a standalone finality relay (i.e. `substrate-relay relay-headers bla-bla-bla`), these 'clients' return actual finalized blocks from both chains. That means every time a finalized header appears at the source chain, it is submitted to the target chain together with its finality proof. Verifying a finality proof on-chain isn't cheap - e.g. on Kusama/Polkadot there are ~1K validators, which means we'd need to verify ~2/3 * 1K signatures on-chain.
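The trigger described above can be sketched as a tiny predicate (illustrative names under my own assumptions, not the actual `finality_relay` API):

```rust
// Minimal sketch of the condition that drives one standalone finality-relay
// iteration: submit a new finality proof whenever the source chain has
// finalized a header that the target chain has not imported yet.
// Function and parameter names are illustrative, not the real relay API.
fn needs_submission(
    best_finalized_at_source: u64,
    best_finalized_source_at_target: u64,
) -> bool {
    best_finalized_at_source > best_finalized_source_at_target
}

fn main() {
    // Target lags behind the source: the relay has work to do.
    assert!(needs_submission(287918, 267035));
    // Target has caught up: nothing to submit.
    assert!(!needs_submission(267035, 267035));
    println!("ok");
}
```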
So there's the on-demand relay, which only relays headers when another application (right now - the messages relay) needs them. E.g. if the messages relay sees a new message at source block `N`, it asks the on-demand relay to sync that header to the target chain. Once the header is synced, the message may be relayed too. So we import only the required headers, which decreases the bridge's operational cost.
Under the hood, it is implemented here. If there are no required headers, the finality source returns `0` as the best finalized header number. That means `source_chain.best_finalized_header.number > target_chain.best_finalized_source_header.number` is not satisfied, and the finality relay does nothing. Once the messages relay reports that it needs some header, `maximal_header_number` is set to that number and the finality relay will sync this header.
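As a rough illustration of that gating (hypothetical names and simplified types - the real code keeps the required number behind a `Mutex` shared between the messages relay and the finality source), something like:

```rust
use std::sync::{Arc, Mutex};

// Hypothetical sketch of the on-demand gating described above: the finality
// source pretends nothing is finalized (reports 0) until the messages relay
// has requested some header, so the sync condition never fires idly.
#[derive(Clone)]
struct RequiredHeader(Arc<Mutex<u64>>);

impl RequiredHeader {
    fn new() -> Self {
        RequiredHeader(Arc::new(Mutex::new(0)))
    }

    // Called by the messages relay when it sees a message at block `number`.
    fn require(&self, number: u64) {
        let mut required = self.0.lock().unwrap();
        if number > *required {
            *required = number;
        }
    }

    // Called by the finality source: report 0 while no header is required,
    // otherwise report at most the required number.
    fn best_finalized_for_relay(&self, actual_best: u64) -> u64 {
        let required = *self.0.lock().unwrap();
        if required == 0 { 0 } else { actual_best.min(required) }
    }
}

fn main() {
    let required = RequiredHeader::new();
    // No required headers yet: the relay sees 0 and stays idle.
    assert_eq!(required.best_finalized_for_relay(1000), 0);
    // Messages relay requests header 42: the relay now has work to do.
    required.require(42);
    assert_eq!(required.best_finalized_for_relay(1000), 42);
    println!("ok");
}
```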
In addition, what is the difference between `finality_pipeline`
This seems like an incomplete question :) Could you please rephrase?
Thanks for your explanation.
I overlooked the type of `required_header_number` - it's a `Mutex`:
https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/client-substrate/src/finality_source.rs#L34-L35
So I was confused about how to affect the finality relay, given that the type is a `Mutex`. But this does work and affects the finality relay:
https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/on_demand_headers.rs#L336
> In addition, what is the difference between `finality_pipeline`
About this question, I think I know the answer now.
https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/finality_pipeline.rs#L157
In `finality_pipeline`, the `maximal_header_number` is `None`, so this is not an on-demand relay.
Hi @svyatonik, I have some new questions about `on_demand_headers`.
Question 1:
https://github.com/paritytech/parity-bridges-common/blob/cf9abc101110b202675cce8226b2b96865195336/relays/bin-substrate/src/cli/relay_headers_and_messages.rs#L461-L462 https://github.com/paritytech/parity-bridges-common/blob/cf9abc101110b202675cce8226b2b96865195336/relays/bin-substrate/src/cli/relay_headers_and_messages.rs#L478-L479
Why do both parties here affect the headers relay at the same time? Doing so may update `required_header` at the same time:
https://github.com/paritytech/parity-bridges-common/blob/cf9abc101110b202675cce8226b2b96865195336/relays/messages/src/message_race_loop.rs#L320
Question 2:
Maybe due to network reasons, the messages relay is executed before the headers relay and updates `required_header_number`, which will then never pass this condition again (this happened on our bridge):
I tried adding a sleep at https://github.com/darwinia-network/bridger/blob/96cf8321235bf0a0ed4b69e01a6e95e9dca42d4b/task/task-pangolin-pangoro/src/service/relay.rs#L149 and it works:

```rust
std::thread::sleep(std::time::Duration::from_secs(5));
```
So I'm puzzled by this: is it designed this way, or are there other reasons?
Question 1:
The messages relay serves a lane in both directions - e.g. when you run `substrate-relay relay-headers-and-messages --lane=00000000`, it delivers messages from chain1 to chain2 and in the opposite direction. Plus, even a one-directional message lane requires a two-directional channel - that's because the source chain needs to know when a message has been dispatched => even a unidirectional relay will submit transactions to both the source and target nodes => it needs headers of the source at the target chain && headers of the target at the source chain.
Question 2:
I'll need more time and some instructions to understand what's happening with your bridge. I didn't quite get "Maybe due to network reasons, the messages relay is executed before the headers relay" - which relay are you starting: the headers relay, the messages relay, or the headers-and-messages relay?
Question 2:
I hacked the dependent code in the local registry and added a log before line 281:
```rust
log::debug!(
	target: "bridge",
	"{} *1 scan range, \
	best_finalized_source_header_at_source: {:?}, \
	best_finalized_source_header_at_target: {:?}, \
	current_headers_difference: {:?}, \
	maximal_headers_difference: {:?}, \
	required_header_number: {:?}",
	C::NAME,
	best_finalized_source_header_at_source,
	best_finalized_source_header_at_target,
	current_headers_difference,
	maximal_headers_difference,
	required_header_number,
);
```
The source chain (left) is Pangolin; the target chain (right) is Pangoro. Because I only want to test pangoro (right) -> pangolin (left), I removed the pangolin (left) -> pangoro (right) messages relay (I didn't want to be affected by irrelevant logs; I also tested with that direction of the bridge included). After doing this, when I start the bridger (start the messages relay), the log output looks like this:
```
[2021-09-20T06:47:06Z DEBUG substrate_relay_helper::on_demand_headers] Pangoro *1 scan range, best_finalized_source_header_at_source: 287918, best_finalized_source_header_at_target: 267035, current_headers_difference: 20883, maximal_headers_difference: 100, required_header_number: 0
[2021-09-20T06:47:06Z DEBUG substrate_relay_helper::on_demand_headers] Pangolin *1 scan range, best_finalized_source_header_at_source: 555021, best_finalized_source_header_at_target: 488448, current_headers_difference: 66573, maximal_headers_difference: 100, required_header_number: 555021
```
The first time:

- pangolin (source) -> pangoro (target) (`Pangoro::Develivery`) headers relay: `required_header_number` started from 0.
- pangoro (source) -> pangolin (target) (`Pangolin::ReceivingConfirmationsDelivery`) headers relay: `required_header_number` started from 555021, not 0.

So I think the messages relay started early and updated `required_header_number` by querying the source chain state; then the headers relay will never pass this condition. It will only continue to work once `best_finalized_source_header_at_target >= required_header_number`, which is difficult to reach, because it needs `finality_loop` to scan headers one by one and complete the sync.
I tested adding `std::thread::sleep(std::time::Duration::from_secs(5));` at https://github.com/darwinia-network/bridger/blob/96cf8321235bf0a0ed4b69e01a6e95e9dca42d4b/task/task-pangolin-pangoro/src/service/relay.rs#L149. With it, the pangoro (source) -> pangolin (target) (`Pangolin::ReceivingConfirmationsDelivery`) headers relay `required_header_number` starts from 0; then the messages relay starts and updates `required_header_number` to 555021 (but it waits for the next loop to actually use this value; before that, it calls `find_mandatory_header_in_range`: https://github.com/darwinia-network/parity-bridges-common/blob/7be2b478b49bc8e87ddfe034076e9e80e2f126f6/relays/lib-substrate-relay/src/on_demand_headers.rs#L390-L412).
> So I think the messages-relay is early started and updated required_header_number by query source chain state. then headers-relay will never pass this condition. only when best_finalized_source_header_at_target >= required_header_number will continue to work, and then this is difficult to do, it needs finality_loop to scan one by one and complete it synchronously.
In addition to syncing the headers that the messages relay needs, the on-demand relayer also wakes from time to time and syncs all mandatory headers. These are headers that change the GRANDPA authority set, && the bridge's GRANDPA pallet cannot proceed without such headers. That's what all the code below https://github.com/darwinia-network/parity-bridges-common/blob/7be2b478b49bc8e87ddfe034076e9e80e2f126f6/relays/lib-substrate-relay/src/on_demand_headers.rs#L180 is about.
So in your case, by the time the code reaches `mandatory_headers_scan_range`, the messages relay has already requested to sync some descendant of the mandatory header => that's why `mandatory_headers_scan_range` simply returns `None`. If you add a pause, the scan executes first && selects this header for submission. But then, after the pause, the messages relay will require a descendant of this mandatory header => it will also be synced. So the outcome is the same - both headers get synced.
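A hedged sketch of that behaviour (my own simplified model with illustrative names, not the actual relay code): the mandatory-headers scan only covers the range above both the target's best header and the already-required number, because anything at or below a required header will be synced as its ancestor anyway.

```rust
// Hypothetical model of `mandatory_headers_scan_range`: return the range of
// headers still worth scanning for mandatory (authority-set-changing)
// headers, or `None` when the required header already covers everything.
// Signature and logic are illustrative, not the real relay API.
fn mandatory_headers_scan_range(
    best_finalized_at_source: u64,
    best_finalized_at_target: u64,
    required_header_number: u64,
) -> Option<(u64, u64)> {
    // Everything up to `required_header_number` is already scheduled for
    // syncing, so start the scan above it (and above the target's best).
    let from = best_finalized_at_target.max(required_header_number) + 1;
    if from > best_finalized_at_source {
        None
    } else {
        Some((from, best_finalized_at_source))
    }
}

fn main() {
    // Numbers from the Pangolin log above: required 555021 already reaches
    // the source's best finalized header, so there is nothing extra to scan.
    assert_eq!(mandatory_headers_scan_range(555021, 488448, 555021), None);
    // With no required header, the scan covers the whole lag window.
    assert_eq!(
        mandatory_headers_scan_range(287918, 267035, 0),
        Some((267036, 287918))
    );
    println!("ok");
}
```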
I'm confused about these:

`finality_pipeline`
https://github.com/paritytech/parity-bridges-common/blob/master/relays/lib-substrate-relay/src/finality_pipeline.rs
`on_demand_headers`
https://github.com/paritytech/parity-bridges-common/blob/master/relays/lib-substrate-relay/src/on_demand_headers.rs

These run in a loop (https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/on_demand_headers.rs#L146) and finally call `finality_relay::run`:
https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/on_demand_headers.rs#L228-L242

`MessageTarget` (in the messages relay) will update `require_finalized_header`:
https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/messages_target.rs#L237-L241

But when I debugged, `finality_relay::run` was called just once: https://github.com/paritytech/parity-bridges-common/blob/6d0daa8975a57c23b6705a725b18bc4960547d42/relays/lib-substrate-relay/src/on_demand_headers.rs#L229

So why is this file needed, and why does `require_finalized_header` need to be updated? Will this affect the finality relay? In addition, what is the difference between `finality_pipeline`