cosmos / relayer

An IBC relayer for ibc-go
Apache License 2.0
390 stars 1.71k forks source link

Make Batch delay configurable #1374

Closed ertemann closed 8 months ago

ertemann commented 11 months ago

Can we make the hardcoded 30s value RLY waits before processing the batch of TXs a configurable value per chain?

https://github.com/cosmos/relayer/blob/820caf5ab483e92dd3c6b5a7efe2bdd06dc835e3/relayer/relayMsgs.go#L31

ertemann commented 9 months ago

@jtieri i did some more digging and this indeed seems to be important to reduce the number of "uneffected packets" that RLY emits.

RLY basically double commits the relaying of many packets by pushing both valid and already relayed packets in the batch. This costs gas but doesnt actually relay anything.

You can check Strangelove and Lavender.Five performance here to see the impact: https://insights.informal.systems/dydx/noble

The fact that RLY forces the packets through in the batch and doesnt fail if one is already committed is a good feature, it not filtering whatsoever however seems like an issue for total gas consumption.

jtieri commented 9 months ago

thanks for opening this issue @ertemann, i can definitely make this value configurable on a per-chain basis as you requested.

if i'm not mistaken this value is only used in the legacy relaying processor, is there a reason your team is not using the events processor?

dylanschultzie commented 9 months ago

@jtieri can you expand on that please?

jtieri commented 9 months ago

@jtieri can you expand on that please?

Of course! The relayer supports two processors which are essentially algorithms or strategies for relaying packets. The older strategy was query intensive and not completely event based, the newer strategy (which we call the events strategy) is event based and stateful so that we can manage packet lifecycles and reduce the total round trips for the involved queries.

The variable in question here, batchSendMessageTimeout, seems to only be used in the legacy processor where it is referenced when sending msgs related to acks or recv packet msgs.

The variable is used to create a context with timeout in send. https://github.com/cosmos/relayer/blob/4c416504efb360c3e47606b69f76ba5b2f85c9f8/relayer/relayMsgs.go#L214 https://github.com/cosmos/relayer/blob/4c416504efb360c3e47606b69f76ba5b2f85c9f8/relayer/relayMsgs.go#L236

Which then gets called in Send. https://github.com/cosmos/relayer/blob/4c416504efb360c3e47606b69f76ba5b2f85c9f8/relayer/relayMsgs.go#L153

Finally Send is called in these two functions. https://github.com/cosmos/relayer/blob/4c416504efb360c3e47606b69f76ba5b2f85c9f8/relayer/naive-strategy.go#L385 https://github.com/cosmos/relayer/blob/4c416504efb360c3e47606b69f76ba5b2f85c9f8/relayer/naive-strategy.go#L473

Since v2.0.0 the events processor is the default relaying strategy in the relayer, so the code path that involves this batchSendMessageTimeout variable is only reached if you explicitly use --processor legacy in rly tx start or rly tx link-then-start.

The legacy processor will likely be removed in a future v3.0.0 release since it is inefficient and is no longer utilized internally unless a user explicitly starts the relayer in the legacy mode.

jtieri commented 8 months ago

Since we have determined this value is only used in the legacy processor and we plan to remove this code path in the near future, I'm going to close this issue. Feel free to re-open if you think this was a mistake.

ertemann commented 8 months ago

I think you are right, something else is the reason for the uneffected packet part.