informalsystems / hermes

IBC Relayer in Rust
https://hermes.informal.systems
Apache License 2.0
442 stars 326 forks source link

Duplicate transactions submitted #1663

Closed adizere closed 2 years ago

adizere commented 2 years ago

Problem summary

This may be a bug in the Hermes logic for process_pending.

Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.331508Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] generate messages from batch with 1 events
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.333812Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] scheduling op. data with 1 msg(s) for Destination (height 1-2314603)
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.334605Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] relay op. data of 1 msgs(s) to Destination (height 1-2314603), delayed by: 787.1
67µs [try 1/5]
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.334611Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] prepending Destination client update @ height 1-2314603
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.366291Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] assembled batch of 2 message(s)
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.378393Z  INFO ThreadId(40805) [Async~>comdex-1] response(s): 1; Ok:A5129724DBE2BCFD87BF02CFB1ED01BE21D1BD67B73DD4E640881C34D50AC14E
Dec  9 07:34:25 Beast hermes[23669]: 2021-12-09T07:34:25.378412Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] success
Dec  9 07:36:05 Beast hermes[23669]: 2021-12-09T07:36:05.405522Z ERROR ThreadId(40805) [comdex-1:transfer/channel-1 -> osmosis-1] timed out while confirming TxHashes: count=1; A5129724DBE2BCFD87BF02CFB1ED01BE21D
1BD67B73DD4E640881C34D50AC14E
Dec  9 07:36:05 Beast hermes[23669]: 2021-12-09T07:36:05.407613Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] relay op. data of 1 msgs(s) to Destination (height 1-2314603), delayed by: 100.0
73794587s [try 1/5]
Dec  9 07:36:05 Beast hermes[23669]: 2021-12-09T07:36:05.407624Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] prepending Destination client update @ height 1-2314603
Dec  9 07:36:05 Beast hermes[23669]: 2021-12-09T07:36:05.408626Z  WARN ThreadId(40805) [osmosis-1 -> comdex-1:07-tendermint-1] resolving trusted height from the full list of consensus state heights for target he
ight 1-2314603; this may take a while
Dec  9 07:36:06 Beast hermes[23669]: 2021-12-09T07:36:06.033219Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] assembled batch of 2 message(s)
Dec  9 07:36:06 Beast hermes[23669]: 2021-12-09T07:36:06.041192Z  INFO ThreadId(40805) [Async~>comdex-1] response(s): 1; Ok:5315E04F87880471CDFE49FF1071431D48CCC3232BEF4648F2AEE655BE62BFE0
Dec  9 07:36:06 Beast hermes[23669]: 2021-12-09T07:36:06.041225Z  INFO ThreadId(40805) [osmosis-1:transfer/channel-87 -> comdex-1] success

These are logs reported in this comment below. The following happens:

  1. Hermes submits a tx to chain comdex-1, hash A5129724DBE2BCFD87BF02CFB1ED01BE21D1BD67B73DD4E640881C34D50AC14E
    • timestamp = 07:34:25.378393Z
    • but the tx was in fact included in a block
  2. Hermes fails to find confirmations for this transaction hash
    • ts = 07:36:05.405522Z -- which is ~100 seconds later,
  3. Then it resubmits the same transaction to comdex-1, this time with hash 5315E04F87880471CDFE49FF1071431D48CCC3232BEF4648F2AEE655BE62BFE0
    • ts = 07:36:06.041192Z

Acceptance criteria

The problem in the present issue is related to #1792. If that issue is fixed, that will help with fixing this issue, but we should nevertheless increase the timeout.

Original issue description

Two separate relayer operators reported that Hermes submits duplicate transactions.

two same txn in one block: https://www.mintscan.io/iris/txs/F388151AF2817A8B3F6948455D80E914B7D0459C4D28960D36694378C910D047 https://www.mintscan.io/iris/txs/1E8C05E7D77A864018AFC6154C7C07DBDA003F723526A2B9F6A70C831A7AE003

it's normal situation?

Originally posted by @dualsystems in https://github.com/informalsystems/ibc-rs/issues/1608#issuecomment-987389966


From a Telegram group:

Any idea why we are relaying same transaction twice? https://www.mintscan.io/comdex/txs/A5129724DBE2BCFD87BF02CFB1ED01BE21D1BD67B73DD4E640881C34D50AC14E https://www.mintscan.io/comdex/txs/5315E04F87880471CDFE49FF1071431D48CCC3232BEF4648F2AEE655BE62BFE0

It's unclear how to replicate this, so some Hermes logs would be useful in debugging. If any relayer operator is open to sharing their logs with us, we would be grateful!

Galadrin commented 2 years ago

Here is the logs around the event on Cros-nest's relayer. We are running Hermes with INFO log level

hermes.log

mircea-c commented 2 years ago

This is not a bug. It's the same as JUNO chain. One of the transactions is a "no-op". In mintscan it shows both as "success", but only one of them actually executed.

dualsystems commented 2 years ago

This is not a bug. It's the same as JUNO chain. One of the transactions is a "no-op". In mintscan it shows both as "success", but only one of them actually executed.

One instance of hermes send two identical txn in one block. Signer is same. In some part it is a bug tendermint in CheckTX. But another side why hermes send two txn? I'll make a log and attach to the topic.