cosmos / relayer-archive

An example of a server side IBC relayer to be used for Game of Zones and beyond
56 stars 31 forks source link

Issue with update client #27

Closed jackzampolin closed 4 years ago

jackzampolin commented 4 years ago

When trying to update an existing client the following error is output

$ relayer tx clients ibc0 ibc1 ibconeclient ibczeroclient
$ relayer tx update-client ibc0 ibc1 ibconeclient
{"height":"0","txhash":"CCB5F99DA32E4B4273F1DED9DB0330F53C14B18E3F61CE763053FF1D0FE34E81","codespace":"client","code":9,"raw_log":"invalid block header: signedHeader belongs to another chain 'ibc1' not 'ibc0': cannot update client with ID ibconeclient","gas_wanted":"200000","gas_used":"46709"}
$ relayer tx update-client ibc1 ibc0 ibczeroclient
{"height":"0","txhash":"0DE65516AE1C960792D3C4D4E24F21BB62C372FB9A47D07E16ECCC1CE23C96BA","codespace":"client","code":9,"raw_log":"invalid block header: signedHeader belongs to another chain 'ibc0' not 'ibc1': cannot update client with ID ibczeroclient","gas_wanted":"200000","gas_used":"46722"}
cwgoes commented 4 years ago

Interesting, that looks like an error that's being thrown from the verify function, cc @melekes.

I believe the relevant code is here. Possibly the client is being incorrectly created by the IBC module.

melekes commented 4 years ago

You can't switch chainID of existing lite2.Client client => if you want to change chainID, you'll need to create another instance of lite2.Client.

jackzampolin commented 4 years ago

@melekes I don't think thats whats the issue is. I've read over the code a number of times, but could be wrong.

melekes commented 4 years ago

maybe we're using the same DB for both dst and src light clients? Can you change NewLiteDB func :

func (c *Chain) NewLiteDB() (db *dbm.GoLevelDB, df func(), err error) {

    db, err = dbm.NewGoLevelDB(c.ChainID, filepath.Join(liteDir(c.HomePath), c.ChainID))
jackzampolin commented 4 years ago

@melekes it looks like we are using separate DBs:

/tmp/tmp.NBIXLsN4z0
├── config
│   └── config.yaml
├── keys
│   ├── keyring-test-ibc0
│   │   ├── cosmos1ku0t2496a74qd2jgtxgq0speq006xmr9js84f2.address
│   │   └── testkey.info
│   └── keyring-test-ibc1
│       ├── cosmos1sl5lx4ty4mree2dz3tf8mfqzy550zuf5dccp87.address
│       └── testkey.info
└── lite
    ├── ibc0.db
    │   ├── 000001.log
    │   ├── CURRENT
    │   ├── LOCK
    │   ├── LOG
    │   └── MANIFEST-000000
    └── ibc1.db
        ├── 000001.log
        ├── CURRENT
        ├── LOCK
        ├── LOG
        └── MANIFEST-000000
melekes commented 4 years ago

never mind then...

cwgoes commented 4 years ago

I don't really see how the database is relevant - this is the error being thrown by the on-chain light client when we are calling Verify, correct? Otherwise it wouldn't show up in the error "raw_log" like this (if it were some light client running locally) - I'm not sure if the relayer runs a light client yet anyways. Maybe we're somehow using the client ID as the chain ID (we shouldn't be)?

cwgoes commented 4 years ago

Yeah, OK, this code is wrong:

    switch clientType {
    case exported.Tendermint:
        clientState, consensusState, err = tendermint.CheckValidityAndUpdateState(clientState, header, ctx.ChainID())
    default:
        return sdkerrors.Wrapf(types.ErrInvalidClientType, "cannot update client with ID %s", clientID)
    }

It should definitely not be ctx.ChainID(), that's this chain's ID, we want the ID for the chain which this light client is tracking. I suppose this can be stored in the ClientState.

cc @AdityaSripal Did you fix this already in the ICS 7 follow-up PR (https://github.com/cosmos/cosmos-sdk/pull/5631)? If not, we should do so.

jackzampolin commented 4 years ago

If we store chain-id in ClientState building automation becomes way easier

cwgoes commented 4 years ago

If we store chain-id in ClientState building automation becomes way easier

Aye, I think we will, as we need it for Verify in any case.

cwgoes commented 4 years ago

This has since been fixed (thanks @AdityaSripal).

jackzampolin commented 4 years ago

Thats great news @cwgoes!

jackzampolin commented 4 years ago

Still can't update clients so reopening.

cwgoes commented 4 years ago

Still can't update clients so reopening.

Really, what is the error? This one has been fixed for me, at least running locally.

jackzampolin commented 4 years ago

https://github.com/cosmos/relayer/pull/37/files/435568d34041535de8efcc439f4249fd7bed8bce#diff-eaf613a3c38d66fbb6d4a12dd6ab568c

jackzampolin commented 4 years ago

We still don't have working clients afaik. Is this blocked on https://github.com/cosmos/relayer/issues/27 cc @melekes

avkr003 commented 4 years ago

Hi all! I was going through this issue and was trying to replicate/debug it. The bug I found is in the client (submodule) of IBC module in cosmos-sdk which is not as per the ICS algorithm.

In line 84 and 85 (https://github.com/cosmos/cosmos-sdk/blob/68f6700b37e7ef47bac136354d0c127112ae8c89/x/ibc/02-client/keeper/client.go#L84)

clientState, consensusState, err = tendermint.CheckValidityAndUpdateState( clientState, header, ctx.BlockTime() )

What ICS says is to check whether the time stamp of the header from dst chain should be less than the current time (assert(header.timestamp <= currentTimestamp()) https://github.com/cosmos/ics/tree/master/spec/ics-007-tendermint-client#validity-predicate) but what the code is checking whether this dst chain header timestamp is less than src chain latest block time (in the above code ctx.BlockTime() ).

This means update-client transaction will happen only between the time when there is new block added to src chain but not to the dst chain then the only header of dst chain will have timestamp less than src chain latest block time (tested this and it works this way).

The correct code should be, as per the ICS, instead of ctx.BlockTime() it should be time.Now() which does assert(header.timestamp <= currentTimestamp()) . I tested update-client tx with time.Now() and it works as it should.

Awaiting your review.

cwgoes commented 4 years ago

The correct code should be, as per the ICS, instead of ctx.BlockTime() it should be time.Now() which does assert(header.timestamp <= currentTimestamp()) . I tested update-client tx with time.Now() and it works as it should.

We can't use time.Now() because it is subjective, so it will cause consensus divergence. ctx.BlockTime() is the only accessible source of time. It's true that it may impose this slight delay requirement, I think it would safe to add a grace period (say five-ten seconds or so).

avkr003 commented 4 years ago

Oh yes, time.Now() will make it subjective. My bad. I agree, using ctx.BlockTime() + 5 or 10s as currentTime will be a good idea!

cwgoes commented 4 years ago

@jackzampolin I'm still not sure what I should be trying to replicate here. It's true that if the two chains are out-of-sync time-wise, you might have to wait a few seconds before submitting headers produced by one chain to the other chain (see my comment above). Is that the error you're getting here (timestamps in the future)? If so, we can try adding a grace period. If it's another error, there's probably a different issue.

jackzampolin commented 4 years ago

I think this has FINALLY been solved in the jack/debuggin branch.