Closed Fraccaman closed 3 years ago
Something is likely going wrong with the commands that configure the light clients (i.e. the `light add` commands).
I see that the script adds secondary peers for each node using the same network address; is that intended?
Could you remove the `&>/dev/null` redirect at the end of the commands in `ibc.sh` and report back with the output?
Yes, you are right, that network address is wrong. So this is the output (I also fixed a typo):
Building the Rust relayer...
Removing light client peers from configuration...
Adding primary peers to light client configuration...
Finished dev [unoptimized + debuginfo] target(s) in 0.18s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml light add 'localhost:26657' -c stargate -f -p -s /home/ec2-user/node-stargate/data -y`
Success Added light client:
chain id: stargate
address: tcp://localhost:26657
peer id: 648A550A0545AF774223E556117BBDE3156A3520
height: 5229734
hash: CF9104D58D3FE7A35E062C02C09E52EA062C5A8F212DF5DDA358EE9A52450F84
primary: true
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml light add 'localhost:26357' -c heliax -f -p -s /home/ec2-user/node-heliax/data -y`
Success Added light client:
chain id: heliax
address: tcp://localhost:26357
peer id: 97DAF05D3D5CCE2DF5AC3323784C2C01A1B7D5CA
height: 6904
hash: 81995B52593A2F3D1629A31A20CAF034F9AB64A6235D34457C4F36B065FFD439
primary: true
Adding secondary peers to light client configuration...
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml light add 'localhost:26657' -c stargate -s /home/ec2-user/node-stargate/data -y --peer-id 2427F8D914A6862279B3326FA64F76E3BC06DB2E`
Success Added light client:
chain id: stargate
address: tcp://localhost:26657
peer id: 2427F8D914A6862279B3326FA64F76E3BC06DB2E
height: 5229737
hash: 6D874A3D9167B8C955D9AF512C20D7B7069260B831B5F1B338FC77F430AC317E
primary: false
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml light add 'localhost:26357' -c heliax -s /home/ec2-user/node-heliax/data -y --peer-id A885BB3D3DFF6101188B462466AE926E7A6CD51E`
Success Added light client:
chain id: heliax
address: tcp://localhost:26357
peer id: A885BB3D3DFF6101188B462466AE926E7A6CD51E
height: 6905
hash: 6AED5CE877589AE582C5DF9881F008A7CF371E1DCB0F7F03DD1B81FF5A4E71ED
primary: false
Importing keys...
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml keys add stargate /home/ec2-user/node-stargate/key_seed.json`
{"status":"success","result":"Added key node_key (cosmos1ztu56h7kpuguf9y39lhxgayhngmysnxsgl8f9u) on stargate chain"}
Finished dev [unoptimized + debuginfo] target(s) in 0.16s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml keys add heliax /home/ec2-user/node-heliax/key_seed.json`
{"status":"success","result":"Added key node_key (cosmos196hkxg7c53h6u75umdrhwum6kp8xyxzw2kvu7r) on heliax chain"}
Done!
Now, the error changed and is the following:
{"status":"error","result":"chain runtime/handle error: Light client instance error for rpc address tcp://localhost:26657: invalid light block: invalid validator set: header_validators_hash=862A9C43A9A29FC6D508352B056A738DB35B3F96F0FA02F0DA2FC1ED8035A55C validators_hash=0198C4156F82C8E0B11C23A24F43FEDE7D92D9146E64FD5D37C5ED3360F53AA9"}
Can you post your full `~/.relayer/config.toml` file after running the script?
If you mean the `~/.hermes/config.toml`, here it is:
[global]
timeout = '10s'
strategy = 'naive'
log_level = 'error'
[[chains]]
id = 'stargate'
rpc_addr = 'tcp://localhost:26657'
grpc_addr = 'tcp://localhost:9090'
account_prefix = 'cosmos'
key_name = 'node_key'
store_prefix = 'stargate'
gas = 3000000
clock_drift = '5s'
trusting_period = '14days'
[chains.trust_threshold]
numerator = '1'
denominator = '3'
[chains.peers]
primary = '648A550A0545AF774223E556117BBDE3156A3520'
[[chains.peers.light_clients]]
peer_id = '648A550A0545AF774223E556117BBDE3156A3520'
address = 'tcp://localhost:26657'
timeout = '10s'
trusted_header_hash = 'CF9104D58D3FE7A35E062C02C09E52EA062C5A8F212DF5DDA358EE9A52450F84'
trusted_height = '5229734'
[chains.peers.light_clients.store]
type = 'disk'
path = '/home/ec2-user/node-stargate/data/648A550A0545AF774223E556117BBDE3156A3520'
[[chains.peers.light_clients]]
peer_id = '2427F8D914A6862279B3326FA64F76E3BC06DB2E'
address = 'tcp://localhost:26657'
timeout = '10s'
trusted_header_hash = '6D874A3D9167B8C955D9AF512C20D7B7069260B831B5F1B338FC77F430AC317E'
trusted_height = '5229737'
[chains.peers.light_clients.store]
type = 'disk'
path = '/home/ec2-user/node-stargate/data/2427F8D914A6862279B3326FA64F76E3BC06DB2E'
[[chains]]
id = 'heliax'
rpc_addr = 'tcp://localhost:26357'
grpc_addr = 'tcp://localhost:9091'
account_prefix = 'cosmos'
key_name = 'node_key'
store_prefix = 'heliax'
gas = 3000000
clock_drift = '5s'
trusting_period = '14days'
[chains.trust_threshold]
numerator = '1'
denominator = '3'
[chains.peers]
primary = '97DAF05D3D5CCE2DF5AC3323784C2C01A1B7D5CA'
[[chains.peers.light_clients]]
peer_id = '97DAF05D3D5CCE2DF5AC3323784C2C01A1B7D5CA'
address = 'tcp://localhost:26357'
timeout = '10s'
trusted_header_hash = '81995B52593A2F3D1629A31A20CAF034F9AB64A6235D34457C4F36B065FFD439'
trusted_height = '6904'
[chains.peers.light_clients.store]
type = 'disk'
path = '/home/ec2-user/node-heliax/data/97DAF05D3D5CCE2DF5AC3323784C2C01A1B7D5CA'
[[chains.peers.light_clients]]
peer_id = 'A885BB3D3DFF6101188B462466AE926E7A6CD51E'
address = 'tcp://localhost:26357'
timeout = '10s'
trusted_header_hash = '6AED5CE877589AE582C5DF9881F008A7CF371E1DCB0F7F03DD1B81FF5A4E71ED'
trusted_height = '6905'
[chains.peers.light_clients.store]
type = 'disk'
path = '/home/ec2-user/node-heliax/data/A885BB3D3DFF6101188B462466AE926E7A6CD51E'
Yes, it's what I meant, sorry about that! Your config looks good to me.
The light client throws this error when verifying the initial trusted light block: it gets a mismatch between the validator set hash stored in the header and the hash of the validator set it computes for that height.
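For illustration, the failing check boils down to something like the following (a simplified Python sketch; the real light client computes a Merkle root over protobuf-encoded validators rather than a flat SHA-256, and all names here are illustrative):

```python
import hashlib

def validators_hash(validators):
    # Simplified stand-in for Tendermint's validator set hash: the real
    # implementation Merkle-hashes the protobuf encoding of each validator.
    digest = hashlib.sha256()
    for address, voting_power in validators:
        digest.update(address.encode())
        digest.update(str(voting_power).encode())
    return digest.hexdigest().upper()

def check_validator_set(header_validators_hash, fetched_validators):
    # The light client recomputes the hash of the fetched validator set and
    # compares it against the hash committed to in the header.
    computed = validators_hash(fetched_validators)
    if computed != header_validators_hash:
        raise ValueError(
            f"invalid validator set: "
            f"header_validators_hash={header_validators_hash} "
            f"validators_hash={computed}"
        )
```

A mismatch like the one in the error above means the node served a validator set whose hash does not match what the header committed to.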
I am not sure what could cause that. Maybe a mismatch in the Tendermint version that the nodes are running vs the version supported by tendermint-rs? Can you tell me what version of Tendermint the nodes are running?
No worries @romac! Can you tell me how I can check that?
I can give you the output of `gaiad version --long` (hope this is enough):
name: gaia
server_name: gaiad
version: 4.0.0
commit: a279d091c6f66f8a91c87943139ebaecdd84f689
build_tags: netgo,ledger
go: go version go1.15.8 linux/amd64
build_deps:
- github.com/99designs/keyring@v1.1.6
- github.com/ChainSafe/go-schnorrkel@v0.0.0-20200405005733-88cbf1b4c40d
- github.com/Workiva/go-datastructures@v1.0.52
- github.com/aristanetworks/goarista@v0.0.0-20170210015632-ea17b1a17847
- github.com/armon/go-metrics@v0.3.6
- github.com/beorn7/perks@v1.0.1
- github.com/bgentry/speakeasy@v0.1.0
- github.com/btcsuite/btcd@v0.21.0-beta
- github.com/btcsuite/btcutil@v1.0.2
- github.com/cespare/xxhash/v2@v2.1.1
- github.com/confio/ics23/go@v0.6.3
- github.com/cosmos/cosmos-sdk@v0.41.0
- github.com/cosmos/go-bip39@v1.0.0
- github.com/cosmos/iavl@v0.15.3
- github.com/cosmos/ledger-cosmos-go@v0.11.1
- github.com/cosmos/ledger-go@v0.9.2
- github.com/davecgh/go-spew@v1.1.1
- github.com/dvsekhvalnov/jose2go@v0.0.0-20200901110807-248326c1351b
- github.com/enigmampc/btcutil@v1.0.3-0.20200723161021-e2fb6adb2a25
- github.com/ethereum/go-ethereum@v1.9.25
- github.com/felixge/httpsnoop@v1.0.1
- github.com/fsnotify/fsnotify@v1.4.9
- github.com/go-kit/kit@v0.10.0
- github.com/go-logfmt/logfmt@v0.5.0
- github.com/godbus/dbus@v0.0.0-20190726142602-4481cbc300e2
- github.com/gogo/gateway@v1.1.0
- github.com/gogo/protobuf@v1.3.3 => github.com/regen-network/protobuf@v1.3.3-alpha.regen.1
- github.com/golang/protobuf@v1.4.3
- github.com/golang/snappy@v0.0.3-0.20201103224600-674baa8c7fc3
- github.com/google/btree@v1.0.0
- github.com/gorilla/handlers@v1.5.1
- github.com/gorilla/mux@v1.8.0
- github.com/gorilla/websocket@v1.4.2
- github.com/grpc-ecosystem/go-grpc-middleware@v1.2.2
- github.com/grpc-ecosystem/grpc-gateway@v1.16.0
- github.com/gsterjov/go-libsecret@v0.0.0-20161001094733-a6f4afe4910c
- github.com/gtank/merlin@v0.1.1
- github.com/gtank/ristretto255@v0.1.2
- github.com/hashicorp/go-immutable-radix@v1.0.0
- github.com/hashicorp/golang-lru@v0.5.4
- github.com/hashicorp/hcl@v1.0.0
- github.com/libp2p/go-buffer-pool@v0.0.2
- github.com/magiconair/properties@v1.8.4
- github.com/mattn/go-isatty@v0.0.12
- github.com/matttproud/golang_protobuf_extensions@v1.0.1
- github.com/mimoo/StrobeGo@v0.0.0-20181016162300-f8f6d4d2b643
- github.com/minio/highwayhash@v1.0.1
- github.com/mitchellh/go-homedir@v1.1.0
- github.com/mitchellh/mapstructure@v1.1.2
- github.com/mtibben/percent@v0.2.1
- github.com/pelletier/go-toml@v1.8.0
- github.com/pkg/errors@v0.9.1
- github.com/pmezard/go-difflib@v1.0.0
- github.com/prometheus/client_golang@v1.8.0
- github.com/prometheus/client_model@v0.2.0
- github.com/prometheus/common@v0.15.0
- github.com/prometheus/procfs@v0.2.0
- github.com/rakyll/statik@v0.1.7
- github.com/rcrowley/go-metrics@v0.0.0-20200313005456-10cdbea86bc0
- github.com/regen-network/cosmos-proto@v0.3.1
- github.com/rs/cors@v1.7.0
- github.com/rs/zerolog@v1.20.0
- github.com/spf13/afero@v1.3.4
- github.com/spf13/cast@v1.3.1
- github.com/spf13/cobra@v1.1.1
- github.com/spf13/jwalterweatherman@v1.1.0
- github.com/spf13/pflag@v1.0.5
- github.com/spf13/viper@v1.7.1
- github.com/stretchr/testify@v1.7.0
- github.com/subosito/gotenv@v1.2.0
- github.com/syndtr/goleveldb@v1.0.1-0.20200815110645-5c35d600f0ca
- github.com/tendermint/btcd@v0.1.1
- github.com/tendermint/crypto@v0.0.0-20191022145703-50d29ede1e15
- github.com/tendermint/go-amino@v0.16.0
- github.com/tendermint/tendermint@v0.34.3
- github.com/tendermint/tm-db@v0.6.3
- github.com/zondax/hid@v0.9.0
- golang.org/x/crypto@v0.0.0-20201221181555-eec23a3978ad
- golang.org/x/net@v0.0.0-20201021035429-f5854403a974
- golang.org/x/sys@v0.0.0-20201015000850-e3ed0017c211
- golang.org/x/term@v0.0.0-20201117132131-f5c789dd3221
- golang.org/x/text@v0.3.3
- google.golang.org/genproto@v0.0.0-20210114201628-6edceaf6022f
- google.golang.org/grpc@v1.35.0
- google.golang.org/protobuf@v1.25.0
- gopkg.in/ini.v1@v1.51.0
- gopkg.in/yaml.v2@v2.4.0
- gopkg.in/yaml.v3@v3.0.0-20200313102051-9f266ea9e77c
The Tendermint version looks good. We are going to try to reproduce the issue on our side and get back to you. /cc @andynog
Hi @Fraccaman, just following up on this. For the stargate chain, is this a local chain you're running, or are you testing against a testnet?
One node is running stargate mainnet, the other a local chain.
When you say your node is running on Stargate mainnet, are you referring to cosmoshub-4? Can you please send what you get if you run this API query: https://localhost:26657/status
Just want to ensure we're talking about the same thing :-)
Sorry for the late response, here is the result:
{
"jsonrpc": "2.0",
"id": -1,
"result": {
"node_info": {
"protocol_version": {
"p2p": "8",
"block": "11",
"app": "0"
},
"id": "3ed8666e8e7fe0ae4dac31014841c1828d240cc9",
"listen_addr": "tcp://0.0.0.0:26656",
"network": "cosmoshub-4",
"version": "",
"channels": "40202122233038606100",
"moniker": "heliax-1",
"other": {
"tx_index": "on",
"rpc_address": "tcp://127.0.0.1:26657"
}
},
"sync_info": {
"latest_block_hash": "E1B2EE22FF1B025DA902FA6B732D09BBCF7DBF8B2E406A7D65A26BED8D499CAF",
"latest_app_hash": "A99F8C091DD628CA595098368BC57EC69E42EFC1CAAC9D66FA26929756430DD1",
"latest_block_height": "5201056",
"latest_block_time": "2021-02-18T13:08:36.985091499Z",
"earliest_block_hash": "1455A0C15AC49BB506992EC85A3CD4D32367E53A087689815E01A524231C3ADF",
"earliest_app_hash": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
"earliest_block_height": "5200791",
"earliest_block_time": "2019-12-11T16:11:34Z",
"catching_up": true
},
"validator_info": {
"address": "84B23508421F2B0C20F1C09308D25F0F6E8AEB41",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "Abh8gRTD4ZEu5ytdlb+OSkrX7DiRt8vtiPvCLUuQjOY="
},
"voting_power": "0"
}
}
}
Hi @Fraccaman, we believe this bug is fixed now. We did encounter a bug and fixed it in master: we were able to reproduce it last week and made modifications in https://github.com/informalsystems/tendermint-rs/issues/831
Please let us know if this is fixed now so we can close the ticket. Thanks!
I'll try again ASAP and report back! Thanks! :)
Thanks @Fraccaman please keep us posted. :+1:
So, I tried again (same setup). Now I get the following error:
{"status":"error","result":"chain runtime/handle error: Light client instance error for rpc address tcp://localhost:26657: invalid light block: not withing trusting period: expires_at=2021-03-04T15:47:08.9667102Z now=2021-04-07T15:16:02.429079517Z"}
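The check behind that "not within trusting period" message boils down to comparing the current time against `trusted_header_time + trusting_period`; a minimal sketch (function and parameter names are hypothetical, with Python standing in for the Rust code):

```python
from datetime import datetime, timedelta, timezone

def within_trusting_period(trusted_header_time, trusting_period, now):
    # A trusted header is only usable while now < header_time + trusting_period
    # (with trusting_period = '14days' in the config above); once expires_at
    # has passed, the light client refuses to use it.
    expires_at = trusted_header_time + trusting_period
    return now < expires_at
```

In the error above, `now` (April 7) is well past `expires_at` (March 4), so the stale trusted header has to be replaced with a recent one.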
Hi @Fraccaman, you might need to update your light client (primary and witness) if you haven't done so. It's the same `light add` command, so run it again; this will update the trusted header and height:
hermes light add tcp://localhost:26657 -c stargate ...
I started from a new machine, so I don't think I needed to update the trusted headers. Anyway, I tried rerunning hermes again (you can see the scripts here), and it's kind of strange: sometimes I get the same error about the trusting period, but sometimes I get
{"status":"error","result":"chain runtime/handle error: Light client instance error for rpc address tcp://localhost:26657: I/O error: fetched validator set is invalid: proposer with address 'C2356622B495725961B5B201A382DD57CD3305EC' not found in validator set"}
That's strange. Is this setup a clean one?
Yes, clean setup, same set of scripts.
OK, I might have to try to reproduce that error again. But there are a few changes related to the light client configuration in the next release (which should be coming out soon), so I'd rather test when the new release is out.
I started from a new machine, so I don't think I needed to update the trusted headers. Anyway, I tried rerunning hermes again (you can see the scripts here), and it's kind of strange: sometimes I get the same error about the trusting period, but sometimes I get
{"status":"error","result":"chain runtime/handle error: Light client instance error for rpc address tcp://localhost:26657: I/O error: fetched validator set is invalid: proposer with address 'C2356622B495725961B5B201A382DD57CD3305EC' not found in validator set"}
@romac any ideas on what might cause this error?
This error happens when the light client fetches the header and the validator set at height H from the chain, and the latter does not contain a validator whose address matches the `proposer_address` of the fetched header. It is not clear to me in which circumstances this can happen. As far as I understand, the validator that proposed a block at height H should always be present in the validator set at height H, or at least that's what the code currently enforces.
Fetching the header and validator set:
Building the validator set:
Ensuring there is a validator in the set with a matching address:
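The last step above reduces to a lookup of the header's `proposer_address` in the fetched set; a minimal sketch (Python standing in for the Rust code, and the dict-based validator records are illustrative):

```python
def find_proposer(validators, proposer_address):
    # The light client expects the header's proposer_address to appear in
    # the validator set fetched for that same height; otherwise the fetched
    # set is rejected as invalid.
    for validator in validators:
        if validator["address"] == proposer_address:
            return validator
    raise ValueError(
        f"fetched validator set is invalid: proposer with address "
        f"'{proposer_address}' not found in validator set"
    )
```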
What version of hermes? I think this may come from the pagination issue in the Tendermint RPC (where we were getting an incomplete validator set) that was fixed and picked up in hermes `v0.2.0`.
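To illustrate the pagination issue described above: if the `/validators` RPC endpoint returns results in pages and only the first page is read, the resulting set is incomplete (and the proposer may be missing from it). A page-exhausting fetch might look like this — a sketch against a hypothetical RPC client, not the actual tendermint-rs API:

```python
def fetch_all_validators(rpc_client, height, per_page=100):
    # The /validators endpoint is paginated; keep requesting pages until
    # `total` entries have been collected, instead of stopping after page 1
    # (which was the bug that produced the incomplete validator set).
    validators, page = [], 1
    while True:
        result = rpc_client.validators(height=height, page=page, per_page=per_page)
        validators.extend(result["validators"])
        if len(validators) >= int(result["total"]):
            return validators
        page += 1
```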
Oh right, that's probably it! This was fixed in tendermint v0.19.0 and will therefore indeed be fixed in Hermes v0.2.0.
@andynog @Fraccaman Can you try with Hermes `master` and see if the issue does indeed go away?
I'm still using 0.1.1, so maybe that's the problem! Yep, I will give it a try!
Uhmm, trying to compile hermes on `master` fails (I also tried compiling v0.1.1 again and it works). There is some trouble with openssl:
error: failed to run custom build command for `openssl-sys v0.9.61`
Caused by:
process didn't exit successfully: `/home/ec2-user/ibc-setup/ibc-rs/target/release/build/openssl-sys-3512a973f534ac54/build-script-main` (exit code: 101)
--- stdout
cargo:rustc-cfg=const_fn
cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_LIB_DIR
X86_64_UNKNOWN_LINUX_GNU_OPENSSL_LIB_DIR unset
cargo:rerun-if-env-changed=OPENSSL_LIB_DIR
OPENSSL_LIB_DIR unset
cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_INCLUDE_DIR
X86_64_UNKNOWN_LINUX_GNU_OPENSSL_INCLUDE_DIR unset
cargo:rerun-if-env-changed=OPENSSL_INCLUDE_DIR
OPENSSL_INCLUDE_DIR unset
cargo:rerun-if-env-changed=X86_64_UNKNOWN_LINUX_GNU_OPENSSL_DIR
X86_64_UNKNOWN_LINUX_GNU_OPENSSL_DIR unset
cargo:rerun-if-env-changed=OPENSSL_DIR
OPENSSL_DIR unset
cargo:rerun-if-env-changed=OPENSSL_NO_PKG_CONFIG
cargo:rerun-if-env-changed=PKG_CONFIG
cargo:rerun-if-env-changed=OPENSSL_STATIC
cargo:rerun-if-env-changed=OPENSSL_DYNAMIC
cargo:rerun-if-env-changed=PKG_CONFIG_ALL_STATIC
cargo:rerun-if-env-changed=PKG_CONFIG_ALL_DYNAMIC
cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64-unknown-linux-gnu
cargo:rerun-if-env-changed=PKG_CONFIG_PATH_x86_64_unknown_linux_gnu
cargo:rerun-if-env-changed=HOST_PKG_CONFIG_PATH
cargo:rerun-if-env-changed=PKG_CONFIG_PATH
cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64-unknown-linux-gnu
cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR_x86_64_unknown_linux_gnu
cargo:rerun-if-env-changed=HOST_PKG_CONFIG_LIBDIR
cargo:rerun-if-env-changed=PKG_CONFIG_LIBDIR
cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64-unknown-linux-gnu
cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR_x86_64_unknown_linux_gnu
cargo:rerun-if-env-changed=HOST_PKG_CONFIG_SYSROOT_DIR
cargo:rerun-if-env-changed=PKG_CONFIG_SYSROOT_DIR
run pkg_config fail: "`\"pkg-config\" \"--libs\" \"--cflags\" \"openssl\"` did not exit successfully: exit code: 1\n--- stderr\nPackage openssl was not found in the pkg-config search path.\nPerhaps you should add the directory containing `openssl.pc\'\nto the PKG_CONFIG_PATH environment variable\nNo package \'openssl\' found\n"
--- stderr
thread 'main' panicked at '
Could not find directory of OpenSSL installation, and this `-sys` crate cannot
proceed without this knowledge. If OpenSSL is installed and this crate had
trouble finding it, you can set the `OPENSSL_DIR` environment variable for the
compilation process.
Make sure you also have the development packages of openssl installed.
For example, `libssl-dev` on Ubuntu or `openssl-devel` on Fedora.
If you're in a situation where you think the directory *should* be found
automatically, please open a bug at https://github.com/sfackler/rust-openssl
and include information about your system as well as this message.
$HOST = x86_64-unknown-linux-gnu
$TARGET = x86_64-unknown-linux-gnu
openssl-sys = 0.9.61
', /home/ec2-user/.cargo/registry/src/github.com-1ecc6299db9ec823/openssl-sys-0.9.61/build/find_normal.rs:174:5
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
warning: build failed, waiting for other jobs to finish...
Do you have any thoughts on this? The machine I'm using is running Amazon Linux as its OS.
Update: I think it's a problem with my openssl configuration, but I'm not really sure. I have solved this by adding the openssl dependency to the ibc-relayer-cli crate (https://docs.rs/crate/openssl-sys/0.9.36). Probably not the best option, but it compiles.
So, I'm using the hermes binary from master, but I have problems running the same script as before. It is now complaining about missing subcommands. For example:
cargo run --bin hermes -- -c ~/.hermes/config.toml light rm -c stargate --all -
returns
error: unrecognized command `light`
Did you change anything in the command line?
Do you have any thoughts on this? The machine I'm using is running Amazon Linux as its OS.
Update: I think it's a problem with my openssl configuration, but I'm not really sure. I have solved this by adding the openssl dependency to the ibc-relayer-cli crate (https://docs.rs/crate/openssl-sys/0.9.36). Probably not the best option, but it compiles.
To build the `tendermint-rpc` crate with TLS support (one of the ibc-rs dependencies), you need to ensure you have the OpenSSL development library installed for your platform. See https://docs.rs/openssl/0.10.33/openssl/index.html#automatic
So, I'm using the hermes binary from master, but I have problems running the same script as before. It is now complaining about missing subcommands. For example:
My bad, we just merged a PR which removes the need to specify peers for the light client, so we can't test whether the fix works for you via the (now removed) `light add` command. You can therefore remove the whole `[peers]` section from your configuration file, as well as the invocations of the `light rm` and `light add` commands in your script. You may have to manually update your configuration file, as some required options have been added in the meantime. You can take a look at the example config to see which options must now be specified.
The new ones are: `websocket_addr`, `rpc_timeout` (optional), `fee_denom`, and `fee_amount`.
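For illustration, a chain section with those options added might look like this (a sketch only: the `websocket_addr` value follows the usual Tendermint websocket endpoint convention, and the `fee_denom`/`fee_amount` values are placeholders, not recommendations):

```toml
[[chains]]
id = 'stargate'
rpc_addr = 'tcp://localhost:26657'
grpc_addr = 'tcp://localhost:9090'
# New required options (illustrative values):
websocket_addr = 'ws://localhost:26657/websocket'
fee_denom = 'uatom'
fee_amount = 1000
# New optional option:
rpc_timeout = '10s'
```

Check the example config in the repository for the authoritative option names and value types.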
The next best way to test if things now work correctly would be to create a on-chain client and perform a client update (which will trigger light client verification), by following the instructions at https://hermes.informal.systems/tx_client.html.
Hi @romac! I tried updating the scripts (https://github.com/heliaxdev/ibc-setup), but the link you posted above is a 404 (you probably updated the docs). I followed the docs at https://hermes.informal.systems/tutorials/local-chains/raw/index.html instead.
Do you mind checking the config and scripts in the `ibc` folder? I will try to run it, but it takes a lot of time to start gaiad on the cosmoshub-4 network. Thank you!
Update: I tried running it again, and now it complains as soon as I launch `hermes tx raw create-client $IBC0 $IBC1` to create the first client. The error is the following:
Error: tx error: error raised while creating client: failed while querying src chain (stargate) for latest height: Light client error for RPC address http://localhost:26657/: node at http://localhost:26657/ running chain stargate not caught up.
I think I just need to wait for the node to get in sync (?)
Yes, it looks like that. Has the sync finished? What version of hermes are you running?
No, it's still catching up. Do you have any idea how much time/space this process should take?
What is the output of http://localhost:26657/status?
Here it is:
{
"jsonrpc": "2.0",
"id": -1,
"result": {
"node_info": {
"protocol_version": {
"p2p": "8",
"block": "11",
"app": "0"
},
"id": "729d39e3146fe9871b33c2d88eb0030c38995805",
"listen_addr": "tcp://0.0.0.0:26656",
"network": "cosmoshub-4",
"version": "v0.34.9",
"channels": "40202122233038606100",
"moniker": "heliax-1",
"other": {
"tx_index": "on",
"rpc_address": "tcp://127.0.0.1:26657"
}
},
"sync_info": {
"latest_block_hash": "F00C8384734FE5FE70A33581C9CCF1770E087DAE8197DEC1DB2D08AF40E826CD",
"latest_app_hash": "20F4CAB46E5A81D03D2D39EB3073AD8A834BA7B0CBA02288E085D872BD3ADD21",
"latest_block_height": "5978609",
"latest_block_time": "2021-04-24T14:10:05.321924709Z",
"earliest_block_hash": "1455A0C15AC49BB506992EC85A3CD4D32367E53A087689815E01A524231C3ADF",
"earliest_app_hash": "E3B0C44298FC1C149AFBF4C8996FB92427AE41E4649B934CA495991B7852B855",
"earliest_block_height": "5200791",
"earliest_block_time": "2019-12-11T16:11:34Z",
"catching_up": true
},
"validator_info": {
"address": "49587EBF98D03A58115682BEA6B0D8CB585EA4F7",
"pub_key": {
"type": "tendermint/PubKeyEd25519",
"value": "qwmnJLq4afdHSSkAM24hmuQ0j+HjiQJEiXQ8lF81Uys="
},
"voting_power": "0"
}
}
}
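The field that matters in that response is `sync_info.catching_up`; a minimal sketch of reading it (Python for illustration):

```python
import json

def is_caught_up(status_response):
    # A node reports its sync progress under result.sync_info in the
    # /status RPC response; a relayer cannot use a node that is still
    # catching up, since its latest state is not yet the chain tip.
    sync_info = json.loads(status_response)["result"]["sync_info"]
    return not sync_info["catching_up"]
```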
Yes, it is still catching up. It is using 300+ GB of space. Do you know any configuration settings to prune old blocks?
Hi @Fraccaman,
I have a full node synced and it's taking 337 GB as of now (height 6037600); this is with `pruning = nothing`.
Concerning custom pruning: if you run `gaiad start --help`, it gives you some pruning options, which you can set on the start command or in the config.
--pruning string Pruning strategy (default|nothing|everything|custom) (default "default")
--pruning-interval uint Height interval at which pruned heights are removed from disk (ignored if pruning is not 'custom')
--pruning-keep-every uint Offset heights to keep on disk after 'keep-every' (ignored if pruning is not 'custom')
--pruning-keep-recent uint Number of recent heights to keep on disk (ignored if pruning is not 'custom')
If you're running this node on AWS, one thing I noticed when syncing mine was that instance type, size, and volume speed make a big difference.
Yes, it is still catching up
At 8-10 blocks/sec sync speed, this will probably take another ~20 hrs.
If you're running this node on AWS, one thing I noticed when syncing mine was that instance type, size, and volume speed make a big difference.
Yes, I'm running on an EC2 t3.xlarge machine (which may not be the best instance). I'll try with a non-burstable machine next time. @andynog Do you know if IBC also works with a pruned node? @ancazamfir With this machine I'm doing 2-3 blocks/sec. Should be ready by tomorrow.
Tried running `hermes tx raw create-client $IBC0 $IBC1` and `hermes tx raw create-client $IBC1 $IBC0`, but they both return the same kind of error:
[ec2-user]$ hermes tx raw create-client $IBC0 $IBC1
Error: tx error: error raised while creating client: failed sending message to dst chain (heliax) with err: GRPC error: GRPC error: status: NotFound, message: "rpc error: code = NotFound desc = account cosmos12dpq5dy339t4xgx064rx8cspzswxjzcn7rgy0k not found: key not found", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"}
[ec2-user]$ hermes tx raw create-client $IBC1 $IBC0
Error: tx error: error raised while creating client: failed sending message to dst chain (stargate) with err: GRPC error: GRPC error: status: NotFound, message: "rpc error: code = NotFound desc = account cosmos1dnsgp73u4tfjq656r24y9xphhtj6wyv3pjq50x not found: key not found", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
Any idea?
@Fraccaman this account cosmos1dnsgp73u4tfjq656r24y9xphhtj6wyv3pjq50x needs to have some balance on it. If this is just a new account, you will have to load some tokens onto it to be able to pay for the transactions. The account only 'exists' on-chain once a send transaction has sent tokens to it.
Also, I'm assuming you ran the `hermes keys add` command.
Hi @Fraccaman, could you figure out the keys and account issues?
Sorry, I have been a little busy; I'll try next week!
Hi @Fraccaman, any updates on this? We believe this might not be an issue anymore since several changes have been implemented on the relayer since then. We might close this issue for now if we don't hear back in the next few days, but feel free to open it again if the problem persists.
@andynog sorry for the delay, some things have changed and I had to resync the chain, which is taking a long time (1 week+). I'm almost finished syncing, maybe a couple of days, and I'll be back with some updates.
I'm still syncing the chain. I'm at height 6106333, but it's really slow (something like 1 block/s).
Okay, made some progress. I have been able to download a snapshot from https://cosmos.quicksync.io/, which speeds things up a lot. So, I have both nodes up and running, address cosmos1x60z62swcm4j5ct4l5xpmcln3fdkvp8kylqv07 has some ATOM, and I'm now trying to create a channel between them.
Building the Rust relayer...
Importing keys...
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml keys add stargate -f /home/ec2-user/node-stargate/key_seed.json`
Success: Added key 'node_key' (cosmos1x60z62swcm4j5ct4l5xpmcln3fdkvp8kylqv07) on chain stargate
Finished dev [unoptimized + debuginfo] target(s) in 0.13s
Running `target/debug/hermes -c /home/ec2-user/.hermes/config.toml keys add heliax -f /home/ec2-user/node-heliax/key_seed.json`
Success: Added key 'node_key' (cosmos1uvp3k66wl246jzcgl7etg80ly94mvtq84gr98x) on chain heliax
Done!
Error: tx error: error raised while creating client: failed while building client state from src chain (stargate) with error: GRPC error: status: Unimplemented, message: "unknown service cosmos.staking.v1beta1.Query", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
Error: tx error: error raised while creating client: failed while building client state from src chain (heliax) with error: GRPC error: status: Unimplemented, message: "unknown service cosmos.staking.v1beta1.Query", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
Error: tx error: error raised while creating client: failed while building client state from src chain (heliax) with error: GRPC error: status: Unimplemented, message: "unknown service cosmos.staking.v1beta1.Query", details: [], metadata: MetadataMap { headers: {"content-type": "application/grpc"} }
Error: Query of client '07-tendermint-0' on chain 'heliax' failed with error: Query error occurred (failed to query for client state): error converting message type into domain type: the client state was not found
Error: config error: missing chain for id (irishub-1) in configuration file
Hermes version:
[ec2-user@ip-172-31-19-38 ibc-setup]$ $BINARY version
hermes 0.4.0
Gaiad version:
[ec2-user@ip-172-31-19-38 ibc-setup]$ gaiad version
v4.2.1
Okay, I had some bugs in the hermes config. I'll keep you posted.
More updates: I have been able to create a client between the two chains 🎉🎉🎉 Here is the output:
[ec2-user@ip-172-31-19-38 ibc-setup]$ $BINARY query client state $IBC1 07-tendermint-252
Success: ClientState {
chain_id: ChainId {
id: "h3liax",
version: 0,
},
trust_level: TrustThresholdFraction {
numerator: 1,
denominator: 3,
},
trusting_period: 1209600s,
unbonding_period: 1814400s,
max_clock_drift: 5s,
frozen_height: Height {
revision: 0,
height: 0,
},
latest_height: Height {
revision: 0,
height: 3437,
},
upgrade_path: [
"upgrade",
"upgradedIBCState",
],
allow_update: AllowUpdate {
after_expiry: false,
after_misbehaviour: false,
},
}
[ec2-user@ip-172-31-19-38 ibc-setup]$ $BINARY query client state $IBC0 07-tendermint-2
Success: ClientState {
chain_id: ChainId {
id: "cosmoshub-4",
version: 4,
},
trust_level: TrustThresholdFraction {
numerator: 1,
denominator: 3,
},
trusting_period: 1209600s,
unbonding_period: 1814400s,
max_clock_drift: 5s,
frozen_height: Height {
revision: 0,
height: 0,
},
latest_height: Height {
revision: 4,
height: 6614024,
},
upgrade_path: [
"upgrade",
"upgradedIBCState",
],
allow_update: AllowUpdate {
after_expiry: false,
after_misbehaviour: false,
},
}
Following the tutorial, I'm trying to create a connection, and this one is failing.
The first command (`conn-init`) works:
$BINARY tx raw conn-init $IBC0 $IBC1 07-tendermint-2 07-tendermint-252
Success: OpenInitConnection(
OpenInit(
Attributes {
height: Height {
revision: 0,
height: 4474,
},
connection_id: Some(
ConnectionId(
"connection-0",
),
),
client_id: ClientId(
"07-tendermint-2",
),
counterparty_connection_id: None,
counterparty_client_id: ClientId(
"07-tendermint-252",
),
},
),
)
The second command (`conn-try`) should be working:
$BINARY tx raw conn-try $IBC1 $IBC0 07-tendermint-252 07-tendermint-2 -s connection-0
Error: tx error: failed during a transaction submission step to chain id cosmoshub-4 with underlying error: RPC error to endpoint http://localhost:26657/: RPC error to endpoint http://localhost:26657/: Internal error: timed out waiting for tx to be included in a block (code: -32603)
I think it ran successfully, but I need to increase the rpc timeout.
The third command (`conn-ack`) fails:
$BINARY tx raw conn-ack $IBC0 $IBC1 07-tendermint-2 07-tendermint-252 -d connection-0 -s connection-1
Error: tx error: failed with underlying cause: tx response error: deliver_tx reports error: log=Log("failed to execute message; message index: 1: connection handshake open ack failed: failed connection state verification for client (07-tendermint-2): chained membership proof failed to verify membership of value: 0A1130372D74656E6465726D696E742D32353212230A0131120D4F524445525F4F524445524544120F4F524445525F554E4F524445524544180222260A0F30372D74656E6465726D696E742D32120C636F6E6E656374696F6E2D301A050A03696263 in subroot C42B7ED48FB47D14FC71C73C510D1687C981E1827F52DDCDC80FED215C6674BA at index 0. Please ensure the path and value are both correct.: invalid proof")
Do you have any suggestions?
P.S.: every command dealing with the cosmoshub-4 chain "fails" with the same timeout error.
I made a little progress, but I'm still stuck at the same "step". I avoided the timeout error by setting `timeout_broadcast_tx_commit` to `300s` in gaia. The error comes from the `conn-try` command (and not the `conn-ack`) and is the following:
Error: tx error: failed with underlying cause: tx response error: deliver_tx reports error: log=Log("failed to execute message; message index: 1: connection handshake open try failed: failed connection state verification for client (07-tendermint-260): chained membership proof failed to verify membership of value: 0A0F30372D74656E6465726D696E742D3012230A0131120D4F524445525F4F524445524544120F4F524445525F554E4F5244455245441801221A0A1130372D74656E6465726D696E742D3236301A050A03696263 in subroot 3A9C01E534577A1D3BD6AD66742C7A1CFE16610347EB7F677FD636EFD9D0C50F at index 0. Please ensure the path and value are both correct.: invalid proof")
You can see the transactions here.
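For reference, the `timeout_broadcast_tx_commit` setting mentioned above lives in the gaia node's Tendermint configuration, not in the Hermes config; a sketch, assuming the default `config.toml` layout:

```toml
# In the node's Tendermint config file, e.g. ~/.gaia/config/config.toml
# (the exact path depends on the node's home directory):
[rpc]
# How long the RPC waits for a tx to be committed before timing out.
timeout_broadcast_tx_commit = "300s"
```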
Crate
Version
v0.5.0
Summary
Hermes constantly fails with a `TxNoConfirmation` error when trying to communicate with a full node that does not have indexing enabled. The error can be tracked down using a patched version of Hermes, and looks like this:
Many thanks to @Fraccaman for helping us uncover this corner-case.
A separate, related problem was due to misconfiguration (see https://github.com/heliaxdev/ibc-setup/pull/1).
Acceptance criteria: (`TxNoConfirmation`)
Original discussion
Summary of Bug
Probably this is not a bug, but I can't understand what's wrong. I'm trying to open a channel between a Cosmos mainnet node and an "own gaia testnet" node. I am able to build the relayer and configure it correctly (it doesn't complain, so I'm assuming it is correct), but as soon as I try to run
hermes channel handshake id-1 id-2 transfer transfer
it throws the following error: {"status":"error","result":"chain runtime/handle error: Light client supervisor error for chain id heliax: empty witness list"}
P.S.: @andynog suggested to open an issue.
Version
Steps to Reproduce
I have created a repository with a series of bash scripts to reproduce this use case here.