Open activenodes opened 1 year ago
Thanks for the feedback. I figured I'm not the only one with this problem. How do you use a Unix socket with tmkms?
how do you use unix socket on tmkms?
We use it for local connections only, so tmkms has to run on the same host as the validator (n.b. there are downsides to this approach, as you may guess).
so for the cosmos chain
# config.toml
[priv-validator]
laddr = "unix://path/to/somewhere/kms.sock"
and in tmkms config you set the same address
[[validator]]
addr = "unix://path/to/somewhere/kms.sock"
(it's pretty late for me, so I hope I copypasted the correct thing)
anyway will look into it shortly (on Wed or Thu) and try to make it work
UPD: IIRC, the cosmos chain will create the socket, so tmkms will try to connect to it. I would also recommend cleaning up (i.e. removing) the socket on every (re)start.
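The cleanup recommended above could be scripted along these lines. This is a minimal sketch, not something tmkms or the chain binary provides: the helper name `remove_stale_socket` is hypothetical, and it assumes the script runs before the validator is (re)started, with the same socket path as in the config above.

```python
import os
import stat


def remove_stale_socket(path: str) -> bool:
    """Remove `path` if it is a leftover Unix socket; return True if removed."""
    try:
        mode = os.lstat(path).st_mode
    except FileNotFoundError:
        return False  # nothing to clean up
    if not stat.S_ISSOCK(mode):
        # Refuse to delete a regular file or directory by mistake.
        raise ValueError(f"{path} exists but is not a socket")
    os.unlink(path)
    return True
```

Calling `remove_stale_socket("path/to/somewhere/kms.sock")` from a pre-start hook would then clear a socket left behind by a previous run, and do nothing if none exists.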
I can reopen this issue, however really issues should be filed against tendermint-p2p, similar to this one: https://github.com/informalsystems/tendermint-rs/issues/1356
Do you have any news on this? Can I help with anything? Unfortunately we're running tmkms on a remote machine, so we can't use a Unix socket connection. Is there a short-term workaround? Thanks
@IbrarMakaveli I filed https://github.com/informalsystems/tendermint-rs/issues/1392 to request upstream help debugging this problem.
What would be extremely helpful here is if someone could add reproduction instructions to that issue, especially if the issue is reproducible directly via the tendermint-p2p crate without involving TMKMS (or isolating TMKMS as the problem)
Can someone attempt to reproduce this on a fresh install, which should use tendermint-p2p v0.34.1?
Rolled out 0.14.0-pre.1 to our Sei testnet validator, will let you know
Using tcp connection within the same host
Unfortunately the same error
Several signed blocks, then an underflow error
2024-03-05T20:18:43.118481Z DEBUG tmkms::session: [atlantic-2@tcp://...:51759] received request: ShowPublicKey
2024-03-05T20:18:43.118506Z DEBUG tmkms::session: [atlantic-2@tcp://...:51759] sending response: PublicKey(PubKeyResponse { pub_key: Some(PublicKey { sum: Some(Ed25519([162>
2024-03-05T20:18:43.314290Z ERROR tmkms::client: [atlantic-2@tcp://...:51759] protocol error: malformed message packet: failed to decode Protobuf message: buffer underflow
2024-03-05T20:18:44.314384Z DEBUG tmkms::session: [atlantic-2@tcp://...:51759] connecting to validator...
2024-03-05T20:18:44.314456Z INFO tmkms::connection::tcp: KMS node ID: ...
2024-03-05T20:18:44.314862Z INFO tmkms::session: [atlantic-2@tcp://...:51759] connected to validator successfully
2024-03-05T20:18:44.314869Z WARN tmkms::session: [atlantic-2@tcp://...:51759]: unverified validator peer ID! (a47c7867b3191c93eed4bf0f01a9d4bc95a193ac)
2024-03-05T20:18:44.414468Z ERROR tmkms::client: [atlantic-2@tcp://...:51759] protocol error: malformed message packet: failed to decode Protobuf message: buffer underflow
@qezz can you confirm that tendermint-p2p v0.34.1 was used in the build? (I can pin it in the next prerelease)
let me check
In the build log it says
...
Downloaded tendermint-p2p v0.34.1
...
Compiling tendermint-proto v0.34.1
Compiling yubihsm v0.42.1
Compiling tendermint v0.34.1
Compiling cosmos-sdk-proto v0.20.0
Compiling tendermint-p2p v0.34.1
Compiling tendermint-config v0.34.1
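For anyone else who wants to confirm which crate versions ended up in their build, a check like the following could be run over a saved build log. It is only a sketch: the function name `compiled_versions` is made up for illustration, and it assumes the log contains `Compiling <crate> v<version>` lines like the ones quoted above.

```python
import re


def compiled_versions(build_log: str) -> dict:
    """Map crate name -> version for every 'Compiling <crate> v<version>' line."""
    pattern = re.compile(r"^\s*Compiling\s+(\S+)\s+v(\S+)", re.MULTILINE)
    return {name: version for name, version in pattern.findall(build_log)}


if __name__ == "__main__":
    log = """\
Compiling tendermint-proto v0.34.1
Compiling tendermint v0.34.1
Compiling tendermint-p2p v0.34.1
"""
    # Look up the crate in question; .get() returns None if it was not compiled.
    print(compiled_versions(log).get("tendermint-p2p"))
```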
Thanks, I reopened this issue: https://github.com/informalsystems/tendermint-rs/issues/1392#issuecomment-1979592089
We're running into the same issue very consistently with Initia's testnet when we turn on the oracle, which I believe adds a significant amount of data to the TMKMS requests.
2024-06-14T01:46:26.861895Z ERROR tmkms::client: [initiation-1@tcp://validator:26658] protocol error: malformed message packet: failed to decode Protobuf message: buffer underflow
2024-06-14T01:46:27.862227Z DEBUG tmkms::session: [initiation-1@tcp://validator:26658] connecting to validator...
2024-06-14T01:46:27.864663Z INFO tmkms::session: [initiation-1@tcp://validator:26658] connected to validator successfully
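To illustrate why larger requests might matter here, the sketch below shows how a "buffer underflow" can arise in general with length-prefixed Protobuf framing: if a decoder is handed only the first chunk of a message whose varint length prefix announces more bytes than have arrived, decoding fails. This is not a confirmed diagnosis of the TMKMS behaviour in this thread. The 1024-byte chunk size, the function names, and the accumulate-and-retry loop are all assumptions made for the example.

```python
from typing import Iterable, Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as a Protobuf base-128 varint."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def decode_varint(buf: bytes) -> Tuple[int, int]:
    """Decode a varint; return (value, number_of_bytes_consumed)."""
    value = 0
    for i, byte in enumerate(buf):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("buffer underflow: incomplete varint")


def decode_length_delimited(buf: bytes) -> bytes:
    """Return the message body, or raise if fewer bytes arrived than announced."""
    length, offset = decode_varint(buf)
    if len(buf) - offset < length:
        raise ValueError("buffer underflow")
    return buf[offset:offset + length]


def read_message(chunks: Iterable[bytes]) -> bytes:
    """Accumulate chunks until one complete length-delimited message decodes."""
    buf = b""
    for chunk in chunks:
        buf += chunk
        try:
            return decode_length_delimited(buf)
        except ValueError:
            continue  # not enough data yet, wait for the next chunk
    raise ValueError("buffer underflow: stream ended mid-message")


CHUNK = 1024  # hypothetical per-read chunk size, chosen for illustration only

if __name__ == "__main__":
    payload = b"x" * 3000  # stands in for a large signing request
    framed = encode_varint(len(payload)) + payload
    chunks = [framed[i:i + CHUNK] for i in range(0, len(framed), CHUNK)]
    try:
        decode_length_delimited(chunks[0])  # only the first chunk: fails
    except ValueError as err:
        print("single chunk:", err)
    print("accumulated:", len(read_message(chunks)), "bytes")
```

In this toy setup a small message fits in one chunk and decodes fine, which would be consistent with signing working for a while and then failing once a request grows past the chunk size.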
Has anyone attempted to build with @zarkone's upstream PR?
@datanexus-vincent did you try to build on top of: https://github.com/iqlusioninc/tmkms/pull/903/files
I think that should fix your issue (as it did for Sei), though we haven't checked Initia yet.
@tony-iqlusion we'd appreciate your PR review :)
@mkaczanowski I did once I realized it wasn't an upstream PR but a PR for this repo, and it worked! Thanks for the effort to get that working.
Chains with:
github.com/tendermint/tendermint@v0.37.0-dev
~350ms
Every ~40 signatures (softsign, due to the short block times), the connection goes down with this error:
protocol error: malformed message packet: failed to decode Protobuf message: buffer underflow
Let me know if you need any more details.
Thanks @tony-iqlusion