Closed by onepabz 7 months ago
Was the bitcoind node updated to a new version? If so, from which version to which?
That error looks like lndmon just isn't able to get the info in time, so the call times out: https://github.com/lightninglabs/lndmon/blob/2d4e987b3f0414a3dfe51f05e672867e3257aeb0/collectors/chain_collector.go#L71-L76
There's a timeout value here: https://github.com/lightninglabs/lndclient/blob/04c46b8af9172ca1355f9a1ee416368e97f0aa0d/lightning_client.go#L1330-L1331
So we can set that when we make the client: https://github.com/lightninglabs/lndclient/blob/04c46b8af9172ca1355f9a1ee416368e97f0aa0d/lnd_services.go#L295-L298
Bigger question here tho is: why is that bitcoind slower, or did lnd get slower?
> Was the bitcoind node updated to a new version? If so, from which version to which?

No, the version is the same. The only difference is the type of disk backing the persistent volume in Kubernetes (GKE): we switched from an SSD to a non-SSD one. What is weird is that both lnd and bitcoind appear to be functioning properly.

> Bigger question here tho is: why is that bitcoind slower, or did lnd get slower?

Both appear to be working fine, and no other consumers are complaining.
After switching the bitcoind pods back to SSD disks, lndmon has stopped crashing. Thanks for your help, @Roasbeef.
We've been successfully running lndmon for a long time. Recently, we changed our bitcoind nodes, and since then, all our lndmon pods have been crashing every few hours. lnd works fine and all the other lnd auxiliary services work fine; only lndmon keeps crashing.
Here's what I see in the logs:
Lndmon exiting with error: ChainCollector GetInfo failed with: rpc error: code = DeadlineExceeded desc = context deadline exceeded
We are using the latest lndmon version, v0.2.7.
I've tried increasing the Prometheus scrape interval and timeout, but lndmon keeps crashing.
Any help would be much appreciated.