Closed openoms closed 4 years ago
Not good... It looks like it might panic. Unfortunately the stack trace is not logged to the log file. Do you have a systemd
log that might contain the stack trace?
Don't have much in journalctl:
$ sudo journalctl -u lnd
-- Logs begin at Tue 2020-06-23 08:09:25 BST, end at Tue 2020-06-23 09:27:18 BST. --
Jun 23 08:12:21 raspberrypi systemd[1]: Starting LND Lightning Daemon...
Jun 23 08:12:21 raspberrypi systemd[1]: Started LND Lightning Daemon.
Jun 23 08:20:22 raspberrypi systemd[1]: lnd.service: Main process exited, code=killed, status=11/SEGV
Jun 23 08:20:22 raspberrypi systemd[1]: lnd.service: Failed with result 'signal'.
Jun 23 08:21:22 raspberrypi systemd[1]: lnd.service: Service RestartSec=1min expired, scheduling restart.
Jun 23 08:21:22 raspberrypi systemd[1]: lnd.service: Scheduled restart job, restart counter is at 1.
Jun 23 08:21:22 raspberrypi systemd[1]: Stopped LND Lightning Daemon.
Jun 23 08:21:22 raspberrypi systemd[1]: Starting LND Lightning Daemon...
Jun 23 08:21:22 raspberrypi systemd[1]: Started LND Lightning Daemon.
Jun 23 08:29:18 raspberrypi systemd[1]: lnd.service: Main process exited, code=killed, status=11/SEGV
Jun 23 08:29:18 raspberrypi systemd[1]: lnd.service: Failed with result 'signal'.
Jun 23 08:30:18 raspberrypi systemd[1]: lnd.service: Service RestartSec=1min expired, scheduling restart.
Jun 23 08:30:18 raspberrypi systemd[1]: lnd.service: Scheduled restart job, restart counter is at 2.
Jun 23 08:30:18 raspberrypi systemd[1]: Stopped LND Lightning Daemon.
Jun 23 08:30:18 raspberrypi systemd[1]: Starting LND Lightning Daemon...
Jun 23 08:30:18 raspberrypi systemd[1]: Started LND Lightning Daemon.
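The journal excerpt above already carries one useful number: how long each run survived before the SIGSEGV. A minimal sketch for extracting it from saved `journalctl -u lnd` output, assuming the exact line format shown above. The function name `segv_uptimes` and the `year` parameter are hypothetical helpers for illustration, not part of lnd or systemd, and `%b` month parsing assumes an English/C locale.

```python
import re
from datetime import datetime

# Matches the syslog-style prefix used in the journal excerpt, e.g.
# "Jun 23 08:12:21 raspberrypi systemd[1]: Started LND Lightning Daemon."
LINE_RE = re.compile(r"^(\w{3} +\d+ \d{2}:\d{2}:\d{2}) \S+ systemd\[\d+\]: (.*)$")


def segv_uptimes(journal_text, year=2020):
    """Return the uptime in seconds of every run that ended in SIGSEGV."""
    uptimes = []
    started = None  # timestamp of the most recent "Started" line
    for line in journal_text.splitlines():
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # header lines and other units are ignored
        # The journal's short format omits the year, so it is supplied here.
        stamp = datetime.strptime(f"{year} {match.group(1)}", "%Y %b %d %H:%M:%S")
        message = match.group(2)
        if message.startswith("Started LND Lightning Daemon"):
            started = stamp
        elif "status=11/SEGV" in message and started is not None:
            uptimes.append(int((stamp - started).total_seconds()))
            started = None
    return uptimes
```

Run on the excerpt above, this returns `[481, 476]`: both crashed runs lasted roughly eight minutes, and the third start has no matching crash yet.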
now monitoring with strace:
$ pidof lnd
26837
$ sudo strace -p 26837 -v
strace: Process 26837 attached
futex(0x179ab00, FUTEX_WAIT_PRIVATE, 0, NULL
Interestingly, it has not failed since the last restart (the 3rd or 4th). Will keep an eye on it and report if it happens again.
Oh, maybe that binary was built with go1.14 but not all required fixes were included. Can you run lncli version and tell me what go version it prints?
You are probably right:
"commit": "v0.10.2-beta.rc2",
"commit_hash": "de53605277a658fcde9a0bc690876000d390fca6",
"go_version": "go1.14.4"
still up BTW
I presume the problem is similar to this: https://github.com/lightningnetwork/lnd/issues/4052 which has been solved here: https://github.com/lightningnetwork/lnd/pull/4061 ?
Yes, I assume that's the problem. We need to build the v0.10.2 release with go1.13; only v0.11.0 will be go1.14 compatible.
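The rule of thumb stated above (lnd releases before v0.11.0 should be built with go1.13) can be checked against the JSON that `lncli version` prints. A minimal sketch, assuming the `app_major`, `app_minor` and `go_version` fields appear as in the outputs posted in this thread. The function name `go_build_warning` is hypothetical, and the rule reflects only what is said in this issue, not an official compatibility matrix.

```python
import json
import re


def go_build_warning(version_json):
    """Return True if an lnd older than 0.11 was compiled with Go 1.14 or newer."""
    info = json.loads(version_json)["lnd"]
    # "go_version" looks like "go1.14.4"; only major.minor matters here.
    go = re.match(r"go(\d+)\.(\d+)", info["go_version"])
    if go is None:
        raise ValueError(f"unrecognised go_version: {info['go_version']!r}")
    go_version = (int(go.group(1)), int(go.group(2)))
    lnd_version = (info["app_major"], info["app_minor"])
    return lnd_version < (0, 11) and go_version >= (1, 14)
```

Feeding it the saved output of `lncli version` flags the v0.10.x binaries built with go1.14.4, while the same release built with go1.13.3 passes.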
Can you try out this update branch @openoms: https://github.com/lightningnetwork/lnd/tree/v0.10.2-beta-rc2-branch? Updated one of the deps to match that commit linked (which I think is the issue?).
Also, if you compile manually with Go 1.13 is it able to remain up?
Got this with strace while it failed overnight:
futex(0x178bccc, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=63281, tv_nsec=515385968}) = 0
nanosleep({tv_sec=0, tv_nsec=3000}, NULL) = 0
futex(0x178bccc, FUTEX_WAIT_PRIVATE, 0, NULL) = 0
clock_gettime(CLOCK_MONOTONIC, {tv_sec=63739, tv_nsec=670928200}) = 0
futex(0x178bccc, FUTEX_WAIT_PRIVATE, 0, NULL) = ?
+++ killed by SIGSEGV +++
Will compile https://github.com/lightningnetwork/lnd/tree/v0.10.2-beta-rc2-branch with go1.13.3.
Looking good with the commit https://github.com/lightningnetwork/lnd/commit/73dcdf9e58a743b0f82b2fafd7f2bda90fc91665
$ lncli version
{
"lncli": {
"commit": "v0.10.2-beta.rc2-1-g73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"commit_hash": "73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"version": "0.10.2-beta.rc2",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc2",
"build_tags": [
],
"go_version": "go1.13.3"
},
"lnd": {
"commit": "v0.10.2-beta.rc2-1-g73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"commit_hash": "73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"version": "0.10.2-beta.rc2",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc2",
"build_tags": [
],
"go_version": "go1.13.3"
}
}
Will keep an eye on it.
No issues after 36 h, seems to be fixed. Should I try building with Go 1.14?
Yes, if you don't mind testing that. Would be great to know if it actually is the go version or something else.
ok testing with go 1.14.4 now
$ lncli version
{
"lncli": {
"commit": "v0.10.2-beta.rc2-1-g73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"commit_hash": "73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"version": "0.10.2-beta.rc2",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc2",
"build_tags": [
],
"go_version": "go1.14.4"
},
"lnd": {
"commit": "v0.10.2-beta.rc2-1-g73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"commit_hash": "73dcdf9e58a743b0f82b2fafd7f2bda90fc91665",
"version": "0.10.2-beta.rc2",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc2",
"build_tags": [
],
"go_version": "go1.14.4"
}
}
We've switched over to building the upcoming minor releases using just 1.13.3, but would be curious to see if 1.14.4 works with that existing branch still.
Running the branch: https://github.com/lightningnetwork/lnd/commits/v0.10.3-beta-rc1-branch with go1.14.4
All good so far.
$ lncli version
{
"lncli": {
"commit": "v0.10.2-beta.rc4-26-gcda3088a0159516a403062db480425b6cbbae6c9",
"commit_hash": "cda3088a0159516a403062db480425b6cbbae6c9",
"version": "0.10.2-beta.rc4",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc4",
"build_tags": [
],
"go_version": "go1.14.4"
},
"lnd": {
"commit": "v0.10.2-beta.rc4-26-gcda3088a0159516a403062db480425b6cbbae6c9",
"commit_hash": "cda3088a0159516a403062db480425b6cbbae6c9",
"version": "0.10.2-beta.rc4",
"app_major": 0,
"app_minor": 10,
"app_patch": 2,
"app_pre_release": "beta.rc4",
"build_tags": [
],
"go_version": "go1.14.4"
}
}
Sorry closed prematurely. LND 0.10.3 started to have restarts again on a more active node.
$ lncli version
{
"lncli": {
"commit": "v0.10.3-beta",
"commit_hash": "d62c575f8499a314eb27f12462d20500b6bda2c7",
"version": "0.10.3-beta",
"app_major": 0,
"app_minor": 10,
"app_patch": 3,
"app_pre_release": "beta",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.14.4"
},
"lnd": {
"commit": "v0.10.3-beta",
"commit_hash": "d62c575f8499a314eb27f12462d20500b6bda2c7",
"version": "0.10.3-beta",
"app_major": 0,
"app_minor": 10,
"app_patch": 3,
"app_pre_release": "beta",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.14.4"
}
}
nothing in the lnd.log, only this again:
$ sudo strace -p 31706 -v
strace: Process 31706 attached
futex(0x178bcd4, FUTEX_WAIT_PRIVATE, 0, NULL) = -1 (errno 4294967056)
+++ killed by SIGSEGV +++
This node is:
Odroid HC1
32 bit Armbian
Linux 5.4.28-odroidxu4 armv7l GNU/Linux
Bitcoin Core version v0.20.0
The strange thing is that the same lnd version had been stable for 48h+ on the two RPi4s I updated first. Now it has restarted there as well.
Switched to lnd v0.10.2-beta now ("go_version": "go1.14.4").
Will continue to report.
Same with v0.10.2-beta
$ sudo strace -p 6311 -v
strace: Process 6311 attached
futex(0x178bcec, FUTEX_WAIT_PRIVATE, 0, NULL ) = ?
+++ killed by SIGSEGV +++
And now the node is back to lnd v0.10.1-beta with "go_version": "go1.13.10".
This still looks to be an issue related to the Go version that only shows up on a busy node.
Will build from source with Go 1.13.3 again.
Hmm, OK. I think we may re-upload the binaries, this time compiled using Go 1.13. This will give us time to properly look into this so we can have things working for the major 0.11 release.
I see the same issue when updating to 0.10.3 on my Raspberrypi 4. I am using the release binary. Frequent crashes without an error log.
v0.10.3-beta is stable when built from source with Go 1.13.3. To use a release binary, you need to downgrade to v0.10.1-beta.
Yes, just wanted to leave the comment here so that others can find this issue when they try to upgrade using the release binary.
@Roasbeef thanks for building https://github.com/lightningnetwork/lnd/releases/tag/v0.10.4-beta with the stable Go version. Updating now.
admin@raspberrypi:~ $ lncli -n testnet version
{
"lncli": {
"commit": "v0.10.4-beta",
"commit_hash": "86114c575c2dff9dff1e1bb4df961c64aea9fc1c",
"version": "0.10.4-beta",
"app_major": 0,
"app_minor": 10,
"app_patch": 4,
"app_pre_release": "beta",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.13.13"
},
"lnd": {
"commit": "v0.10.4-beta",
"commit_hash": "86114c575c2dff9dff1e1bb4df961c64aea9fc1c",
"version": "0.10.4-beta",
"app_major": 0,
"app_minor": 10,
"app_patch": 4,
"app_pre_release": "beta",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.13.13"
}
}
LND v0.11.0-beta.rc1 with go1.14.6 has been stable for 24h+ with numerous payments and routing events, so closing this for good. Thank you for the support!
$ lncli version
{
"lncli": {
"commit": "v0.11.0-beta.rc1",
"commit_hash": "247b7530caf08a555ffd56f81019031bc1af6565",
"version": "0.11.0-beta.rc1",
"app_major": 0,
"app_minor": 11,
"app_patch": 0,
"app_pre_release": "beta.rc1",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.14.6"
},
"lnd": {
"commit": "v0.11.0-beta.rc1",
"commit_hash": "247b7530caf08a555ffd56f81019031bc1af6565",
"version": "0.11.0-beta.rc1",
"app_major": 0,
"app_minor": 11,
"app_patch": 0,
"app_pre_release": "beta.rc1",
"build_tags": [
"autopilotrpc",
"signrpc",
"walletrpc",
"chainrpc",
"invoicesrpc",
"watchtowerrpc"
],
"go_version": "go1.14.6"
}
}
Background
Updating to LND v0.10.2-beta.rc2 causes random restarts. Downgrading to LND v0.10.1-beta solves the issue and runs stable.
Your environment
lnd v0.10.2-beta.rc2
operating system: Latest Raspbian / Armbian 32bit
Linux raspberrypi 4.19.118-v7l+ armv7l GNU/Linux
Linux HC1 5.4.28-odroidxu4 armv7l GNU/Linux
version of btcd, bitcoind, or other backend: Same with Bitcoin Core 0.20.0 and 0.19.1
any other relevant environment details: RaspiBlitz builds on two different SBCs: RPi4 4GB and Odroid HC1 2GB
Related issue: https://github.com/rootzoll/raspiblitz/issues/1240
Steps to reproduce
Update lnd to the latest release (installed the release binary).
Expected behaviour
Expected to function without restarts
Actual behaviour
LND restarts after a few minutes without any obvious reason; I could find no failures or errors in the logs. The last 2000 lines of lnd.log from two separate occasions are here: https://termbin.com/6t6t https://termbin.com/hrmpp
Please tell me how I can help debug further.