dashevo / dashmate

DEPRECATED Distribution package for Dash Masternode installation
MIT License

Containers restarting running masternode #104

Closed dkaparis closed 4 years ago

dkaparis commented 4 years ago

Expected Behavior

Docker containers dash_masternode_* should stay up

Current Behavior

Containers dash_masternode_evonet_drive_tendermint_1, dash_masternode_evonet_drive_abci_1 and dash_masternode_evonet_core_1 are continuously restarting
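A quick way to confirm which containers are stuck in a restart loop is to filter the output of docker ps. The sketch below assumes the status text of such containers begins with the word "Restarting"; the helper name restarting_names is made up for illustration and is not part of mn-bootstrap.

```shell
# Print the names of containers whose status starts with "Restarting".
# Reads lines of the form "<name> <status...>" on stdin.
# restarting_names is a hypothetical helper, not part of mn-bootstrap.
restarting_names() {
    awk '$2 == "Restarting" { print $1 }'
}

# Usage (requires a running Docker daemon):
#   docker ps -a --format '{{.Names}} {{.Status}}' | restarting_names
```

Reading from stdin keeps the filter independent of Docker itself, so it can also be run against saved output.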

Partial log outputs:

docker logs dash_masternode_evonet_drive_tendermint_1:

...
goroutine 1 [running]:
github.com/tendermint/tendermint/privval.FilePVKey.Save(0xc000f0c8e0, 0x14, 0x20, 0x11c00e0, 0xc000f0c900, 0x11c00a0, 0xc0000f5c80, 0xc0009fc760, 0x1d)
    /go/src/github.com/tendermint/tendermint/privval/file.go:61 +0x182
github.com/tendermint/tendermint/privval.(*FilePV).Save(0xc0009fa460)
    /go/src/github.com/tendermint/tendermint/privval/file.go:262 +0xa1
github.com/tendermint/tendermint/cmd/tendermint/commands.initFilesWithConfig(0xc000ef2000, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:37 +0xbf6
github.com/tendermint/tendermint/cmd/tendermint/commands.initFiles(0x1876080, 0x18a09c8, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:23 +0x2d
github.com/spf13/cobra.(*Command).execute(0x1876080, 0x18a09c8, 0x0, 0x0, 0x1876080, 0x18a09c8)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:698 +0x42c
github.com/spf13/cobra.(*Command).ExecuteC(0x18773a0, 0x2, 0xc00000cb60, 0xf43e89)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:783 +0x2c9
github.com/spf13/cobra.(*Command).Execute(...)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:736
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0x18773a0, 0x1093810, 0x2, 0xc00011bba0)
    /go/src/github.com/tendermint/tendermint/libs/cli/setup.go:89 +0x3c
main.main()
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/main.go:45 +0x248
panic: open /data/write-file-atomic-8376114759814144188: permission denied

goroutine 1 [running]:
github.com/tendermint/tendermint/privval.FilePVKey.Save(0xc000f0c8e0, 0x14, 0x20, 0x11c00e0, 0xc000f0c900, 0x11c00a0, 0xc0000f5c80, 0xc0009fa760, 0x1d)
    /go/src/github.com/tendermint/tendermint/privval/file.go:61 +0x182
github.com/tendermint/tendermint/privval.(*FilePV).Save(0xc0009f8500)
    /go/src/github.com/tendermint/tendermint/privval/file.go:262 +0xa1
github.com/tendermint/tendermint/cmd/tendermint/commands.initFilesWithConfig(0xc000ef2000, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:37 +0xbf6
github.com/tendermint/tendermint/cmd/tendermint/commands.initFiles(0x1876080, 0x18a09c8, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:23 +0x2d
github.com/spf13/cobra.(*Command).execute(0x1876080, 0x18a09c8, 0x0, 0x0, 0x1876080, 0x18a09c8)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:698 +0x42c
github.com/spf13/cobra.(*Command).ExecuteC(0x18773a0, 0x2, 0xc00000cb80, 0xf43e89)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:783 +0x2c9
github.com/spf13/cobra.(*Command).Execute(...)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:736
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0x18773a0, 0x1093810, 0x2, 0xc000119ba0)
    /go/src/github.com/tendermint/tendermint/libs/cli/setup.go:89 +0x3c
main.main()
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/main.go:45 +0x248
panic: open /data/write-file-atomic-08291719565985660993: permission denied

goroutine 1 [running]:
github.com/tendermint/tendermint/privval.FilePVKey.Save(0xc000f088e0, 0x14, 0x20, 0x11c00e0, 0xc000f08900, 0x11c00a0, 0xc0000f5c80, 0xc0009f6760, 0x1d)
    /go/src/github.com/tendermint/tendermint/privval/file.go:61 +0x182
github.com/tendermint/tendermint/privval.(*FilePV).Save(0xc0009f25a0)
    /go/src/github.com/tendermint/tendermint/privval/file.go:262 +0xa1
github.com/tendermint/tendermint/cmd/tendermint/commands.initFilesWithConfig(0xc000ee8000, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:37 +0xbf6
github.com/tendermint/tendermint/cmd/tendermint/commands.initFiles(0x1876080, 0x18a09c8, 0x0, 0x0, 0x0, 0x0)
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/commands/init.go:23 +0x2d
github.com/spf13/cobra.(*Command).execute(0x1876080, 0x18a09c8, 0x0, 0x0, 0x1876080, 0x18a09c8)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:698 +0x42c
github.com/spf13/cobra.(*Command).ExecuteC(0x18773a0, 0x2, 0xc00000cba0, 0xf43e89)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:783 +0x2c9
github.com/spf13/cobra.(*Command).Execute(...)
    /go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:736
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0x18773a0, 0x1093810, 0x2, 0xc00011bba0)
    /go/src/github.com/tendermint/tendermint/libs/cli/setup.go:89 +0x3c
main.main()
    /go/src/github.com/tendermint/tendermint/cmd/tendermint/main.go:45 +0x248

docker logs dash_masternode_evonet_drive_abci_1:

...
> @dashevo/drive@0.13.2 abci /usr/src/app
> node scripts/abci

[2020-07-23T11:12:37.852Z] Connecting to MongoDB
{}
[2020-07-23T11:12:37.885Z] Connecting to Core
{}
OperationalError: Dash JSON-RPC: Request Error: getaddrinfo ENOTFOUND core
    at ClientRequest.<anonymous> (/node_modules/@dashevo/dashd-rpc/lib/index.js:131:19)
    at ClientRequest.emit (events.js:315:20)
    at Socket.socketErrorListener (_http_client.js:426:9)
    at Socket.emit (events.js:315:20)
    at emitErrorNT (internal/streams/destroy.js:92:8)
    at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
    at processTicksAndRejections (internal/process/task_queues.js:84:21)
From previous event:
    at waitForCoreSync (/usr/src/app/lib/core/waitForCoreSyncFactory.js:21:60)
    at main (/usr/src/app/scripts/abci.js:24:9)
    at runMicrotasks (<anonymous>)
    at processTicksAndRejections (internal/process/task_queues.js:97:5) {
  cause: Error: Dash JSON-RPC: Request Error: getaddrinfo ENOTFOUND core
      at ClientRequest.<anonymous> (/node_modules/@dashevo/dashd-rpc/lib/index.js:131:19)
      at ClientRequest.emit (events.js:315:20)
      at Socket.socketErrorListener (_http_client.js:426:9)
      at Socket.emit (events.js:315:20)
      at emitErrorNT (internal/streams/destroy.js:92:8)
      at emitErrorAndCloseNT (internal/streams/destroy.js:60:3)
      at processTicksAndRejections (internal/process/task_queues.js:84:21),
  isOperational: true
}
npm ERR! code ELIFECYCLE
npm ERR! errno 1
npm ERR! @dashevo/drive@0.13.2 abci: `node scripts/abci`
npm ERR! Exit status 1
npm ERR! 
npm ERR! Failed at the @dashevo/drive@0.13.2 abci script.
npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
npm WARN Local package.json exists, but node_modules missing, did you mean to install?

npm ERR! A complete log of this run can be found in:
npm ERR!     /root/.npm/_logs/2020-07-23T11_12_37_917Z-debug.log

docker logs dash_masternode_evonet_core_1:

...
2020-07-23 11:13:40 Dash Core version v0.15.0.0
2020-07-23 11:13:40 InitParameterInteraction: parameter interaction: -masternodeblsprivkey=... -> setting -listen=1
2020-07-23 11:13:40 InitParameterInteraction: parameter interaction: -masternodeblsprivkey=... -> setting -disablewallet=1
2020-07-23 11:13:40 InitParameterInteraction: parameter interaction: -externalip set -> setting -discover=0
2020-07-23 11:13:40 InitParameterInteraction: parameter interaction: -whitelistforcerelay=1 -> setting -whitelistrelay=1
2020-07-23 11:13:40 InitParameterInteraction: parameter interaction: additional indexes -> setting -checklevel=4
2020-07-23 11:13:40 Validating signatures for all blocks.
2020-07-23 11:13:40 Setting nMinimumChainWork=0000000000000000000000000000000000000000000000000000000000000000
2020-07-23 11:13:40 Using the 'sse4(1way),sse41(4way),avx2(8way)' SHA256 implementation
2020-07-23 11:13:40 Using RdRand as an additional entropy source
2020-07-23 11:13:40 

************************
Exception: type=boost::filesystem::filesystem_error, what="boost::filesystem::create_directory: Permission denied: "/dash/data/devnet-evonet-4""
No debug information available for stacktrace. You should add debug information and then run:
dashd -printcrashinfo=bvcgc43iinzgc43ijfxgm3ybaacwiyltnbsjarlymnsxa5djn5xduidupfygkplcn5xxg5b2hjtgs3dfon4xg5dfnu5duztjnrsxg6ltorsw2x3fojzg64rmeb3wqyluhurge33pon2duotgnfwgk43zon2gk3j2hjrxezlborsv6zdjojswg5dpoj4tuicqmvzg22ltonuw63ramrsw42lfmq5cairpmrqxg2bpmrqxiyjpmrsxm3tfoqwwk5tpnzsxiljueiratlzopaaaaaaaadnta6aaaaaaaafgu5lqaaaaaaae4cymaaaaaaaapikqyaaaaaaaat62beaaaaaaabg3scaaaaaaaabqza5xbvzjaaanvsyjaaaaaaaaaa======

2020-07-23 11:13:40 PrepareShutdown: In progress...
2020-07-23 11:13:40 RenameThread: thread new name dash-shutoff
2020-07-23 11:13:40 Posix Signal: Segmentation fault
No debug information available for stacktrace. You should add debug information and then run:
dashd -printcrashinfo=bvcgc43iinzgc43ijfxgm3ybaacwiyltnbscaudponuxqictnftw4ylmhiqfgzlhnvsw45dboruw63ramzqxk3dubh3uovyaaaaaaaeq4puxbvzjaaabquj4aaaaaaaay5zqyaaaaaaabbmibqaaaaaaad65uciaaaaaaacnxeeaaaaaaaadbsb3odlssaaa3lfqsaaaaaaaaaa=

************************
Exception: type=boost::filesystem::filesystem_error, what="boost::filesystem::create_directory: Permission denied: "/dash/data/devnet-evonet-4""
No debug information available for stacktrace. You should add debug information and then run:
dashd -printcrashinfo=bvcgc43iinzgc43ijfxgm3ybaacwiyltnbsjarlymnsxa5djn5xduidupfygkplcn5xxg5b2hjtgs3dfon4xg5dfnu5duztjnrsxg6ltorsw2x3fojzg64rmeb3wqyluhurge33pon2duotgnfwgk43zon2gk3j2hjrxezlborsv6zdjojswg5dpoj4tuicqmvzg22ltonuw63ramrsw42lfmq5cairpmrqxg2bpmrqxiyjpmrsxm3tfoqwwk5tpnzsxiljueiratlzopaaaaaaaadnta6aaaaaaaafgu5lqaaaaaaae4cymaaaaaaaapikqyaaaaaaaat62beaaaaaaabg3scaaaaaaaabqza5xbvzjaaanvsyjaaaaaaaaaa======

Posix Signal: Segmentation fault
No debug information available for stacktrace. You should add debug information and then run:
dashd -printcrashinfo=bvcgc43iinzgc43ijfxgm3ybaacwiyltnbscaudponuxqictnftw4ylmhiqfgzlhnvsw45dboruw63ramzqxk3dubh3uovyaaaaaaaeq4puxbvzjaaabquj4aaaaaaaay5zqyaaaaaaabbmibqaaaaaaad65uciaaaaaaacnxeeaaaaaaaadbsb3odlssaaa3lfqsaaaaaaaaaa=
2020-07-23 11:14:41 

Not sure if this is related, but in the system log I see repeated entries like these:

Jul 23 11:16:33 xxx systemd-udevd[14604]: link_config: could not get ethtool features for vethb9e5dd7
Jul 23 11:16:33 xxx systemd-udevd[14604]: Could not set offload features of vethb9e5dd7: No such device
...
Jul 23 11:16:33 xxx kernel: IPv6: ADDRCONF(NETDEV_UP): vethc0aa236: link is not ready
Jul 23 11:16:33 xxx kernel: br-4494172b0db4: port 1(vethc0aa236) entered blocking state
Jul 23 11:16:33 xxx kernel: br-4494172b0db4: port 1(vethc0aa236) entered forwarding state
Jul 23 11:16:33 xxx systemd-udevd[14606]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul 23 11:16:33 xxx systemd-udevd[14606]: Using default interface naming scheme 'v240'.
Jul 23 11:16:33 xxx systemd-udevd[14606]: Could not generate persistent MAC address for vethc0aa236: No such file or directory
Jul 23 11:16:33 xxx systemd-udevd[14604]: link_config: autonegotiation is unset or enabled, the speed and duplex are not writable.
Jul 23 11:16:33 xxx systemd-udevd[14604]: Using default interface naming scheme 'v240'.
Jul 23 11:16:33 xxx systemd-udevd[14604]: Could not generate persistent MAC address for veth0c83c78: No such file or directory

Steps to Reproduce

  1. Set up dependencies and distribution package, per README
  2. Start a masternode: mn start evonet <my-ip-addr> 9999 -p <operator-key>

Context

Although the masternode appears to be enabled and running when queried with the DashMasternodeTool, after some time its status changes to POSE_BANNED.

Your Environment

shumkov commented 4 years ago

Please try to do chown -R 777 data from your mn-bootstrap dir and restart node.

dkaparis commented 4 years ago

Please try to do chown -R 777 data from your mn-bootstrap dir and restart node.

I suppose you mean chmod -R 777 data

This appears to solve the issue - all containers stay up without restarting so far. Perhaps it should be added to the documentation.

shumkov commented 4 years ago

Oh, sorry, my bad.

Perhaps it should be added to the documentation.

@strophy Could you update the README, please?

strophy commented 4 years ago

I cannot reproduce this bug after 5-6 attempts at setting up from scratch, following different sequences. I think making the datadir globally writable is a hack. I would rather find out how we are ending up in this state, since two separate users have now come forward with this problem. It's most likely a missing step during setup. @dkaparis can you verify you added your non-root user to the docker group and refreshed your environment as described here?
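One way to investigate how the datadir ends up in this state is to look for entries that the invoking user does not own, for example files created while running as root. This is only a minimal sketch: not_owned_by_me is a hypothetical helper, and the data path is the one mentioned in the comments above.

```shell
# List everything under a directory that is NOT owned by the current user.
# An empty result means every entry belongs to the current user.
# not_owned_by_me is a hypothetical helper, not part of mn-bootstrap.
not_owned_by_me() {
    find "$1" ! -user "$(id -u)" -print
}

# Usage, from the mn-bootstrap directory:
#   not_owned_by_me data
```

Any paths it prints are candidates for the "permission denied" errors in the logs, though the user a container runs as may also matter.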

dkaparis commented 4 years ago

... @dkaparis can you verify you added your non-root user to the docker group and refreshed your environment as described here?

I ran the whole setup, and I start the node, from the root account, so no, I didn't. Maybe that's where the problem lies; if so, it should be noted. As it stands, these post-installation steps are described as optional.

strophy commented 4 years ago

The post-installation steps are only optional if you do not use the CLI. The CLI must be run as a non-root user, or the container will not be able to write to the filesystem. I think I have tracked down the error: you need to make sure that the npm install command in particular is NOT run as the root user, i.e. do not use sudo. I worked through this with another user experiencing the same problem, and we narrowed it down to the bare minimum of steps for a clean install under Ubuntu 20.04 LTS, starting as root:

apt update
apt install docker.io docker-compose nodejs npm
adduser strophy
usermod -aG sudo,docker strophy
su - strophy
git clone https://github.com/dashevo/mn-bootstrap
cd mn-bootstrap
nano configs/evonet/core/dashd.conf  # temporary workaround: bump evonet version to evonet-5
npm install
sudo npm link
mn start evonet <ip> 19999 -p <bls-privkey>

Can you please try this and confirm it works? Then I'll look into making the documentation clearer.
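The two preconditions described above (not running as root, and membership in the docker group) could be checked before npm install with something like the following. This is a sketch under assumptions: preflight is a hypothetical helper, not part of the mn CLI, and it only inspects the uid and group names passed to it.

```shell
# Pre-flight check: the user must not be root and must be in the docker group.
# Arguments: numeric uid, space-separated list of group names.
# preflight is a hypothetical helper, not part of the mn CLI.
preflight() {
    uid="$1"
    user_groups="$2"
    if [ "$uid" -eq 0 ]; then
        echo "refusing to continue: running as root"
        return 1
    fi
    case " $user_groups " in
        *" docker "*) ;;
        *)
            echo "user is not in the docker group"
            return 1
            ;;
    esac
    echo "ok"
}

# Usage:
#   preflight "$(id -u)" "$(id -Gn)" && npm install
```

Passing the uid and groups as arguments, rather than calling id inside the function, makes the check easy to try against other accounts.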

dkaparis commented 4 years ago

Following the above procedure from a clean install (modified for Debian 10), all containers run except for dash_masternode_evonet_drive_tendermint_1 which is again restarting.

docker logs dash_masternode_evonet_drive_tendermint_1

...
goroutine 1 [running]:
github.com/tendermint/tendermint/privval.FilePVKey.Save(0xc000f228e0, 0x14, 0x20, 0x11c4440, 0xc000f22900, 0x11c4400, 0xc0000f5c80, 0xc0002ec760, 0x1d)
    /home/archie/Documents/tendermint/privval/file.go:61 +0x182
github.com/tendermint/tendermint/privval.(*FilePV).Save(0xc0002e83c0)
    /home/archie/Documents/tendermint/privval/file.go:262 +0xa1
github.com/tendermint/tendermint/cmd/tendermint/commands.initFilesWithConfig(0xc000f04000, 0x0, 0x0)
    /home/archie/Documents/tendermint/cmd/tendermint/commands/init.go:37 +0xbf6
github.com/tendermint/tendermint/cmd/tendermint/commands.initFiles(0x187d0a0, 0x18a7a48, 0x0, 0x0, 0x0, 0x0)
    /home/archie/Documents/tendermint/cmd/tendermint/commands/init.go:23 +0x2d
github.com/spf13/cobra.(*Command).execute(0x187d0a0, 0x18a7a48, 0x0, 0x0, 0x187d0a0, 0x18a7a48)
    /home/archie/go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:698 +0x42c
github.com/spf13/cobra.(*Command).ExecuteC(0x187e3c0, 0x2, 0xc00000cbc0, 0xf47739)
    /home/archie/go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:783 +0x2c9
github.com/spf13/cobra.(*Command).Execute(...)
    /home/archie/go/pkg/mod/github.com/spf13/cobra@v0.0.1/command.go:736
github.com/tendermint/tendermint/libs/cli.Executor.Execute(0x187e3c0, 0x10970b8, 0x2, 0xc000135ba0)
    /home/archie/Documents/tendermint/libs/cli/setup.go:89 +0x3c
main.main()
    /home/archie/Documents/tendermint/cmd/tendermint/main.go:45 +0x248
panic: open /data/write-file-atomic-246361497058367640: permission denied

shumkov commented 4 years ago

This one is related to #78

shumkov commented 4 years ago

Is this still relevant, now that we have moved the data to Docker volumes?

strophy commented 4 years ago

Right, I'm fairly sure this can be closed because docker volumes handle permissions much better than bind mounting the datadir. @dkaparis please open another issue if you are still having problems.