paritytech / polkadot-sdk

The Parity Polkadot Blockchain SDK
https://polkadot.network/

Polkadot > 1.3.0 cannot unshare user namespace permission denied #3018

Closed: yannickhilber closed this issue 9 months ago

yannickhilber commented 9 months ago

After upgrading from v1.3.0 to v1.6.0, the container exits with code 1 because permission is denied when unsharing the user and mount namespaces.

Starting the container with --insecure-validator-i-know-what-i-do bypasses the error.
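For anyone who wants to reproduce the failing call outside of Polkadot, here is a minimal Python sketch. It is not how the node performs its check; it only attempts the same kind of syscall, unshare with the user and mount namespace flags, in a forked child and reports the resulting errno. The function name `probe_unshare` is made up for this illustration. In a container where the runtime blocks the call, the result would likely be EPERM, which matches the "Operation not permitted (os error 1)" line in the log below.

```python
import ctypes
import errno
import os

# Flag values from <linux/sched.h>.
CLONE_NEWNS = 0x00020000    # new mount namespace
CLONE_NEWUSER = 0x10000000  # new user namespace


def probe_unshare() -> tuple[bool, str]:
    """Try unshare(CLONE_NEWUSER | CLONE_NEWNS) in a forked child.

    A child process is used so the caller keeps its own namespaces.
    Returns (succeeded, human-readable detail).
    Hypothetical helper for illustration only.
    """
    libc = ctypes.CDLL(None, use_errno=True)
    unshare = getattr(libc, "unshare", None)
    if unshare is None:
        return False, "unshare() is not available on this platform"

    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        # Child: attempt the call and report errno through the pipe.
        try:
            os.close(read_fd)
            rc = unshare(CLONE_NEWUSER | CLONE_NEWNS)
            code = ctypes.get_errno() if rc != 0 else 0
            os.write(write_fd, str(code).encode())
        finally:
            os._exit(0)

    # Parent: read the child's result and reap it.
    os.close(write_fd)
    with os.fdopen(read_fd) as pipe:
        data = pipe.read()
    os.waitpid(pid, 0)

    if not data:
        return False, "child exited without reporting a result"
    code = int(data)
    if code == 0:
        return True, "unshare succeeded"
    return False, f"{errno.errorcode.get(code, code)}: {os.strerror(code)}"


if __name__ == "__main__":
    ok, detail = probe_unshare()
    print("unshare user+mount namespaces:", "ok" if ok else "failed", "-", detail)
```

Running this inside the same container image and again on the host should show whether the restriction comes from the container runtime or from the host itself.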

Kernel version on Ubuntu 20.04:

root@validator59:/etc/docker/compose/polkadot01# uname -r
5.4.0-144-generic

Docker version:

root@validator59:/etc/docker/compose/polkadot01# docker --version
Docker version 23.0.4, build f480fb1

Container log:

polkadot01    | 2024-01-22 14:33:58 Parity Polkadot
polkadot01    | 2024-01-22 14:33:58 ✌️  version 1.6.0-680adc78ab3
polkadot01    | 2024-01-22 14:33:58 ❤️  by Parity Technologies <admin@parity.io>, 2017-2024
polkadot01    | 2024-01-22 14:33:58 📋 Chain specification: Polkadot
polkadot01    | 2024-01-22 14:33:58 🏷  Node name: polkadot01
polkadot01    | 2024-01-22 14:33:58 👤 Role: AUTHORITY
polkadot01    | 2024-01-22 14:33:58 💾 Database: RocksDb at /polkadot/.local/share/polkadot/chains/polkadot/db/full
polkadot01    | 2024-01-22 14:34:01 Can't use fast sync mode with a partially synced database. Reverting to full sync mode.
polkadot01    | 2024-01-22 14:34:01 🏷  Local node identity is: 12D3KooWSJHpn2rhDKfSuVVjaAuPhAyaejML9gALxbqV4AzQu1RU
polkadot01    | 2024-01-22 14:34:01 🚀 Using prepare-worker binary at: "/usr/lib/polkadot/polkadot-prepare-worker"
polkadot01    | 2024-01-22 14:34:01 🚀 Using execute-worker binary at: "/usr/lib/polkadot/polkadot-execute-worker"
polkadot01    | 2024-01-22 14:34:01 💻 Operating system: linux
polkadot01    | 2024-01-22 14:34:01 💻 CPU architecture: x86_64
polkadot01    | 2024-01-22 14:34:01 💻 Target environment: gnu
polkadot01    | 2024-01-22 14:34:01 💻 CPU: AMD EPYC 7402 24-Core Processor
polkadot01    | 2024-01-22 14:34:01 💻 CPU cores: 48
polkadot01    | 2024-01-22 14:34:01 💻 Memory: 128707MB
polkadot01    | 2024-01-22 14:34:01 💻 Kernel: 5.4.0-144-generic
polkadot01    | 2024-01-22 14:34:01 💻 Linux distribution: Ubuntu 22.04.3 LTS
polkadot01    | 2024-01-22 14:34:01 💻 Virtual machine: no
polkadot01    | 2024-01-22 14:34:01 📦 Highest known block at #19153649
polkadot01    | 2024-01-22 14:34:01 〽️ Prometheus exporter started at 0.0.0.0:9615
polkadot01    | 2024-01-22 14:34:01 Running JSON-RPC server: addr=0.0.0.0:9944, allowed origins=["*"]
polkadot01    | 2024-01-22 14:34:01 🏁 CPU score: 764.86 MiBs
polkadot01    | 2024-01-22 14:34:01 🏁 Memory score: 15.07 GiBs
polkadot01    | 2024-01-22 14:34:01 🏁 Disk score (seq. writes): 940.07 MiBs
polkadot01    | 2024-01-22 14:34:01 🏁 Disk score (rand. writes): 431.48 MiBs
polkadot01    | https://wiki.polkadot.network/docs/maintain-guides-how-to-validate-polkadot#reference-hardware
polkadot01    | 2024-01-22 14:34:01 👶 Starting BABE Authorship worker
polkadot01    | 2024-01-22 14:34:01 🥩 BEEFY gadget waiting for BEEFY pallet to become available...
polkadot01    | 2024-01-22 14:34:01 🚨 Your system cannot securely run a validator.
polkadot01    | Running validation of malicious PVF code has a higher risk of compromising this machine.
polkadot01    |   - Cannot enable landlock, a Linux 5.13+ kernel security feature: not available: Could not fully enable: NotEnforced
polkadot01    |   - Cannot unshare user namespace and change root, which are Linux-specific kernel security features: not available: unshare user and mount namespaces: Operation not permitted (os error 1)
polkadot01    | You can ignore this error with the `--insecure-validator-i-know-what-i-do` command line argument if you understand and accept the risks of running insecurely. With this flag, security features are enabled on a best-effort basis, but not mandatory.
polkadot01    | More information: https://wiki.polkadot.network/docs/maintain-guides-secure-validator#secure-validator-mode
polkadot01    | 2024-01-22 14:34:01 subsystem exited with error subsystem="candidate-validation" err=FromOrigin { origin: "candidate-validation", source: Context("could not enable Secure Validator Mode; check logs") }
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="candidate-validation"
polkadot01    | 2024-01-22 14:34:01 subsystem finished unexpectedly subsystem=Ok(())
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="candidate-backing"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="statement-distribution"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="availability-distribution"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="bitfield-signing"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="provisioner"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="runtime-api"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="network-bridge-rx"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="network-bridge-tx"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="chain-api"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="collator-protocol"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="collation-generation"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="availability-recovery"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="approval-distribution"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="dispute-distribution"
polkadot01    | 2024-01-22 14:34:01 received `Conclude` signal, exiting
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="gossip-support"
polkadot01    | 2024-01-22 14:34:01 Received `Conclude` signal, exiting
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="availability-store"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="pvf-checker"
polkadot01    | 2024-01-22 14:34:01 received `Conclude` signal, exiting
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="chain-selection"
polkadot01    | 2024-01-22 14:34:01 Conclude
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="bitfield-distribution"
polkadot01    | 2024-01-22 14:34:01 Terminating due to subsystem exit subsystem="approval-voting"
polkadot01    | 2024-01-22 14:34:01 Sending fatal alert BadCertificate
polkadot01    | 2024-01-22 14:34:01 Sending fatal alert BadCertificate
polkadot01    | 2024-01-22 14:34:01 Sending fatal alert BadCertificate
polkadot01    | 2024-01-22 14:34:01 Sending fatal alert BadCertificate
polkadot01    | 2024-01-22 14:34:01 🔍 Discovered new external address for our node: /ip4/94.75.202.214/tcp/30333/p2p/12D3KooWSJHpn2rhDKfSuVVjaAuPhAyaejML9gALxbqV4AzQu1RU 
polkadot01    | 2024-01-22 14:34:02 subsystem exited with error subsystem="prospective-parachains" err=FromOrigin { origin: "prospective-parachains", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
polkadot01    | 2024-01-22 14:34:02 Essential task `overseer` failed. Shutting down service.
polkadot01    | 2024-01-22 14:34:02 subsystem exited with error subsystem="dispute-coordinator" err=FromOrigin { origin: "dispute-coordinator", source: SubsystemReceive(Generated(Context("Signal channel is terminated and empty."))) }
polkadot01    | 2024-01-22 14:34:02 GRANDPA voter error: safety invariant has been violated: `voter_commands_rx` was closed.
polkadot01    | 2024-01-22 14:34:02 Essential task `grandpa-voter` failed. Shutting down service.
polkadot01    | Error:
polkadot01    |    0: Other: Essential task failed.
polkadot01    |
polkadot01    |   ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ BACKTRACE ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
polkadot01    |                                  ⋮ 1 frame hidden ⋮
polkadot01    |    2: polkadot::main::h470d9519eab843f0
polkadot01    |       at <unknown source file>:<unknown line>
polkadot01    |    3: std::sys_common::backtrace::__rust_begin_short_backtrace::hd66affa3c73dfe2b
polkadot01    |       at <unknown source file>:<unknown line>
polkadot01    |    4: main<unknown>
polkadot01    |       at <unknown source file>:<unknown line>
polkadot01    |    5: __libc_start_main<unknown>
polkadot01    |       at <unknown source file>:<unknown line>
polkadot01    |    6: _start<unknown>
polkadot01    |       at <unknown source file>:<unknown line>
polkadot01    |
polkadot01    | Run with COLORBT_SHOW_HIDDEN=1 environment variable to disable frame filtering.
polkadot01    | Run with RUST_BACKTRACE=full to include source snippets.
polkadot01 exited with code 1

bkchr commented 9 months ago

CC @mrcnski

mrcnski commented 9 months ago

Hey @yannickhilber, thanks for the report. I guess you were getting a warning on v1.3.0, it just was not a hard error yet like it is now. If you have the old logs you could double-check that; if the warning was not there, it means it's probably a regression introduced in the recent versions.

But this seems expected, since containers may prevent some of our security features. We have landlock as an alternative filesystem protection to unshare/pivot_root, but it's only available on Linux 5.13+. If possible, I'd suggest upgrading the Linux version, otherwise you can continue running with the --insecure-validator-i-know-what-i-do flag.

I would suggest the first option myself, because while containers provide some security, container escapes are rather common. The risk is malicious code stealing your keys. But it's up to you to decide whether you accept the risk, and when running with this flag you will still see warnings to alert you to other missing security features.
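The Linux 5.13+ requirement for landlock mentioned above can be checked across a fleet with a small script. The following is only a sketch: the helper names are hypothetical, and a new enough kernel is a necessary condition rather than a guarantee, since landlock may still be disabled in the kernel build or boot configuration.

```python
import platform
import re

# Landlock was merged in Linux 5.13.
LANDLOCK_MIN_KERNEL = (5, 13)


def parse_kernel_release(release: str) -> tuple[int, int]:
    """Extract (major, minor) from a release string such as '5.4.0-144-generic'."""
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError(f"unrecognised kernel release: {release!r}")
    return int(match.group(1)), int(match.group(2))


def kernel_may_support_landlock(release: str) -> bool:
    """True if the kernel version is at least 5.13 (version check only)."""
    return parse_kernel_release(release) >= LANDLOCK_MIN_KERNEL


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r`
    print(release, "may support landlock:", kernel_may_support_landlock(release))
```

With the kernel from this report, `kernel_may_support_landlock("5.4.0-144-generic")` evaluates to False, which is consistent with the "Cannot enable landlock" line in the log.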

yannickhilber commented 9 months ago

> I guess you were getting a warning on v1.3.0, it just was not a hard error yet like it is now. If you have the old logs you could double-check that; if the warning was not there, it means it's probably a regression introduced in the recent versions.

I don't remember seeing that warning on v1.3.0, so I can't say. Unfortunately, I no longer have access to the old logs.

> it's only available on Linux 5.13+. If possible, I'd suggest upgrading the Linux version, otherwise you can continue running with the --insecure-validator-i-know-what-i-do flag.

This solution seems achievable; we need to upgrade the kernels across our fleet to 5.13 or newer.

Thanks for your fast support.