status-im / nimbus-eth2

Nim implementation of the Ethereum Beacon Chain
https://nimbus.guide

Error writing to data dir volume #3700

Open cbermudez97 opened 2 years ago

cbermudez97 commented 2 years ago

Describe the bug

I'm running a beacon node using docker compose while persisting the node data to a volume. On startup, the node fails after reporting permission errors for the data dir I used. Initially, my data folder has permissions 755. When starting the node, it fails with this:

cl-beacon-nimbus  | [Chronicles] Log message not delivered: [Chronicles] A writer was not configured for a dynamic log output device. Log message not delivered: WRN 2022-06-03 15:33:16.602+00:00 Data directory has insecure permissions. Correcting them. data_dir=/data current_permissions=0755 required_permissions=0700

Following the instructions there, I changed my data folder permissions to 700. Running the node again then shows this:

cl-beacon-nimbus  | /home/user/nimbus-eth2/vendor/nim-libp2p/libp2p/stream/bufferstream.nim(438) NimMain
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(2069) main
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1937) handleStartUpCmd
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1690) doRunBeaconNode
cl-beacon-nimbus  | /home/user/nimbus-eth2/beacon_chain/nimbus_beacon_node.nim(1503) createPidFile
cl-beacon-nimbus  | /home/user/nimbus-eth2/vendor/nimbus-build-system/vendor/Nim/lib/system/io.nim(706) writeFile
cl-beacon-nimbus  | Error: unhandled exception: cannot open: /data/beacon_node.pid [IOError]
cl-beacon-nimbus exited with code 1
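The trace ends in createPidFile, which cannot open /data/beacon_node.pid for writing. That step can be approximated without Nimbus by trying to create a file in the data directory as the user in question. The following is only a minimal sketch: the function name `probe_write` and the probe file name are made up for illustration and are not part of Nimbus.

```shell
#!/bin/sh
# Hypothetical helper: check whether the current user can create a file in a
# directory, mirroring the failing write of /data/beacon_node.pid.
probe_write() {
    dir=$1
    probe="$dir/.write_probe.$$"    # illustrative file name, not used by Nimbus
    if touch "$probe" 2>/dev/null; then
        rm -f "$probe"              # clean up so the data dir is left untouched
        echo "writable: $dir"
    else
        echo "not writable: $dir"
        return 1
    fi
}
```

Running `probe_write ./data` as the same user ID the container process uses should show whether the failure is a plain filesystem permission problem.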

To Reproduce

  1. Create docker-compose.yml with:
    version: '3.9'
    services:
      beacon:
        stop_grace_period: 30s
        container_name: cl-beacon-nimbus
        restart: unless-stopped
        image: statusim/nimbus-eth2:amd64-latest
        volumes:
          - ./data:/data
          - ./jwtsecret:/tmp/jwt/jwtsecret
        ports:
          - 9000:9000/tcp
          - 9000:9000/udp
          - 5054:5054/tcp
        expose:
          - 5051
        command:
          - --network=merge-testnets/kiln
          - --data-dir=/data
          - --tcp-port=9000
          - --udp-port=9000
          - --web3-url=ws://127.0.0.1:8151
          - --max-peers=50
          - --rest
          - --rest-address=0.0.0.0
          - --rest-port=5051
          - --rest-allow-origin=*
          - --metrics
          - --metrics-address=0.0.0.0
          - --metrics-port=5054
          - --jwt-secret="/tmp/jwt/jwtsecret"
          - --terminal-total-difficulty-override=100000000000000000000000
        logging:
          driver: "json-file"
          options:
            max-size: "10m"
            max-file: "10"
  2. Run openssl rand -hex 32 > jwtsecret
  3. Create ./data and change its permissions to 755
  4. Run docker compose up beacon
  5. Change ./data permissions to 700
  6. Run docker compose up beacon

Additional

All of the above commands were run as root.

mugiwara-pirate commented 12 months ago

I can reproduce this issue. More docs are needed about running in docker.

zah commented 12 months ago

Besides having the right permissions, the data dir should be owned by the same user ID that is used by the docker process. The correct usage of user IDs within docker is a somewhat complicated topic, which is explored in the following article:

https://medium.com/@mccode/understanding-how-uid-and-gid-work-in-docker-containers-c37a01d01cf
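Both conditions (owner and mode) can be checked on the host before starting the container. The following is a rough sketch under some assumptions: it relies on GNU coreutils `stat` (Linux), the expected UID is passed in by the caller rather than detected, and `check_data_dir` is a hypothetical helper name. The required mode 700 is taken from the Nimbus warning quoted earlier in this issue.

```shell
#!/bin/sh
# Hypothetical helper: compare the owner UID and mode of a host data dir with
# what the container process needs. Assumes GNU coreutils `stat`.
check_data_dir() {
    dir=$1
    want_uid=$2    # UID of the process inside the container (caller-supplied)
    have_uid=$(stat -c '%u' "$dir") || return 2
    have_mode=$(stat -c '%a' "$dir") || return 2
    if [ "$have_uid" != "$want_uid" ]; then
        echo "owner mismatch: $dir is owned by uid $have_uid, expected uid $want_uid"
        return 1
    fi
    if [ "$have_mode" != "700" ]; then
        echo "mode mismatch: $dir has mode $have_mode, Nimbus expects 700"
        return 1
    fi
    echo "ok: $dir is owned by uid $want_uid with mode 700"
}
```

For example, `check_data_dir ./data 1000` would report a mismatch for a directory created by root, which matches the setup described in this issue. The value 1000 here is an assumption about the user inside the image.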

mugiwara-pirate commented 12 months ago

@zah Thanks for the link. Indeed, executing the following command makes the error go away in my case:

sudo chown 1000:1000 -R <MY_HOST_DATA_DIR>

But perhaps some updates could be made so that such manual configuration is not necessary for users.
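Until then, the manual step can be scripted so that it runs before the container starts. This is a sketch, not an official procedure: it assumes GNU coreutils `stat`, and that UID/GID 1000 (taken from the command above) matches the user inside the image. `prepare_data_dir` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: create the data dir, set mode 700, and hand it to the
# given UID:GID only if it is not already owned by them.
prepare_data_dir() {
    dir=$1
    uid=$2
    gid=$3
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    if [ "$(stat -c '%u:%g' "$dir")" != "$uid:$gid" ]; then
        # Changing ownership to another user usually requires root (sudo).
        chown -R "$uid:$gid" "$dir" || return 1
    fi
}
```

A typical call would be `prepare_data_dir ./data 1000 1000` followed by `docker compose up beacon`.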

mratsim commented 5 months ago

Hit by this as well. But I'm trying to run Nimbus rootless in Podman (https://github.com/containers/podman/blob/main/docs/tutorials/rootless_tutorial.md)

I suspect that the cause of these woes is that the binary is built as the user "user", as is done here: https://github.com/status-im/nimbus-eth2/blob/7c731a2bfb4820ef5c08e5e35df635b986ed4857/docker/dist/Dockerfile.amd64#L6C1-L19

Instead of putting the binaries in /home/user, they can likely be put in /usr/local/bin, which would remove the dependency on this "user" user.
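As a possible stopgap for the rootless case: container UIDs are mapped into the invoking user's subordinate UID range, so a plain host-side chown to 1000:1000 generally does not produce a directory owned by UID 1000 inside the container. Podman provides `podman unshare` to run a command inside that user namespace. The sketch below has not been verified against the Nimbus image. UID/GID 1000, the `rootless_chown` name, and the `DRY_RUN` switch are all assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper: chown a host directory from inside the rootless Podman
# user namespace, so that the IDs are interpreted as container-side IDs.
# Set DRY_RUN=1 to print the command instead of running it.
rootless_chown() {
    uid=$1
    gid=$2
    dir=$3
    set -- podman unshare chown -R "$uid:$gid" "$dir"
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "$*"
    else
        "$@"
    fi
}
```

For example, `DRY_RUN=1 rootless_chown 1000 1000 "$HOME/pod-data/taiko-a6-katla"` prints the command that would be run against the pod volume used below.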

Some references:

Podman CLI commands

requires mapping a volume to /home/node:

podman pod create \
    --name taiko-a6-katla \
    --volume $HOME/pod-data/taiko-a6-katla:/home/node

Lighthouse (working)

podman run -dt \
  --pod taiko-a6-katla \
  --name tko-a6-l1-cl-lighthouse \
    docker.io/sigp/lighthouse:latest-modern \
      lighthouse bn \
        --datadir /home/node/l1-cl/lighthouse \
        --network holesky \
        --execution-endpoint http://localhost:8551 \
        --execution-jwt /home/node/jwtsecret \
        --http \
        --http-address 0.0.0.0 \
        --metrics \
        --metrics-address 0.0.0.0 \
        --checkpoint-sync-url https://checkpoint-sync.holesky.ethpandaops.io

Nimbus

podman run -dt \
  --pod taiko-a6-katla \
  --name tko-a6-l1-cl-nimbus-checkpoint-sync \
    docker.io/statusim/nimbus-eth2:amd64-latest \
        trustedNodeSync \
        --data-dir=/home/node/l1-cl/nimbus/beacon_node \
        --network=holesky \
        --non-interactive \
        --web3-url=http://localhost:8551 \
        --with-deposit-snapshot \
        --backfill=false \
        --trusted-node-url=http://testing.holesky.beacon-api.nimbus.team