hyperledger / besu

An enterprise-grade Java-based, Apache 2.0 licensed Ethereum client https://wiki.hyperledger.org/display/besu
https://www.hyperledger.org/projects/besu
Apache License 2.0

Unable to start Hyperledger Besu node on Openshift with error: Nat manager failed to configure itself automatically due to the following reason : #6724

Open yanosh1982 opened 8 months ago

yanosh1982 commented 8 months ago

I'm trying to deploy a Hyperledger Besu node inside an OpenShift/Kubernetes cluster. For test purposes I'm deploying it as a ReplicaSet, but I will switch to a StatefulSet once it is up, running, and synchronizing with the other peers (which run outside the cluster).

When the pod starts, it prints the following:

2024-03-13 16:05:19.397+00:00 | main | INFO | Runner | Starting Ethereum main loop ...
2024-03-13 16:05:19.398+00:00 | main | INFO | KubernetesNatManager | Starting kubernetes NAT manager.
...
2024-03-13 16:05:22.007+00:00 | main | DEBUG | NatService | Nat manager failed to configure itself automatically due to the following reason : . NONE mode will be used

If I switch nat-method to NONE, how can I determine the advertised enode address?

Thank you in advance
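(For reference: with nat-method NONE, Besu advertises the host given by p2p-host, so one approach is to inject the pod IP via the Downward API and set it explicitly. A config sketch, where the placeholder value is an assumption to be filled in:)

```toml
# Sketch: advertise the pod IP explicitly instead of relying on NAT detection.
# The advertised enode then becomes enode://<node-public-key>@<pod-ip>:30303.
nat-method="NONE"
p2p-host="<pod-ip, e.g. injected via the Downward API status.podIP>"
p2p-port=30303
```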

My K8S manifests:

RBAC

  ---
  apiVersion: v1
  kind: ServiceAccount
  metadata:
    name: rpc-sa
    namespace: node-rpc-lab

  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: Role
  metadata:
    name: rpc-service-read-role
    namespace: node-rpc-lab
  rules:
  - apiGroups: [""]
    resources: ["services"]
    verbs: ["get", "list"]
  ---
  apiVersion: rbac.authorization.k8s.io/v1
  kind: RoleBinding
  metadata:
    name: rpc-rb
    namespace: node-rpc-lab
  roleRef:
    apiGroup: rbac.authorization.k8s.io
    kind: Role
    name: rpc-service-read-role
  subjects:
  - kind: ServiceAccount
    name: rpc-sa
    namespace: node-rpc-lab

Service

  apiVersion: v1
  kind: Service
  metadata:
    name: besu-rpc-cluter-ip
    labels:
      app: besu-rpc-cluster-ip
  spec:
    type: ClusterIP
    selector:
      app: besu-rpc
    ports:
      - port: 8545
        targetPort: 8545
        protocol: TCP
        name: json-rpc
      - port: 8546
        targetPort: 8546
        protocol: TCP
        name: ws
      - port: 9545
        targetPort: 9545
        protocol: TCP
        name: metrics
      - port: 30303
        targetPort: 30303
        protocol: TCP
        name: rlpx
      - port: 30303
        targetPort: 30303
        protocol: UDP
        name: discovery

Genesis config map

apiVersion: v1
kind: ConfigMap
metadata:
  name: genesis-config-map
data:
  genesis.json: |
    ...

Config toml config map

  apiVersion: v1
  kind: ConfigMap
  metadata:
    name: node-config-map
  data:
    config.toml: |
      logging="DEBUG"
      data-path="/opt/besu/data"
      host-whitelist=["*"]
      genesis-file="/configs/genesis/genesis.json"
      node-private-key-file="/opt/besu/keys/node.key"

      #bootnodes
      bootnodes=["enode://pubkey@ip-on-premise-bootnode:30303"] # node outside the K8S cluster

      # rpc
      rpc-http-enabled=true
      rpc-http-host="0.0.0.0"
      rpc-http-port=8545
      rpc-http-cors-origins=["*"]

      # ws
      rpc-ws-enabled=true
      rpc-ws-host="0.0.0.0"
      rpc-ws-port=8546

      # metrics
      metrics-enabled=true
      metrics-host="0.0.0.0"
      metrics-port=9545

      #nat e p2p
      p2p-port=30303

      min-gas-price=0

      rpc-http-api=["WEB3","ETH","NET","IBFT","ADMIN","PERM"]
      rpc-ws-api=["WEB3","ETH","NET","IBFT","ADMIN","PERM"]

      permissions-accounts-contract-address="0x0000000000000000000000000000000000008888"
      permissions-nodes-contract-address="0x0000000000000000000000000000000000009999"

      permissions-nodes-contract-enabled=true
      permissions-accounts-contract-enabled=true
      permissions-nodes-contract-version=2

Deployment

  apiVersion: apps/v1
  kind: Deployment
  metadata:
    name: besu-rpc
    labels:
      app: besu-rpc
  spec:
    replicas: 1
    selector:
      matchLabels:
        app: besu-rpc
    template:
      metadata:
        labels:
          app: besu-rpc
      spec:
        serviceAccountName: rpc-sa
        containers:
        - name: besu-rpc
          image: hyperledger-besu:22.4.0
          imagePullPolicy: IfNotPresent
          resources:
            requests:
              cpu: 100m
              memory: 1024Mi
            limits:
              cpu: 500m
              memory: 2048Mi
          env:
            - name: POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
          ports:
            - containerPort: 8545
              name: json-rpc
              protocol: TCP
            - containerPort: 8546
              name: ws
              protocol: TCP
            - containerPort: 30303
              name: rlpx
              protocol: TCP
            - containerPort: 30303
              name: discovery
              protocol: UDP
          volumeMounts:
            - name: genesis-config-volume
              mountPath: /configs/genesis
            - name: node-config-volume
              mountPath: /configs/node
            - name: data
              mountPath: /opt/besu/data
            - name: key
              mountPath: /opt/besu/keys
          command:
            - /bin/sh
            - -c
          args:
            - |
              exec /opt/besu/bin/besu \
                --config-file=/configs/node/config.toml \
                --nat-method=KUBERNETES --Xdns-enabled=true --Xdns-update-enabled=true --Xnat-kube-service-name=besu-rpc-cluter-ip
          livenessProbe:
            httpGet:
              path: /liveness
              port: 8545
          readinessProbe:
            httpGet:
              path: /liveness
              port: 8545
        volumes:
          - name: genesis-config-volume
            configMap:
              name: genesis-config-map
          - name: node-config-volume
            configMap:
              name: node-config-map
          - name: data
            emptyDir:
              sizeLimit: "1Gi"
          - name: key
            emptyDir:
              sizeLimit: "1Gi"

yanosh1982 commented 7 months ago

After changing the image version to 23.10.2, I get the following error:

2024-03-14 10:27:26.178+00:00 | main | INFO | KubernetesNatManager | Starting kubernetes NAT manager.
2024-03-14 10:27:26.191+00:00 | main | DEBUG | KubernetesNatManager | Trying to update information using Kubernetes client SDK.
...
2024-03-14 10:27:28.688+00:00 | main | WARN | NatService | Nat manager failed to configure itself automatically due to the following reason : org.hyperledger.besu.nat.core.exception.NatInitializationException: . NONE mode will be used as a fallback (set --Xnat-method-fallback-enabled=false to disable)
garyschulte commented 7 months ago

Does rpc-service-read-role have permission to list all services in all namespaces? Currently Besu's k8s NAT manager requires that permission, at least until #6088 is merged to allow restricting the scan to a single namespace.
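One quick way to check this from outside the pod is kubectl's built-in authorization check (a diagnostic sketch using the account and namespace from the manifests above; impersonation requires suitable admin rights):

```shell
# Can rpc-sa list services in its own namespace?
kubectl auth can-i list services \
  --as=system:serviceaccount:node-rpc-lab:rpc-sa -n node-rpc-lab

# Can it list services cluster-wide, which the k8s NAT manager currently needs?
kubectl auth can-i list services \
  --as=system:serviceaccount:node-rpc-lab:rpc-sa --all-namespaces
```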

yanosh1982 commented 7 months ago

@garyschulte no, the rpc-service-read-role is a simple Role, not a ClusterRole as this is a security requirement in my company.

garyschulte commented 7 months ago

> @garyschulte no, the rpc-service-read-role is a simple Role, not a ClusterRole as this is a security requirement in my company.

Since it is a test rig, perhaps promote the role to a ClusterRole to test, and let us know if that solves your problem.

Since it has gone dormant, I will dust off the k8s namespace PR and see about getting it merged.

yanosh1982 commented 7 months ago

> Since it has gone dormant, I will dust off the k8s namespace PR and see about getting it merged.

@garyschulte That would be great. I'll try to promote the role and let you know.

yanosh1982 commented 7 months ago

We created a ClusterRole and ClusterRoleBinding to allow listing services, and attached the corresponding service account to the pod, but nothing changed.

2024-03-21 11:36:05.693+00:00 | main | WARN | NatService | Nat manager failed to configure itself automatically due to the following reason : org.hyperledger.besu.nat.core.exception.NatInitializationException: . NONE mode will be used as a fallback (set --Xnat-method-fallback-enabled=false to disable)
2024-03-21 11:36:05.694+00:00 | main | INFO | NetworkRunner | Starting Network.

The RBAC configuration follows:

---
apiVersion: v1
kind: ServiceAccount
metadata:
  name: rpc-cluster-sa
  namespace: node-ibsi-rpc-lab

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: rpc-service-read-cluster-role
  namespace: node-ibsi-rpc-lab
rules:
- apiGroups: [""]
  resources: ["secrets"]
  resourceNames: ["besu-rpc-private-key"]
  verbs: ["get"]
- apiGroups: [""]
  resources: ["services"]
  verbs: ["get", "list"]
---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: rpc-rb
  namespace: node-ibsi-rpc-lab
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: rpc-service-read-cluster-role
subjects:
- kind: ServiceAccount
  name: rpc-sa
  namespace: node-ibsi-rpc-lab
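(Note that the ServiceAccount defined above is named rpc-cluster-sa, while the ClusterRoleBinding's subject references rpc-sa; if the pod runs as rpc-cluster-sa, the grant never reaches it. A quick impersonation check, as a diagnostic sketch using the names from the manifests above:)

```shell
# Should print "yes" if the pod's service account can list services cluster-wide
kubectl auth can-i list services \
  --as=system:serviceaccount:node-ibsi-rpc-lab:rpc-cluster-sa --all-namespaces
```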

yanosh1982 commented 7 months ago

@garyschulte Trying to isolate the problem locally (outside the K8S cluster), I created a whole local network with a Docker Compose file, but none of the nodes are syncing. Maybe solving this will let me solve the K8S problem too. Can you try to reproduce this environment and see what is not working?

I generated the key pairs with the Besu operator subcommand. One question: what happens if the genesis file provided at key-pair generation time differs from the genesis file provided to the nodes when starting Besu?

Thank you
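(As a first diagnostic for the non-syncing network, the JSON-RPC endpoint exposed by the rpc node can report peer count and chain head. A sketch, assuming the compose stack below is up and port 8545 is mapped to localhost:)

```shell
# Peer count on the rpc node; "0x0" means no peers ever connected
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"net_peerCount","params":[],"id":1}' \
  http://localhost:8545

# Chain head; stuck at "0x0" means no blocks are being produced or imported
curl -s -X POST -H 'Content-Type: application/json' \
  --data '{"jsonrpc":"2.0","method":"eth_blockNumber","params":[],"id":1}' \
  http://localhost:8545
```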

Docker compose manifest

version: "3.3"
services:
  validator1:
    container_name: validator-1
    image: hyperledger/besu:23.4.0
    command: [
      "--genesis-file=/configs/genesis.json", 
      "--node-private-key-file=/run/secrets/validator_1_private_key", 
      "--host-whitelist=${NODES_HOST_WHITELIST}",
      "--p2p-host=172.10.1.5",
      "--data-path=/data",
      "--min-gas-price=0",
      "--bootnodes=enode://${VALIDATOR1_PUBKEY}@172.10.1.5:30303,enode://${VALIDATOR2_PUBKEY}@172.10.1.2:30303,enode://${VALIDATOR3_PUBKEY}@172.10.1.3:30303,enode://${VALIDATOR4_PUBKEY}@172.10.1.4:30303",
      "--rpc-http-enabled",
      "--rpc-http-api=ETH,IBFT"]
    volumes:
      - ./node-data/node-1:/data
      - ./genesis.json:/configs/genesis.json
    secrets:
      - validator_1_private_key
    networks:
      localbesu:
        ipv4_address: 172.10.1.5
  validator2:
    container_name: validator-2
    image: hyperledger/besu:23.4.0
    command: [
      "--genesis-file=/configs/genesis.json", 
      "--node-private-key-file=/run/secrets/validator_2_private_key", 
      "--host-whitelist=${NODES_HOST_WHITELIST}",
      "--p2p-host=172.10.1.2",
      "--data-path=/data",
      "--min-gas-price=0",
      "--bootnodes=enode://${VALIDATOR1_PUBKEY}@172.10.1.5:30303,enode://${VALIDATOR2_PUBKEY}@172.10.1.2:30303,enode://${VALIDATOR3_PUBKEY}@172.10.1.3:30303,enode://${VALIDATOR4_PUBKEY}@172.10.1.4:30303",
      "--rpc-http-enabled",
      "--rpc-http-api=ETH,IBFT"]
    volumes:
      - ./node-data/node-2:/data
      - ./genesis.json:/configs/genesis.json
    secrets:
      - validator_2_private_key
    networks:
      localbesu:
        ipv4_address: 172.10.1.2
  validator3:
    container_name: validator-3
    image: hyperledger/besu:23.4.0
    command: [
      "--genesis-file=/configs/genesis.json", 
      "--node-private-key-file=/run/secrets/validator_3_private_key", 
      "--host-whitelist=${NODES_HOST_WHITELIST}",
      "--p2p-host=172.10.1.3",
      "--data-path=/data",
      "--min-gas-price=0",
      "--bootnodes=enode://${VALIDATOR1_PUBKEY}@172.10.1.5:30303,enode://${VALIDATOR2_PUBKEY}@172.10.1.2:30303,enode://${VALIDATOR3_PUBKEY}@172.10.1.3:30303,enode://${VALIDATOR4_PUBKEY}@172.10.1.4:30303",
      "--rpc-http-enabled",
      "--rpc-http-api=ETH,IBFT"]
    volumes:
      - ./node-data/node-3:/data
      - ./genesis.json:/configs/genesis.json
    secrets:
      - validator_3_private_key
    networks:
      localbesu:
        ipv4_address: 172.10.1.3
  validator4:
    container_name: validator-4
    image: hyperledger/besu:23.4.0
    command: [
      "--genesis-file=/configs/genesis.json", 
      "--node-private-key-file=/run/secrets/validator_4_private_key", 
      "--host-whitelist=${NODES_HOST_WHITELIST}",
      "--p2p-host=172.10.1.4",
      "--data-path=/data",
      "--min-gas-price=0",
      "--bootnodes=enode://${VALIDATOR1_PUBKEY}@172.10.1.5:30303,enode://${VALIDATOR2_PUBKEY}@172.10.1.2:30303,enode://${VALIDATOR3_PUBKEY}@172.10.1.3:30303,enode://${VALIDATOR4_PUBKEY}@172.10.1.4:30303",
      "--rpc-http-enabled",
      "--rpc-http-api=ETH,IBFT"]
    volumes:
      - ./node-data/node-4:/data
      - ./genesis.json:/configs/genesis.json
    secrets:
      - validator_4_private_key
    networks:
      localbesu:
        ipv4_address: 172.10.1.4
  rpc:
    container_name: rpc-1
    image: hyperledger/besu:23.4.0
    command: [
      "--genesis-file=/configs/genesis.json", 
      "--host-whitelist=${NODES_HOST_WHITELIST}",
      "--p2p-host=172.10.1.6",
      "--data-path=/data",
      "--min-gas-price=0",
      "--bootnodes=enode://${VALIDATOR1_PUBKEY}@172.10.1.5:30303,enode://${VALIDATOR2_PUBKEY}@172.10.1.2:30303,enode://${VALIDATOR3_PUBKEY}@172.10.1.3:30303,enode://${VALIDATOR4_PUBKEY}@172.10.1.4:30303",
      "--rpc-http-enabled",
      "--rpc-http-cors-origins=${NODES_HTTP_CORS_ORIGINS}",
      "--rpc-http-api=ETH,NET,IBFT,ADMIN,WEB3,TXPOOL,TRACE,DEBUG",
      "--rpc-http-host=172.10.1.6",
      "--rpc-ws-enabled=true",
      "--rpc-ws-host=172.10.1.6"]
    volumes:
      - ./node-data/rpc-1:/data
      - ./genesis.json:/configs/genesis.json
    networks:
      localbesu:
        ipv4_address: 172.10.1.6
    ports:
      - 8545:8545
      - 8546:8546

secrets:
  validator_1_private_key:
    file: ./networkFiles/keys/0x8c657a7c7573910910c00d06c369a6aa512b2be5/key
  validator_2_private_key:
    file: ./networkFiles/keys/0x593e5c8b4363698afdd0894d76d4b127cfebeb74/key
  validator_3_private_key:
    file: ./networkFiles/keys/0x0702891d7efd102c7f657d1bd63a602933ebf8b7/key
  validator_4_private_key:
    file: ./networkFiles/keys/0xadcbdee3161061abbd3cfcc3c3664bcd7a1ceca0/key
networks:
  localbesu:
    driver: bridge
    ipam:
      config:
        - subnet: 172.10.1.0/28

genesis.json

{
  "config": {
    "chainid": 1337,
    "londonBlock": 0,
    "contractSizeLimit": 2147483647,
    "zeroBaseFee": true,
    "qbft": {
      "epochlength": 30000,
      "blockperiodseconds": 5,
      "requesttimeoutseconds": 10
    }
  },
  "nonce": "0x0",
  "timestamp" : "0x0",
  "extraData": "0xf87aa00000000000000000000000000000000000000000000000000000000000000000f8549464a702e6263b7297a96638cac6ae65e6541f4169943923390ad55e90c237593b3b0e401f3b08a0318594aefdb9a738c9f433e5b6b212a6d62f6370c2f69294c7eeb9a4e00ce683cf93039b212648e01c6c6b78c080c0",
  "gasLimit": "0x1fffffffffffff",
  "difficulty" : "0x1",
  "mixHash": "0x63746963616c2062797a616e74696e65206661756c7420746f6c6572616e6365",
  "coinbase": "0x0000000000000000000000000000000000000000"
}

.env file alongside docker-compose.yaml

VALIDATOR1_PUBKEY=<write here validator1 pub key>
VALIDATOR2_PUBKEY=<write here validator2 pub key>
VALIDATOR3_PUBKEY=<write here validator3 pub key>
VALIDATOR4_PUBKEY=<write here validator4 pub key>
NODES_HOST_WHITELIST=*
NODES_HTTP_CORS_ORIGINS=*
siladu commented 7 months ago

@yanosh1982

> One question: what happens if the genesis file provided at key pairs generation time is different from the genesis file provided to the nodes when starting besu?

QBFT requires that the genesis file's extraData contains the encoded node addresses of your initial validator set. Depending on your key gen methodology, you might be generating this extraData as part of that step. So if your extraData doesn't include the correct validator addresses (ultimately derived from the generated keys) that could explain not syncing.

The extraData in the genesis you provided is 0xf87aa00000000000000000000000000000000000000000000000000000000000000000f8549464a702e6263b7297a96638cac6ae65e6541f4169943923390ad55e90c237593b3b0e401f3b08a0318594aefdb9a738c9f433e5b6b212a6d62f6370c2f69294c7eeb9a4e00ce683cf93039b212648e01c6c6b78c080c0 which is RLP encoded and (using https://toolkit.abdk.consulting/ethereum#rlp) decodes to: ["0x0000000000000000000000000000000000000000000000000000000000000000",["0x64a702e6263b7297a96638cac6ae65e6541f4169","0x3923390ad55e90c237593b3b0e401f3b08a03185","0xaefdb9a738c9f433e5b6b212a6d62f6370c2f692","0xc7eeb9a4e00ce683cf93039b212648e01c6c6b78"],[],"0x",[]]

Does this list match your actually deployed validator addresses? ["0x64a702e6263b7297a96638cac6ae65e6541f4169","0x3923390ad55e90c237593b3b0e401f3b08a03185","0xaefdb9a738c9f433e5b6b212a6d62f6370c2f692","0xc7eeb9a4e00ce683cf93039b212648e01c6c6b78"]
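(The decode above can be reproduced offline with a minimal pure-Python RLP decoder; this is an illustrative sketch, not the client's implementation, and the field names follow the QBFT extraData layout of vanity/validators/votes/round/seals:)

```python
# Minimal RLP decoder, enough to unpack a QBFT extraData blob.
def rlp_decode(data: bytes):
    item, rest = _decode(data)
    assert not rest, "trailing bytes after RLP item"
    return item

def _decode(b: bytes):
    p = b[0]
    if p < 0x80:                      # single-byte literal
        return b[:1], b[1:]
    if p < 0xb8:                      # short string (0-55 bytes)
        n = p - 0x80
        return b[1:1 + n], b[1 + n:]
    if p < 0xc0:                      # long string
        ll = p - 0xb7
        n = int.from_bytes(b[1:1 + ll], "big")
        return b[1 + ll:1 + ll + n], b[1 + ll + n:]
    if p < 0xf8:                      # short list (payload 0-55 bytes)
        n = p - 0xc0
        return _decode_list(b[1:1 + n]), b[1 + n:]
    ll = p - 0xf7                     # long list
    n = int.from_bytes(b[1:1 + ll], "big")
    return _decode_list(b[1 + ll:1 + ll + n]), b[1 + ll + n:]

def _decode_list(payload: bytes):
    items = []
    while payload:
        item, payload = _decode(payload)
        items.append(item)
    return items

# extraData from the genesis.json posted above
extra_data = bytes.fromhex(
    "f87aa00000000000000000000000000000000000000000000000000000000000000000"
    "f8549464a702e6263b7297a96638cac6ae65e6541f4169"
    "943923390ad55e90c237593b3b0e401f3b08a03185"
    "94aefdb9a738c9f433e5b6b212a6d62f6370c2f692"
    "94c7eeb9a4e00ce683cf93039b212648e01c6c6b78c080c0"
)
vanity, validators, votes, round_, seals = rlp_decode(extra_data)
addresses = ["0x" + v.hex() for v in validators]
print(addresses)
```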

Btw, your RPC config contains IBFT instead of QBFT, e.g. "--rpc-http-api=ETH,NET,IBFT,ADMIN,WEB3,TXPOOL,TRACE,DEBUG", so you won't have access to the QBFT-specific RPCs.

Finally, from the k8s side, are you familiar with this tutorial? https://besu.hyperledger.org/private-networks/tutorials/kubernetes

yanosh1982 commented 7 months ago

> @yanosh1982
>
> > One question: what happens if the genesis file provided at key pairs generation time is different from the genesis file provided to the nodes when starting besu?
>
> QBFT requires that the genesis file's extraData contains the encoded node addresses of your initial validator set. Depending on your key gen methodology, you might be generating this extraData as part of that step. So if your extraData doesn't include the correct validator addresses (ultimately derived from the generated keys) that could explain not syncing.
>
> The extraData in the genesis you provided is 0xf87aa00000000000000000000000000000000000000000000000000000000000000000f8549464a702e6263b7297a96638cac6ae65e6541f4169943923390ad55e90c237593b3b0e401f3b08a0318594aefdb9a738c9f433e5b6b212a6d62f6370c2f69294c7eeb9a4e00ce683cf93039b212648e01c6c6b78c080c0 which is RLP encoded and (using https://toolkit.abdk.consulting/ethereum#rlp) decodes to: ["0x0000000000000000000000000000000000000000000000000000000000000000",["0x64a702e6263b7297a96638cac6ae65e6541f4169","0x3923390ad55e90c237593b3b0e401f3b08a03185","0xaefdb9a738c9f433e5b6b212a6d62f6370c2f692","0xc7eeb9a4e00ce683cf93039b212648e01c6c6b78"],[],"0x",[]]
>
> Does this list match your actually deployed validator addresses? ["0x64a702e6263b7297a96638cac6ae65e6541f4169","0x3923390ad55e90c237593b3b0e401f3b08a03185","0xaefdb9a738c9f433e5b6b212a6d62f6370c2f692","0xc7eeb9a4e00ce683cf93039b212648e01c6c6b78"]
>
> Btw, your rpc config contains IBFT instead of QBFT, e.g. "--rpc-http-api=ETH,NET,IBFT,ADMIN,WEB3,TXPOOL,TRACE,DEBUG", so you won't have access to the QBFT specific rpcs
>
> Finally, from the k8s side, are you familiar with this tutorial? https://besu.hyperledger.org/private-networks/tutorials/kubernetes

@siladu Thank you for your suggestions. The list of validator addresses doesn't match my actual validator pool addresses, so I'll fix it and try again.

Regarding the K8S tutorial, I've started familiarizing myself with it by deploying a single node with nat-method NONE on a Kubernetes cluster in a test environment. I will read the article you posted carefully to find the best way to deploy the nodes.