apache / apisix-helm-chart

Apache APISIX Helm Chart
https://apisix.apache.org/

request help: bad member ID arg (strconv.ParseUint: parsing "": invalid syntax), expecting ID in Hex #290

Open zhangyihong opened 2 years ago

zhangyihong commented 2 years ago

Kubernetes version: AWS EKS in the Ningxia (cn-northwest-1) region

$ kubectl version --short
Client Version: v1.19.6-eks-49a6c0
Server Version: v1.20.15-eks-0d102a7

APISIX version:

$ helm list -n ingress-apisix
NAME    NAMESPACE       REVISION    UPDATED                                 STATUS      CHART           APP VERSION
apisix  ingress-apisix  1           2022-05-18 16:17:51.725532 +0800 CST    deployed    apisix-0.9.3    2.13.1 

Images: apache/apisix:2.13.1-alpine, apache/apisix-ingress-controller:1.4.1, docker.io/bitnami/etcd:3.4.18-debian-10-r14

pod:

$ kubectl get po -o wide -n ingress-apisix
NAME                                        READY   STATUS    RESTARTS   AGE   IP               NODE                                                NOMINATED NODE   READINESS GATES
apisix-78f5cb9b84-82czl                     1/1     Running   0          16h   172.19.115.235   ip-172-19-113-187.cn-northwest-1.compute.internal   <none>           <none>
apisix-78f5cb9b84-qczbv                     1/1     Running   0          16h   172.19.117.215   ip-172-19-112-190.cn-northwest-1.compute.internal   <none>           <none>
apisix-dashboard-5465fb4df7-9z8xg           1/1     Running   4          16h   172.19.116.190   ip-172-19-124-247.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-0                               1/1     Running   0          16h   172.19.125.197   ip-172-19-119-174.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-1                               1/1     Running   0          16h   172.19.98.160    ip-172-19-124-249.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-2                               1/1     Running   0          16h   172.19.119.222   ip-172-19-117-72.cn-northwest-1.compute.internal    <none>           <none>
apisix-ingress-controller-87d78d98d-dftls   1/1     Running   0          16h   172.19.97.227    ip-172-19-124-247.cn-northwest-1.compute.internal   <none>           <none>

We put the Kubernetes node hosting apisix-etcd-2 (ip-172-19-117-72.cn-northwest-1.compute.internal) into maintenance mode and drained it, evicting apisix-etcd-2 onto another node:

$ kubectl drain ip-172-19-117-72.cn-northwest-1.compute.internal --ignore-daemonsets --delete-local-data
node/ip-172-19-117-72.cn-northwest-1.compute.internal cordoned
WARNING: ignoring DaemonSet-managed Pods: default/prometheus-node-exporter-55rbb, kube-system/aws-node-m5xjb, kube-system/efs-csi-node-dlln4, kube-system/kube-proxy-xhjz7
evicting pod ingress-apisix/apisix-etcd-2
pod/apisix-etcd-2 evicted
node/ip-172-19-117-72.cn-northwest-1.compute.internal evicted

After the eviction, apisix-etcd-2 was rescheduled onto a new Kubernetes node (ip-172-19-97-147.cn-northwest-1.compute.internal), where it stays in CrashLoopBackOff:

$ kubectl get po -o wide -n ingress-apisix
NAME                                        READY   STATUS             RESTARTS   AGE   IP               NODE                                                NOMINATED NODE   READINESS GATES
apisix-78f5cb9b84-82czl                     1/1     Running            0          17h   172.19.115.235   ip-172-19-113-187.cn-northwest-1.compute.internal   <none>           <none>
apisix-78f5cb9b84-qczbv                     1/1     Running            0          17h   172.19.117.215   ip-172-19-112-190.cn-northwest-1.compute.internal   <none>           <none>
apisix-dashboard-5465fb4df7-9z8xg           1/1     Running            4          17h   172.19.116.190   ip-172-19-124-247.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-0                               1/1     Running            0          17h   172.19.125.197   ip-172-19-119-174.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-1                               1/1     Running            0          17h   172.19.98.160    ip-172-19-124-249.cn-northwest-1.compute.internal   <none>           <none>
apisix-etcd-2                               0/1     CrashLoopBackOff   8          16m   172.19.100.35    ip-172-19-97-147.cn-northwest-1.compute.internal    <none>           <none>
apisix-ingress-controller-87d78d98d-dftls   1/1     Running            0          17h   172.19.97.227    ip-172-19-124-247.cn-northwest-1.compute.internal   <none>           <none>

Logs of apisix-etcd-2:

$ kubectl logs apisix-etcd-2 -n ingress-apisix -f
etcd 01:28:08.43 
etcd 01:28:08.43 Welcome to the Bitnami etcd container
etcd 01:28:08.43 Subscribe to project updates by watching https://github.com/bitnami/bitnami-docker-etcd
etcd 01:28:08.43 Submit issues and feature requests at https://github.com/bitnami/bitnami-docker-etcd/issues
etcd 01:28:08.44 
etcd 01:28:08.44 INFO  ==> ** Starting etcd setup **
etcd 01:28:08.46 INFO  ==> Validating settings in ETCD_* env vars..
etcd 01:28:08.46 WARN  ==> You set the environment variable ALLOW_NONE_AUTHENTICATION=yes. For safety reasons, do not use this flag in a production environment.
etcd 01:28:08.47 INFO  ==> Initializing etcd
etcd 01:28:08.47 INFO  ==> Generating etcd config file using env variables
etcd 01:28:08.48 INFO  ==> Detected data from previous deployments
etcd 01:28:08.61 INFO  ==> Updating member in existing cluster
Error: bad member ID arg (strconv.ParseUint: parsing "": invalid syntax), expecting ID in Hex
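This error appears to come from the Bitnami entrypoint: on restart it detects existing data and tries to run an etcdctl member update, but the cached member ID resolves to an empty string, so the hex parse fails. One way to inspect the cluster state, assuming the pod and namespace names from this thread (the member_id path is an assumption and may differ between image versions):

# List the registered members (IDs are printed in hex) from a healthy replica:
$ kubectl -n ingress-apisix exec apisix-etcd-0 -- etcdctl member list

# Inspect the cached member ID that the entrypoint feeds to `etcdctl member update`
# (assumed path; it may differ between Bitnami image versions):
$ kubectl -n ingress-apisix exec apisix-etcd-0 -- cat /bitnami/etcd/data/member_id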
gxthrj commented 2 years ago

cc @tao12345666333 PTAL, if you have time.

tao12345666333 commented 2 years ago

It may be related to the following:

tokers commented 2 years ago

How did you deploy the etcd cluster? And what persistence method are you using?

NMichas commented 2 years ago

Just to add, I'm also facing this issue. The problem seems to be with the Bitnami chart. I went through the relevant discussions/issues, and although some PRs that seem to help have been merged, the problem is not entirely solved yet.

I also tried the latest Bitnami etcd chart (8.2.2) and still observe the same error (easily reproduced when I run helm upgrade --install for my chart, which includes the Bitnami chart, or for APISIX with etcd.enabled=true).

For now in dev, I'm using etcd.replicaCount=1 in APISIX.
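For reference, a minimal sketch of that single-replica workaround, assuming the release name and namespace used earlier in this thread (a single etcd replica has no quorum to lose, which sidesteps the member-update path, but it is only suitable for development):

$ helm upgrade --install apisix apisix/apisix \
    --namespace ingress-apisix \
    --set etcd.enabled=true \
    --set etcd.replicaCount=1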

NivHamisha commented 2 years ago

Is there any new information regarding this issue? Unfortunately, I am still facing it :(

tokers commented 2 years ago

> Is there any new information regarding this issue? Unfortunately, I am still facing it :(

No, we have no clue, maybe you can also submit an issue to bitnami/charts. It looks like a bug there.

juzhiyuan commented 2 years ago

Use the APISIX Helm chart with a standalone etcd deployment

Thanks to @tao12345666333!

1. Create a namespace apisix
$ kubectl create ns apisix

namespace/apisix created
2. Deploy the etcd instance
# etcd-headless.yaml
apiVersion: v1
kind: Service
metadata:
  name: etcd-headless
  namespace: apisix
  labels:
    app.kubernetes.io/name: etcd
  annotations:
    service.alpha.kubernetes.io/tolerate-unready-endpoints: "true"
spec:
  type: ClusterIP
  clusterIP: None
  ports:
    - name: "client"
      port: 2379
      targetPort: client
    - name: "peer"
      port: 2380
      targetPort: peer
  selector:
    app.kubernetes.io/name: etcd
---
# etcd.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: etcd
  namespace: apisix
  labels:
    app.kubernetes.io/name: etcd
spec:
  selector:
    matchLabels:
      app.kubernetes.io/name: etcd
  serviceName: etcd-headless
  podManagementPolicy: Parallel
  replicas: 1
  updateStrategy:
    type: RollingUpdate
  template:
    metadata:
      labels:
        app.kubernetes.io/name: etcd
    spec:
      securityContext:
        fsGroup: 1001
        runAsUser: 1001
      containers:
        - name: etcd
          image: docker.io/bitnami/etcd:3.4.20-debian-11-r11
          imagePullPolicy: "IfNotPresent"
          # command:
            # - /scripts/setup.sh
          env:
            - name: BITNAMI_DEBUG
              value: "false"
            - name: MY_POD_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.podIP
            - name: MY_POD_NAME
              valueFrom:
                fieldRef:
                  fieldPath: metadata.name
            - name: ETCDCTL_API
              value: "3"
            - name: ETCD_NAME
              value: "$(MY_POD_NAME)"
            - name: ETCD_DATA_DIR
              value: /etcd/data
            - name: ETCD_ADVERTISE_CLIENT_URLS
              value: "http://$(MY_POD_NAME).etcd-headless.apisix.svc.cluster.local:2379"
            - name: ETCD_LISTEN_CLIENT_URLS
              value: "http://0.0.0.0:2379"
            - name: ETCD_INITIAL_ADVERTISE_PEER_URLS
              value: "http://$(MY_POD_NAME).etcd-headless.apisix.svc.cluster.local:2380"
            - name: ETCD_LISTEN_PEER_URLS
              value: "http://0.0.0.0:2380"
            - name: ALLOW_NONE_AUTHENTICATION
              value: "yes"
          ports:
            - name: client
              containerPort: 2379
            - name: peer
              containerPort: 2380
          volumeMounts:
            - name: data
              mountPath: /etcd
      # If you don't have a storage provisioner or don't want to use a persistent volume, you can use an `emptyDir` volume as follows:
      # volumes:
      #   - name: data
      #     emptyDir: {}
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes:
          - "ReadWriteOnce"
        resources:
          requests:
            storage: "8Gi"
3. Apply it to Kubernetes
$ kubectl apply -f etcd.yaml                                                                                                                                                                 

service/etcd-headless created                                                                                                                                                                             
statefulset.apps/etcd created
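
As an optional sanity check before installing APISIX, you can confirm the standalone etcd answers on its client port (this assumes the etcdctl binary bundled in the Bitnami image):

$ kubectl -n apisix exec etcd-0 -- etcdctl endpoint health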
4. Deploy the APISIX Ingress

If you don't want to deploy the APISIX Ingress controller, set --set ingress-controller.enabled=false instead.

$ helm install apisix apisix/apisix \
    --namespace apisix \
    --create-namespace \
    --set gateway.type=NodePort \
    --set ingress-controller.enabled=true \
    --set ingress-controller.config.apisix.serviceNamespace=apisix \
    --set ingress-controller.config.apisix.serviceName=apisix-admin \
    --set ingress-controller.config.ingressPublishService="apisix/apisix-gateway" \
    --set etcd.enabled=false \
    --set etcd.host={"http://etcd-headless.apisix.svc.cluster.local:2379"}

NAME: apisix
LAST DEPLOYED: Fri Sep  9 08:54:57 2022
NAMESPACE: apisix
STATUS: deployed
REVISION: 1
TEST SUITE: None
NOTES:
1. Get the application URL by running these commands:
  export NODE_PORT=$(kubectl get --namespace apisix -o jsonpath="{.spec.ports[0].nodePort}" services apisix-gateway)
  export NODE_IP=$(kubectl get nodes --namespace apisix -o jsonpath="{.items[0].status.addresses[0].address}")
  echo http://$NODE_IP:$NODE_PORT

Check the pod status:

$ kubectl -n apisix get pods 

NAME                                         READY   STATUS    RESTARTS   AGE
apisix-579b99b87d-fhbhh                      1/1     Running   0          86s
apisix-ingress-controller-68d44b5d49-b427h   1/1     Running   0          86s
etcd-0                                       1/1     Running   0          20m
5. Verification
tao@moelove:~$ kubectl create ns apisix-demo
namespace/apisix-demo created

tao@moelove:~$ kubectl -n apisix-demo run httpbin --image kennethreitz/httpbin --port 80
pod/httpbin created

tao@moelove:~$ kubectl -n apisix-demo expose pod httpbin --port 80
service/httpbin exposed

Create route YAML (ar.yaml):

apiVersion: apisix.apache.org/v2beta3
kind: ApisixRoute
metadata:
  name: httpbin-route
  namespace: apisix-demo
spec:
  http:
  - name: httpbin
    match:
      hosts:
      - local.httpbin.org
      paths:
      - /*
    backends:
      - serviceName: httpbin
        servicePort: 80

tao@moelove:~$ vim ar.yaml 

tao@moelove:~$ kubectl -n apisix-demo apply -f ar.yaml 

apisixroute.apisix.apache.org/httpbin-route created

tao@moelove:~$ kubectl get pod,svc,ar
NAME                 TYPE        CLUSTER-IP   EXTERNAL-IP   PORT(S)   AGE
service/kubernetes   ClusterIP   10.96.0.1    <none>        443/TCP   38m
tao@moelove:~$ kubectl get pod,svc,ar -n apisix-demo
NAME          READY   STATUS    RESTARTS   AGE
pod/httpbin   1/1     Running   0          105s

NAME              TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)   AGE
service/httpbin   ClusterIP   10.96.229.144   <none>        80/TCP    94s

NAME                                          HOSTS                   URIS     AGE
apisixroute.apisix.apache.org/httpbin-route   ["local.httpbin.org"]   ["/*"]   20s

Try to access the httpbin service API:

tao@moelove:~$ export NODE_PORT=$(kubectl get --namespace apisix -o jsonpath="{.spec.ports[0].nodePort}" services apisix-gateway)

tao@moelove:~$ export NODE_IP=$(kubectl get nodes --namespace apisix -o jsonpath="{.items[0].status.addresses[0].address}")

tao@moelove:~$ echo http://$NODE_IP:$NODE_PORT
http://172.18.0.5:32409

tao@moelove:~$ curl http://$NODE_IP:$NODE_PORT/anything -H "HOST: local.httpbin.org"
{
  "args": {}, 
  "data": "", 
  "files": {}, 
  "form": {}, 
  "headers": {
    "Accept": "*/*", 
    "Host": "local.httpbin.org", 
    "User-Agent": "curl/7.58.0", 
    "X-Forwarded-Host": "local.httpbin.org"
  }, 
  "json": null, 
  "method": "GET", 
  "origin": "172.18.0.5", 
  "url": "http://local.httpbin.org/anything"
}
ahululu commented 1 year ago

> Is there any new information regarding this issue? Unfortunately, I am still facing it :(

I also faced this problem. I first changed the StatefulSet's replicas to 2, then deleted the stale PVC, and finally changed the replicas back to 3, which solved the problem.
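
Sketched as commands, assuming the default resource names from this thread (verify the actual PVC name with kubectl get pvc -n ingress-apisix first):

# Scale down so the crashing replica's pod goes away:
$ kubectl -n ingress-apisix scale statefulset apisix-etcd --replicas=2

# Delete the stale volume claim (name assumed from the chart's `data` template):
$ kubectl -n ingress-apisix delete pvc data-apisix-etcd-2

# Scale back up; the recreated replica starts with a fresh data directory:
$ kubectl -n ingress-apisix scale statefulset apisix-etcd --replicas=3

Deleting the PVC discards the stale data directory, including the cached member ID, so the recreated apisix-etcd-2 joins the cluster as a fresh member.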