m3db / m3db-operator

Kubernetes operator for M3DB
https://m3db.io/docs/operator/
Apache License 2.0

If a ConfigMap is specified for m3db-cluster, bootstrap does not work, shards cannot be initialized, and HTTPS etcd is not supported #246

Open liuyongqing1 opened 3 years ago

liuyongqing1 commented 3 years ago

Thanks for opening an issue for the M3DB Operator! We'd love to help you, but we need the following information included with any issue:

My ConfigMap:

apiVersion: v1
kind: ConfigMap
metadata:
  name: m3db-config-map-m3db-cluster
  namespace: monitoring
data:
  m3.yml: |
    coordinator:
      listenAddress:
        type: "config"
        value: "0.0.0.0:7201"
      local:
        namespaces:
        - namespace: default
          type: unaggregated
          retention: 48h
      metrics:
        scope:
          prefix: "coordinator"
        prometheus:
          handlerPath: /metrics
          listenAddress: 0.0.0.0:7203
        sanitization: prometheus
        samplingRate: 1.0
        extended: none
      tagOptions:
        idScheme: quoted

    db:
      logging:
        level: debug

      metrics:
        prometheus:
          handlerPath: /metrics
        sanitization: prometheus
        samplingRate: 1.0
        extended: detailed

      listenAddress: 0.0.0.0:9000
      clusterListenAddress: 0.0.0.0:9001
      httpNodeListenAddress: 0.0.0.0:9002
      httpClusterListenAddress: 0.0.0.0:9003
      debugListenAddress: 0.0.0.0:9004

      hostID:
        resolver: file
        file:
          path: /etc/m3db/pod-identity/identity
          timeout: 5m

      client:
        writeConsistencyLevel: majority
        readConsistencyLevel: unstrict_majority

      gcPercentage: 100

      writeNewSeriesAsync: true
      writeNewSeriesLimitPerSecond: 1048576
      writeNewSeriesBackoffDuration: 2ms

      bootstrap:
        bootstrappers:
            - filesystem
            - commitlog
            - peers
            - uninitialized_topology
        fs:
            numProcessorsPerCPU: 0.125
        commitlog:
            # https://docs.m3db.io/operational_guide/availability_consistency_durability/
            returnUnfulfilledForCorruptCommitLogFiles: false

      commitlog:
        flushMaxBytes: 524288
        flushEvery: 1s
        queue:
            calculationType: fixed
            size: 2097152

      fs:
        filePathPrefix: /var/lib/m3db

      config:
        service:
            env: default_env
            zone: embedded
            service: m3db
            cacheDir: /var/lib/m3kv
            etcdClusters:
            - zone: embedded
              endpoints:
              - https://172.20.51.11:2379
              - https://172.20.51.12:2379
              - https://172.20.51.13:2379
              tls:
                crtPath: /etc/m3db/etcd.pem
                caCrtPath: /etc/m3db/ca.pem
                keyPath: /etc/m3db/etcd-key.pem
  ca.pem: |
    -----BEGIN CERTIFICATE-----
    MIIDtjCCAp6gAwIBAgIUX0WIXFYQ7hdjavd1yfQKndMsyUAwDQYJKoZIhvcNAQEL
    BQAwYTELMAkGA1UEBhMCQ04xETAPBgNVBAgTCEhhbmdaaG91MQswCQYDVQQHEwJY
    UzEMMAoGA1UEChMDazhzMQ8wDQYDVQQLEwZTeXN0ZW0xEzARBgNVBAMTCmt1YmVy
    bmV0ZXMwHhcNMTkwMjI0MDMzNjAwWhcNMzQwMjIwMDMzNjAwWjBhMQswCQYDVQQG
    EwJDTjERMA8GA1UECBMISGFuZ1pob3UxCzAJBgNVBAcTAlhTMQwwCgYDVQQKEwNr
    OHMxDzANBgNVBAsTBlN5c3RlbTETMBEGA1UEAxMKa3ViZXJuZXRlczCCASIwDQYJ
    KoZIhvcNAQEBBQADggEPADCCAQoCggEBAMk8b2jZwlg0GXLFFXwO6hFsv4loJgkR
    pNrdECgH7R0wZUfdprjBVY9r1lMMIjJiDQ4oWzkVBlm1vYDOpEfi13+HZJz1eDHp
    xDCe3yGXmIFFR0kBpFu3ytH7W+uaRFRmn+C0LL7bwFpgm/QMn0ZDdgjY1Dk3+D0Z
    68L/LpLd8y0kDkbc1WI9+F6TZEwdqpo2+DxKYA7Ds0y2bDYHc+GgECSZttMTg+XQ
    VCoG6yTatrSjJJO/vfQyTUPZDakt/K43Vy2XYnEG1cPmIWDy5DWeNANSEbYUEHhq
    uE3DCKTCIcVtltwy4bNUVC9au9M3pDam/xKZkX/8/sS2fZ4D01qg2GUCAwEAAaNm
    MGQwDgYDVR0PAQH/BAQDAgEGMBIGA1UdEwEB/wQIMAYBAf8CAQIwHQYDVR0OBBYE
    FK1z7VCQarn6QDTCySW7q4eWqqTIMB8GA1UdIwQYMBaAFK1z7VCQarn6QDTCySW7
    q4eWqqTIMA0GCSqGSIb3DQEBCwUAA4IBAQBMnlvlS/agLCApe0U726xI7Q7uEUv+
    rtCkpJFvzdnjjXKBIAEOlpTvZVdBbaod2CEodhEHfRktKSgRPFkN3EigRjtTtJI1
    wiEGLQA1J9sVdUk9GmagCCcb4lCSzaJG6Ba2SiQT+/h3LM3DZxBuKJZlvrksJmPn
    KoPJNsu7JMlgXLf9H4dXZ7uHO4+g2mgOcXNdpXFJ0HQwSH6bMfLnFrQTJ6ajbOiZ
    gyW/SHaGy7SrC0mJqodhOf4LxF4gH4XgSh2GyZaBMyLxuDhio+5V7YSzOSEl3lil
    QXXTu8uh4e/BRuD8Jmedusvp1fquYbEhzHCBdSS38xUJumrETkzccm6a
    -----END CERTIFICATE-----
  etcd.pem: |
    -----BEGIN CERTIFICATE-----
    MIID4jCCAsqgAwIBAgIUa/rFQ9xUGg3YZ1I0WK8LdEL+1PMwDQYJKoZIhvcNAQEL
    BQAwYTELMAkGA1UEBhMCQ04xETAPBgNVBAgTCEhhbmdaaG91MQswCQYDVQQHEwJY
    UzEMMAoGA1UEChMDazhzMQ8wDQYDVQQLEwZTeXN0ZW0xEzARBgNVBAMTCmt1YmVy
    bmV0ZXMwHhcNMTkwNzE3MTE0NTAwWhcNMjkwNzE0MTE0NTAwWjBbMQswCQYDVQQG
    EwJDTjERMA8GA1UECBMISGFuZ1pob3UxCzAJBgNVBAcTAlhTMQwwCgYDVQQKEwNr
    OHMxDzANBgNVBAsTBlN5c3RlbTENMAsGA1UEAxMEZXRjZDCCASIwDQYJKoZIhvcN
    AQEBBQADggEPADCCAQoCggEDANj+orhgn8KbbMU5v/fET2cTORdDSgRbmFyfLwS+
    o7l8TyjMK+FgKby8rFpr4WOMEvlyunESKg6Ky97l21HQsTBmbNn+oii6vFAoVE3k
    IN0JQxUGq+lhO4EioviRSeO2Gmqzkm3Gg1v0sLpvOwUTdizMI+e7Wy5lYbZGlCYV
    yeCvs70NXEsrLtHrBgO/vWhO99pkufBX1Rw9qgi75l8kRXUtSrl4KXss8bJPUczO
    MM1nH1RQ4Blc6JUwG4ZqdE2lkt7JXJOcWJH5z3Z+CAI89rvkNYZ8LbHD3Kt5Tixu
    Gt0VMWi8TbV8mPqKO6suVgABf+iYgcifRj6MSumZdN/FUusCAwEAAaOBlzCBlDAO
    BgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsGAQUFBwMCMAwG
    A1UdEwEB/wQCMAAwHQYDVR0OBBYEFG9RysZ9UxLn0F0nES0JvSpmdoAkMB8GA1Ud
    IwQYMBaAFK1z7VCQarn6QDTCySW7q4eWqqTIMBUGA1UdEQQOMAyHBH8AAAGHBKwU
    MwswDQYJKoZIhvcNAQELBQADggEBALZedL7gzhrBA0u9adVKIJu7e4nX/2cEKV0S
    qOZmLOb7uHgxYLwuzgmEwDckScUPLc9BhGcPURgFZtJgYC/bkdmuJPhRmId7GeX3
    rjuNxXfslPVRhWTA55N43zQmUDyRf72M0UF6I0PZ/53RKB/RVUiEBGWa5vP2dQwN
    YlDpKSIMxxHuupgPBPoq+y2HMNWmi6+/RydyJM3w7UqheyvrnSZGYfTce0H3ihMY
    Gs5WM0ai5hKF+GI8T4u1DTDJqrMjiGnAMoWzMX+7KYak2LWVytttwLYpe8D/Wm0o
    Rlf0fhJmHfXQzbMgHx+px710SS7/Yl1bRgeVDBKqnk2b2NC0MYo=
    -----END CERTIFICATE-----
  etcd-key.pem: |
    -----BEGIN RSA PRIVATE KEY-----
    MIIEpAIBAAKCAQEA2P6iuGCfwptsxTm/98RPZxM5F0NKBFuYXJ8vBL6juXxPKMwr
    4WApvLysWmvhY4wS+XK6cRIqDdrL3uXbUdCxMGZs2f6iKLq8UChUTeQg3QlDFQar
    6WE7gSKi+JFJ47YaarOSaDaDW/Swum87BRN2LMwj57tbLmVhtkaUJhXJ4K+zvQ1c
    Sysu0esGA7+9aE732mS58FfVHD2qCLvmXyRFdS1KuXgpeyzxsk9RzM4wzWcfVFDg
    GVzolTAbhmp0TaWS3slck5xYkfnPdn4IAjz2u+Q1hnwtscPcq3lOLG4a3RUxaLxN
    tXyY+oo7qy5WAAGT6JiByJ9GPoxK6Zl038VS6wIDAQABAoIBAHKMWRHD0BJHQfAL
    QE9nDhN3jle9acFLKO8cCRIUIRG1kYQT48YhoWbEoqdI8749H3cXHVy7HgB3PI/5
    /wD9jcvjBes+BBREH0yhPX+wwbhtP0BGOVIFxgexZR6ac8sFQoS5Lr9MX+OXFAQW
    260eTO/xA7M8sDGZyy8Rqvs/3UYCAZW0onocc0/DptKDAjrKs5Ky8+P5N98PDYPv
    fOo377nqtcp9vgsNOrbWMBu4D2lgqLxwoFSgDFoS9KXHZFcw/VA4mps8P+aJBlDM
    8V5KfufZimuOt/LrfHY8MO+WHx2qh1vVzDzWwlCR0pmxMzKxAevIO4HR1Q03gL1K
    v3Tbn6ECgYEA5po/erVFqHaiBwZRI1KQ9cw6te+57+Xh4lTuoq9fficSIewdwphs
    8bF77kN28B2Mw7J0KqEf8NfCkC6jMgR1y2JdIMrivyN7KyaSK3dJEwMccIDuozX5
    keef9Dbb1vj4YlqiDtmd4OZzVALCaeT1vqbku5vqdLSsOVHF1k2HYucCgYEA8OS4
    GFpDUBmzrjp7Fkw6CgHNgA8VF0MI5jJUSa659ejUoKscW95+P33+DGnB9njbnXOt
    jWI2vqhjZsDakj4i7fwvxZjm3L+PF6y/p4l5SE+Npn9IyV5usDeLv96cEm7cCf+X
    Lnih9YA+EfzeUKdQOzf6V8kYsXXkWJg+rDge010CgYAOXBKR4JHa4LBMQa9xxKV0
    OOh7BdeNQcJkJqfJh6QppeMyK5La2EUIc+Xku1y/rQdj9EvZj7j+dWEPO2g8KBzx
    sklcTmX6QwpbcIZvoHjzbyEpPE4f6a+Fz2edfIEKDOziqwQmapSzOYZ698UFdRV8
    bsYVjKr343xKAXaRVriUhwKBgQC2XZB47yxyYWLDjYZNVRvDI6Y9Qi3HVHpSOtvQ
    hDRH1CHUGHX5nrfYxHslTpMGUmyAAGjs1eN35uaJjYpqmBu9auOHhb+QcnyTgbX9
    0Xc9pOwplca2m4TUZtinQpGI6uAtuY7sIWsK/jD/UR3ElUWJ71DYUGcfQY7C+07G
    9h1wCQKBgQDHJdlF6mG0OQBA9pWGO7XJ7/kW3Vg5ddsgG6uSujXSAavYMuchxoSP
    l3DjPLzYqVF8t9eE8cF0kbUsShnzk+9eHuwl/9vhXcDkzPXdOkS/1OBMJX1zlEzT
    sGME9i36TzCWnyiX72w3Js/TXrQDZlwvRl/tgJLSgJGjt+ibO5CIFA==
    -----END RSA PRIVATE KEY-----

My M3DBCluster:

apiVersion: operator.m3db.io/v1alpha1
kind: M3DBCluster
metadata:
  name: m3db-cluster
  namespace: monitoring
spec:
  image: m3dbnode:latest
  replicationFactor: 3
  numberOfShards: 256
  isolationGroups:
  - name: group1
    numInstances: 1
    nodeAffinityTerms:
    - key: kubernetes.io/hostname
      values:
      - 172.20.51.11
  - name: group2
    numInstances: 1
    nodeAffinityTerms:
    - key: kubernetes.io/hostname
      values:
      - 172.20.51.12
  - name: group3
    numInstances: 1
    nodeAffinityTerms:
    - key: kubernetes.io/hostname
      values:
      - 172.20.51.13
  podIdentityConfig:
    sources:
    - NodeName
  namespaces:
    - name: default
#      preset: 1m:40d
      options:
        bootstrapEnabled: true
        flushEnabled: true
        writesToCommitLog: true
        cleanupEnabled: true
        repairEnabled: false
        snapshotEnabled: true
        retentionOptions:
          retentionPeriod: 4320h
          blockSize: 24h
          bufferFuture: 20m
          bufferPast: 20m
          blockDataExpiry: true
          blockDataExpiryAfterNotAccessPeriod: 5m
        indexOptions:
          enabled: true
          blockSize: 24h
  configMapName: m3db-config-map-m3db-cluster
  containerResources:
    requests:
      memory: 4Gi
      cpu: '1'
    limits:
      memory: 8Gi
      cpu: '2'
  dataDirVolumeClaimTemplate:
    metadata:
      name: m3db-data
    spec:
      accessModes:
      - ReadWriteOnce
      storageClassName: rook-ceph-block
      resources:
        requests:
          storage: 200Gi

After the pods and the ConfigMap are created, bootstrap does not complete and the shards are never initialized. m3db-operator reports "current unavailable instances: 3", and m3db shows no bootstrap progress and no shards.

m3db log

{"level":"info","ts":1603975525.0615256,"msg":"no m3msg server configured"}
{"level":"info","ts":1603975525.061538,"msg":"using registered interrupt handler"}
{"level":"info","ts":1603975525.0615675,"msg":"starting API server","address":"[::]:7201"}
{"level":"info","ts":1603975526.6106744,"msg":"tracing disabled; set `tracing.backend` to enable"}
{"level":"info","ts":1603975526.6107037,"msg":"no seed nodes set, using dedicated etcd cluster"}
{"level":"warn","ts":1603975529.2892141,"msg":"max index query IDs concurrency was not set, falling back to default value"}
{"level":"warn","ts":1603975529.289353,"msg":"host doesn't support HugeTLB, proceeding without it"}
{"level":"info","ts":1603975529.3279648,"msg":"set thrift bytes pool alloc size","size":2048}
{"level":"info","ts":1603975529.328014,"msg":"bytes pool configured","capacity":16,"size":524288,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.328026,"msg":"bytes pool configured","capacity":32,"size":262144,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.3280313,"msg":"bytes pool configured","capacity":64,"size":131072,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.328036,"msg":"bytes pool configured","capacity":128,"size":65536,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.3280413,"msg":"bytes pool configured","capacity":256,"size":65536,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.3280458,"msg":"bytes pool configured","capacity":1440,"size":16384,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.3280504,"msg":"bytes pool configured","capacity":4096,"size":8192,"refillLowWaterMark":0.3,"refillHighWaterMark":0.6}
{"level":"info","ts":1603975529.328117,"msg":"bytes pool init start"}
{"level":"info","ts":1603975529.951818,"msg":"bytes pool init end"}
{"level":"info","ts":1603975530.1808083,"msg":"creating dynamic config service client with m3cluster"}
{"level":"info","ts":1603975530.1822543,"msg":"successfully created new cache dir","path":"/var/lib/m3kv","mode":493}
{"level":"warn","ts":1603975530.1822956,"msg":"could not load cache from file","file":"/var/lib/m3kv/_kv_monitoring_m3db-cluster_m3db_embedded.json","error":"error opening cache file /var/lib/m3kv/_kv_monitoring_m3db-cluster_m3db_embedded.json: open /var/lib/m3kv/_kv_monitoring_m3db-cluster_m3db_embedded.json: no such file or directory"}
{"level":"info","ts":1603975530.5168633,"msg":"node tchannelthrift: listening","address":"0.0.0.0:9000"}
{"level":"info","ts":1603975530.517485,"msg":"node httpjson: listening","address":"0.0.0.0:9002"}
{"level":"info","ts":1603975530.5175393,"msg":"waiting for dynamic topology initialization, if this takes a long time, make sure that a topology/placement is configured"}
{"level":"info","ts":1603975530.5175579,"msg":"adding a watch","service":"m3db","env":"monitoring/m3db-cluster","zone":"embedded","includeUnhealthy":true}
{"level":"info","ts":1603975530.5176663,"msg":"successfully created new cache dir","path":"/var/lib/m3kv","mode":493}
{"level":"warn","ts":1603975530.517716,"msg":"could not load cache from file","file":"/var/lib/m3kv/m3db_embedded.json","error":"error opening cache file /var/lib/m3kv/m3db_embedded.json: open /var/lib/m3kv/m3db_embedded.json: no such file or directory"}
{"level":"warn","ts":1603975534.6081727,"msg":"invalid configuration found, refer to linked documentation for more information","url":"https://docs.m3db.io/operational_guide/kernel_configuration","error":"current value for RLIMIT_NOFILE(1048576) is below recommended threshold(3000000)\nmax value for RLIMIT_NOFILE(1048576) is below recommended threshold(3000000)\ncurrent value for vm.max_map_count(655360) is below recommended threshold(3000000)","errorCauses":[{"error":"current value for vm.max_map_count(655360) is below recommended threshold(3000000)"},{"error":"current value for RLIMIT_NOFILE(1048576) is below recommended threshold(3000000)"},{"error":"max value for RLIMIT_NOFILE(1048576) is below recommended threshold(3000000)"}]}
{"level":"info","ts":1603975543.699024,"msg":"successfully created new cache dir","path":"/var/lib/m3kv","mode":493}
{"level":"warn","ts":1603975543.6991584,"msg":"could not load cache from file","file":"/var/lib/m3kv/_kv_monitoring_m3db-cluster-without-etcd_m3db_embedded.json","error":"error opening cache file /var/lib/m3kv/_kv_monitoring_m3db-cluster-without-etcd_m3db_embedded.json: open /var/lib/m3kv/_kv_monitoring_m3db-cluster-without-etcd_m3db_embedded.json: no such file or directory"}
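For anyone scanning a log like the one above, here is a minimal sketch that keeps only entries at `warn` level or higher and converts the epoch `ts` field to UTC. The function name `summarize_log` is hypothetical, and the only fields assumed are the ones visible in the log lines (`level`, `ts`, `msg`).

```python
import json
from datetime import datetime, timezone

# Severity order used for filtering; unknown levels are treated as "info".
LEVELS = {"debug": 0, "info": 1, "warn": 2, "error": 3, "fatal": 4}


def summarize_log(lines, min_level="warn"):
    """Return (utc_time, level, msg) tuples for entries at or above min_level."""
    threshold = LEVELS[min_level]
    result = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip truncated or non-JSON lines
        if not isinstance(entry, dict):
            continue
        level = entry.get("level", "info")
        if LEVELS.get(level, 1) < threshold:
            continue
        ts = entry.get("ts")
        when = ""
        if isinstance(ts, (int, float)):
            # "ts" is seconds since the Unix epoch, with a fractional part.
            when = datetime.fromtimestamp(ts, tz=timezone.utc).isoformat(timespec="seconds")
        result.append((when, level, entry.get("msg", "")))
    return result
```

It can be fed any iterable of lines, for example an open file handle for a saved copy of the pod log.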

m3db-operator log

2020-10-30T08:47:18.325Z    DEBUG   namespace retrieved {"controller": "m3db-cluster-controller"}
2020-10-30T08:47:18.426Z    DEBUG   response received   {"controller": "m3db-cluster-controller", "action": "GET", "url": "http://m3coordinator-m3db-cluster.monitoring:7201/api/v1/services/m3db/placement", "requestDump": "GET /api/v1/services/m3db/placement HTTP/1.1\r\nHost: m3coordinator-m3db-cluster.monitoring:7201\r\nCluster-Environment-Name: monitoring/m3db-cluster\r\nContent-Type: application/json\r\n\r\n", "status": "200 OK", "responseDump": "HTTP/1.1 200 OK\r\nConnection: close\r\nAccess-Control-Allow-Headers: accept, content-type, authorization\r\nAccess-Control-Allow-Methods: POST, GET, OPTIONS, PUT, DELETE\r\nAccess-Control-Allow-Origin: *\r\nContent-Type: application/json\r\nDate: Fri, 30 Oct 2020 08:47:18 GMT\r\n\r\n{\"placement\":{\"instances\":{\"{\\\"name\\\":\\\"m3db-cluster-rep0-0\\\",\\\"nodeName\\\":\\\"172.20.51.11\\\"}\":{\"id\":\"{\\\"name\\\":\\\"m3db-cluster-rep0-0\\\",\\\"nodeName\\\":\\\"172.20.51.11\\\"}\",\"isolationGroup\":\"group1\",\"zone\":\"embedded\",\"weight\":100,\"endpoint\":\"m3db-cluster-rep0-0.m3dbnode-m3db-cluster:9000\",\"shards\":[{\"id\":0,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":1,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":2,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":3,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":4,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":5,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":6,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":7,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":8,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{
\"id\":9,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":10,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":11,\"state\":\"INITIALIZING\",\"sourceId\":\"\",\"cutoverNanos\":\"0\",\"cutoffNanos\":\"0\"},{\"id\":12,\"state\":\"INITIALIZING

2020-10-30T08:47:18.453Z    INFO    successfully synced item    {"controller": "m3db-cluster-controller", "key": "monitoring/m3db-cluster"}
2020-10-30T08:47:18.453Z    INFO    Event(v1.ObjectReference{Kind:"M3DBCluster", Namespace:"monitoring", Name:"m3db-cluster", UID:"779a3e68-1a8c-11eb-a926-44a842238374", APIVersion:"operator.m3db.io/v1alpha1", ResourceVersion:"561929489", FieldPath:""}): type: 'Warning' reason: 'TimeLongerThanUsual' current unavailable instances: 3   {"controller": "m3db-cluster-controller"}
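The placement response in the operator log above is long and cut off, so it is hard to see at a glance whether any shard has left `INITIALIZING`. The sketch below counts shard states per instance. It assumes the full response body of `GET /api/v1/services/m3db/placement` has been saved and parsed as JSON, and it relies only on the field names visible in the log (`placement`, `instances`, `endpoint`, `shards`, `state`). The function name `shard_states` is hypothetical.

```python
from collections import Counter


def shard_states(placement_response):
    """Map each instance endpoint to a Counter of its shard states."""
    instances = placement_response.get("placement", {}).get("instances", {})
    summary = {}
    for instance_id, instance in instances.items():
        states = Counter(
            shard.get("state", "UNKNOWN") for shard in instance.get("shards", [])
        )
        # Fall back to the instance ID when no endpoint is present.
        summary[instance.get("endpoint", instance_id)] = states
    return summary
```

If every instance reports only `INITIALIZING`, as in the excerpt above, the placement exists but no node has finished bootstrapping its shards.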

m3db-operator does not support etcd over SSL/TLS.
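To rule out a problem with the certificate files themselves, a TLS handshake against one etcd endpoint can be tried with the same CA, certificate, and key paths used in the ConfigMap. This is only a sketch using the Python standard library, and the helper names are hypothetical. The fallback to port 2379 is an assumption for URLs without an explicit port. A completed handshake is a first check, not proof: with TLS 1.3 a rejected client certificate may only surface on a later read.

```python
import socket
import ssl
from urllib.parse import urlsplit


def parse_endpoint(url):
    """Split an endpoint URL into (host, port); port 2379 is an assumed default."""
    parts = urlsplit(url)
    return parts.hostname, parts.port or 2379


def check_mutual_tls(url, ca_path, cert_path, key_path, timeout=5.0):
    """Attempt a TLS handshake with a client certificate; return the TLS version."""
    host, port = parse_endpoint(url)
    # Trust only the given CA and present the client certificate and key.
    context = ssl.create_default_context(cafile=ca_path)
    context.load_cert_chain(certfile=cert_path, keyfile=key_path)
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.version()
```

Run from a machine or pod that can reach etcd and has the files, a call would look like `check_mutual_tls("https://172.20.51.11:2379", "/etc/m3db/ca.pem", "/etc/m3db/etcd.pem", "/etc/m3db/etcd-key.pem")`. An `ssl.SSLError` here would point at the certificates rather than at the operator.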

gemiit commented 3 years ago

image