thingsboard / thingsboard-ce-k8s

ThingsBoard Community Edition Kubernetes scripts and docs
Apache License 2.0

kafka configuration for microservices deploy on GCP: no space left on device #50

Open ilseva opened 2 years ago

ilseva commented 2 years ago

Hi! We successfully deployed version 3.3.2 on Google Kubernetes Engine, but with only 4 devices that monitor energy consumption and 6 rule chains that perform simple anomaly detection, the Kafka pod consumes all the space dedicated to the logs.

We didn't change the configuration contained in thirdparty.yml.

The question is: did you test this kafka configuration with a certain number of devices and rule chains?

mistadave commented 2 years ago

I've had the same problem. It's caused by the combination of the given volume size and the Kafka config.

In the Kafka env, log retention is set to ~104 MB for each topic, but only 200 Mi of storage is claimed for the volume, so Kafka will run out of space on the device with the given YAML config!

The fix is to set the volume claim to the correct size, depending on your number of topics and replicas (default 1).

ThingsBoard transports will also create topics, depending on whether you deploy only HTTP or additional transports such as MQTT, CoAP, etc. I'm not sure where these topics are created. I also deployed the MQTT transport and got 4 topics in addition to those given in the env config.

My setup (HTTP and MQTT transports, rest default) has 74 topics, which means Kafka will consume at least 740 MB under the given topic retention.bytes config.

My fix was to set the volume claim of **logs** to 2Gi. But I'm not sure whether further topics will be created for other rule chains, etc.

See this Stack Overflow question: "How to calculate the value of log retention bytes".
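One subtlety worth calling out: in Kafka, retention.bytes is enforced per partition, not per topic, so the worst-case footprint of a topic is roughly partitions × (retention.bytes + one in-flight segment). A back-of-envelope sketch in Python, using the partition counts and sizes from the KAFKA_CREATE_TOPICS value in thirdparty.yml (the helper function is mine, for illustration only):

```python
# Back-of-envelope worst case for the Kafka log volume.
# Assumption: retention.bytes is a per-partition cap, and each partition
# can hold roughly one extra (active) segment beyond that cap.

def topic_footprint(partitions, retention_bytes, segment_bytes, replicas=1):
    """Upper-bound bytes one topic can occupy on disk."""
    return partitions * (retention_bytes + segment_bytes) * replicas

# (partitions, retention.bytes, segment.bytes) from KAFKA_CREATE_TOPICS
topics = {
    "js_eval.requests":          (100, 104_857_600, 26_214_400),
    "tb_transport.api.requests": (30,  104_857_600, 26_214_400),
    "tb_rule_engine":            (30,  104_857_600, 26_214_400),
}

total = sum(topic_footprint(*cfg) for cfg in topics.values())
print(f"worst case: {total / 2**30:.1f} GiB")  # prints: worst case: 19.5 GiB
```

In practice the short retention.ms usually trims data sooner, but the byte cap is the bound you should size the volume against, and it is far above the default 200Mi claim.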

Here is the snippet, with some comments, of the default YAML config from the thirdparty.yml file:

```yaml
env:
  ...
  - name: KAFKA_CREATE_TOPICS # retention.bytes set to 104857600 bytes (~104 MB) per topic
    value: "js_eval.requests:100:1:delete --config=retention.ms=60000 --config=segment.bytes=26214400 --config=retention.bytes=104857600,tb_transport.api.requests:30:1:delete --config=retention.ms=60000 --config=segment.bytes=26214400 --config=retention.bytes=104857600,tb_rule_engine:30:1:delete --config=retention.ms=60000 --config=segment.bytes=26214400 --config=retention.bytes=104857600"
  - name: KAFKA_AUTO_CREATE_TOPICS_ENABLE
    value: "false"
  - name: KAFKA_LOG_RETENTION_BYTES # 1073741824 bytes = 1 GiB
    value: "1073741824"
  - name: KAFKA_LOG_SEGMENT_BYTES # 268435456 bytes = 256 MiB (~268 MB)
    value: "268435456"
  - name: KAFKA_LOG_RETENTION_MS
    value: "300000"
  - name: KAFKA_LOG_CLEANUP_POLICY
    value: "delete"
  - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
    value: "1"
  - name: KAFKA_TRANSACTION_STATE_LOG_MIN_ISR
    value: "1"
  - name: KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR
    value: "1"
  - name: KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS
    value: "3000"
  - name: KAFKA_PORT
    value: "9092"
  - name: KAFKA_LOG_DIRS # Path where Kafka stores its log segments
    value: "/kafka-logs/kafka"
volumeMounts:
  - name: logs
    mountPath: /kafka-logs
    readOnly: false
  - name: start
...
volumeClaimTemplates:
  - metadata:
      name: logs
    spec:
      accessModes:
        - ReadWriteOnce
      resources:
        requests:
          storage: 200Mi # Storage size for the Kafka log volume; increase this to fit the topic retention
```
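For what it's worth, the KAFKA_CREATE_TOPICS value follows the format of the wurstmeister/kafka image (which these scripts appear to use): `name:partitions:replicas:cleanup.policy` followed by `--config=key=value` overrides, with topics separated by commas. A small sketch that parses it, handy for double-checking the partition counts and byte values (the parser itself is mine, not part of ThingsBoard):

```python
# Parse a KAFKA_CREATE_TOPICS value into a dict of per-topic settings.
# Format (wurstmeister/kafka image): "name:partitions:replicas:policy
# --config=k=v ...", with multiple topics separated by commas.

def parse_create_topics(value):
    topics = {}
    for entry in value.split(","):
        head, *cfg_parts = entry.split(" --config=")
        name, partitions, replicas, policy = head.split(":")
        cfg = dict(p.split("=", 1) for p in cfg_parts)
        topics[name] = {
            "partitions": int(partitions),
            "replicas": int(replicas),
            "cleanup.policy": policy,
            **cfg,  # override values stay as strings
        }
    return topics

spec = ("js_eval.requests:100:1:delete --config=retention.ms=60000 "
        "--config=segment.bytes=26214400 --config=retention.bytes=104857600")
print(parse_create_topics(spec)["js_eval.requests"]["partitions"])  # → 100
```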
ilseva commented 2 years ago

Thanks @mistadave! In the end I resolved my problem after some attempts at changing the Kafka configuration and the tb-node configuration. The ThingsBoard backend creates topics on Kafka with default values for retention and segment size. This is my tb-kafka-configmap.yaml:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: tb-kafka-config
  namespace: thingsboard
  labels:
    name: tb-kafka-config
data:
  TB_QUEUE_TYPE: kafka
  TB_KAFKA_SERVERS: tb-kafka:9092
  TB_QUEUE_KAFKA_CORE_TOPIC_PROPERTIES: cleanup.policy:delete;segment.ms:60000;retention.ms:300000;segment.bytes:52428800;retention.bytes:52428800;partitions:1;min.insync.replicas:1
  TB_QUEUE_KAFKA_RE_TOPIC_PROPERTIES: cleanup.policy:delete;segment.ms:60000;retention.ms:300000;segment.bytes:52428800;retention.bytes:52428800;partitions:1;min.insync.replicas:1
```
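The TB_QUEUE_KAFKA_*_TOPIC_PROPERTIES values are semicolon-separated key:value pairs. A quick way to check that such a string is well-formed before deploying (the helper is illustrative, not ThingsBoard's own parser):

```python
# Split a ThingsBoard topic-properties string ("k:v;k:v;...") into a dict.

def parse_topic_properties(value):
    return dict(pair.split(":", 1) for pair in value.split(";"))

props = parse_topic_properties(
    "cleanup.policy:delete;segment.ms:60000;retention.ms:300000;"
    "segment.bytes:52428800;retention.bytes:52428800;partitions:1;"
    "min.insync.replicas:1"
)
print(props["retention.bytes"])  # → 52428800
```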

and this is the env section of the tb-kafka service in thirdparty.yml:

```yaml
env:
  - name: BROKER_ID_COMMAND
    value: "hostname | cut -d'-' -f3"
  - name: KAFKA_ZOOKEEPER_CONNECT
    value: "zookeeper:2181"
  - name: KAFKA_LISTENERS
    value: "INSIDE://:9092"
  - name: KAFKA_ADVERTISED_LISTENERS
    value: "INSIDE://:9092"
  - name: KAFKA_LISTENER_SECURITY_PROTOCOL_MAP
    value: "INSIDE:PLAINTEXT"
  - name: KAFKA_INTER_BROKER_LISTENER_NAME
    value: "INSIDE"
  - name: KAFKA_CREATE_TOPICS
    value: "js_eval.requests:100:1:delete --config=segment.ms=60000 --config=retention.ms=60000 --config=retention.bytes=524288000 --config=segment.bytes=5242880,tb_transport.api.requests:30:1:delete --config=segment.ms=60000 --config=retention.ms=60000 --config=retention.bytes=157286400 --config=segment.bytes=5242880,tb_rule_engine:30:1:delete --config=segment.ms=60000 --config=retention.ms=60000 --config=retention.bytes=157286400 --config=segment.bytes=5242880"
  - name: KAFKA_AUTO_CREATE_TOPICS_ENABLE
    value: "false"
  - name: KAFKA_LOG_SEGMENT_BYTES
    value: "52428800"
  - name: KAFKA_LOG_RETENTION_BYTES
    value: "52428800"
  - name: KAFKA_LOG_RETENTION_MS
    value: "300000"
  - name: KAFKA_LOG_CLEANUP_POLICY
    value: "delete"
  - name: KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR
    value: "1"
  - name: KAFKA_TRANSACTION_STATE_LOG_MIN_ISR
    value: "1"
  - name: KAFKA_TRANSACTION_STATE_LOG_REPLICATION_FACTOR
    value: "1"
  - name: KAFKA_ZOOKEEPER_CONNECTION_TIMEOUT_MS
    value: "3000"
  - name: KAFKA_PORT
    value: "9092"
  - name: KAFKA_LOG_DIRS
    value: "/kafka-logs/kafka"
  - name: KAFKA_LOG_RETENTION_CHECK_INTERVAL_MS
    value: "300000"
  - name: KAFKA_OFFSETS_TOPIC_SEGMENT_BYTES
    value: "5242880"
  - name: KAFKA_OFFSETS_RETENTION_MINUTES
    value: "5"
  - name: KAFKA_LOG_ROLL_HOURS
    value: "2"
  - name: KAFKA_LOG_CLEANER_THREADS
    value: "8"
```
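One caveat worth noting: retention.bytes is a per-partition cap in Kafka, so the byte caps in this KAFKA_CREATE_TOPICS value still allow a lot of disk in the worst case; here it is the short retention.ms (60 s) that keeps actual usage low. The arithmetic, as a sketch:

```python
# Per-partition byte caps from the KAFKA_CREATE_TOPICS value above:
# (partitions, retention.bytes, segment.bytes). Arithmetic only.
topics = [
    (100, 524_288_000, 5_242_880),  # js_eval.requests
    (30,  157_286_400, 5_242_880),  # tb_transport.api.requests
    (30,  157_286_400, 5_242_880),  # tb_rule_engine
]

cap = sum(p * (retention + segment) for p, retention, segment in topics)
print(f"byte-cap worst case: {cap / 2**30:.1f} GiB")  # prints: byte-cap worst case: 58.4 GiB
```

So the volume claim should still leave generous headroom, since the byte cap only kicks in once retention.ms has failed to keep up.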

Maybe it can be useful for someone...