confluentinc / cp-helm-charts

The Confluent Platform Helm charts enable you to deploy Confluent Platform services on Kubernetes for development, test, and proof of concept environments.
https://cnfl.io/getting-started-kafka-kubernetes
Apache License 2.0

Deploying cluster on K8s with ArgoCD #632

Closed sfl0r3nz05 closed 1 year ago

sfl0r3nz05 commented 1 year ago

Hi. I am trying to deploy the same kafka cluster (already deployed in minikube) on k8s.

image

As you can see, the pods have not been deployed properly and are reporting connection errors.

Error logs for kafka-cp-control-center:

[main] WARN org.apache.kafka.clients.ClientUtils - Couldn't resolve server PLAINTEXT://kafka-cp-kafka-headless:9092 from bootstrap.servers as DNS resolution failed for kafka-cp-kafka-headless
[main] ERROR io.confluent.admin.utils.cli.KafkaReadyCommand - Error while running kafka-ready.
org.apache.kafka.common.KafkaException: Failed to create new KafkaAdminClient
    at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:535)
    at org.apache.kafka.clients.admin.Admin.create(Admin.java:75)
    at org.apache.kafka.clients.admin.AdminClient.create(AdminClient.java:49)
    at io.confluent.admin.utils.ClusterStatus.isKafkaReady(ClusterStatus.java:138)
    at io.confluent.admin.utils.cli.KafkaReadyCommand.main(KafkaReadyCommand.java:150)
Caused by: org.apache.kafka.common.config.ConfigException: No resolvable bootstrap urls given in bootstrap.servers
    at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:89)
    at org.apache.kafka.clients.ClientUtils.parseAndValidateAddresses(ClientUtils.java:48)
    at org.apache.kafka.clients.admin.KafkaAdminClient.createInternal(KafkaAdminClient.java:489)
    ... 4 more

Error logs for kafka-cp-kafka-connect:

Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 26 more

Error logs for kafka-cp-kafka-rest:

Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 27 more

Error logs for kafka-cp-ksql-server:

Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 26 more

Any recommendations from anyone who has deployed the cluster on k8s?

OneCricketeer commented 1 year ago

DNS resolution failed for kafka-cp-kafka-headless

Your image shows no healthy Kafka pods.

Every other component depends on Kafka, so they will all fail to start while Kafka is not running.
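To confirm the DNS part independently of the Kafka clients, a name lookup can be run from any pod in the same namespace. This is only a sketch: the helper name can_resolve is hypothetical, and the Service name is the one from the log line quoted above. A headless Service name typically fails to resolve when the Service does not exist under that name or has no ready pods behind it.

```python
import socket


def can_resolve(hostname: str) -> bool:
    """Return True if the name resolves to at least one address."""
    try:
        return len(socket.getaddrinfo(hostname, None)) > 0
    except (socket.gaierror, UnicodeError):
        # gaierror: the resolver could not find the name.
        return False


if __name__ == "__main__":
    # Name taken from the WARN line in the control-center log; run this
    # from a pod in the same namespace as the Kafka release.
    name = "kafka-cp-kafka-headless"
    print(name, "resolves" if can_resolve(name) else "does not resolve")
```

If the name does not resolve, compare it with the Service names that kubectl get svc reports for the release.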

sfl0r3nz05 commented 1 year ago

[Update]

Thanks @OneCricketeer. In the values.yaml file I set the bootstrapServers of all components to "http://confluentinc-cp-kafka:9092".

root@master:~/cp-helm-charts# kubectl get all -n kafka
NAME                                                  READY   STATUS             RESTARTS         AGE
pod/confluentinc-cp-control-center-5cd976b498-7lkn6   1/1     Running            0                131m
pod/confluentinc-cp-zookeeper-0                       2/2     Running            0                131m
pod/confluentinc-cp-ksql-server-5bb5d679d9-5jvjk      2/2     Running            0                131m
pod/confluentinc-cp-kafka-0                           2/2     Running            0                131m
pod/confluentinc-cp-kafka-rest-58bd995bf4-f8kpf       2/2     Running            1 (129m ago)     131m
pod/confluentinc-cp-schema-registry-d458757c9-z229g   1/2     CrashLoopBackOff   28 (31s ago)     131m
pod/confluentinc-cp-kafka-connect-69f74b8665-jvnxh    1/2     Error              25 (6m25s ago)   131m

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/confluentinc-cp-kafka-headless       ClusterIP   None             <none>        9092/TCP            131m
service/confluentinc-cp-zookeeper-headless   ClusterIP   None             <none>        2888/TCP,3888/TCP   131m
service/confluentinc-cp-zookeeper            ClusterIP   10.152.183.188   <none>        2181/TCP,5556/TCP   131m
service/confluentinc-cp-control-center       ClusterIP   10.152.183.32    <none>        9021/TCP            131m
service/confluentinc-cp-kafka-connect        ClusterIP   10.152.183.167   <none>        8083/TCP,5556/TCP   131m
service/confluentinc-cp-kafka-rest           ClusterIP   10.152.183.85    <none>        8082/TCP,5556/TCP   131m
service/confluentinc-cp-ksql-server          ClusterIP   10.152.183.142   <none>        8088/TCP,5556/TCP   131m
service/confluentinc-cp-kafka                ClusterIP   10.152.183.169   <none>        9092/TCP,5556/TCP   131m
service/confluentinc-cp-schema-registry      ClusterIP   10.152.183.119   <none>        8081/TCP,5556/TCP   131m

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/confluentinc-cp-control-center    1/1     1            1           131m
deployment.apps/confluentinc-cp-ksql-server       1/1     1            1           131m
deployment.apps/confluentinc-cp-kafka-rest        1/1     1            1           131m
deployment.apps/confluentinc-cp-schema-registry   0/1     1            0           131m
deployment.apps/confluentinc-cp-kafka-connect     0/1     1            0           131m

NAME                                                        DESIRED   CURRENT   READY   AGE
replicaset.apps/confluentinc-cp-control-center-5cd976b498   1         1         1       131m
replicaset.apps/confluentinc-cp-ksql-server-5bb5d679d9      1         1         1       131m
replicaset.apps/confluentinc-cp-kafka-rest-58bd995bf4       1         1         1       131m
replicaset.apps/confluentinc-cp-schema-registry-d458757c9   1         1         0       131m
replicaset.apps/confluentinc-cp-kafka-connect-69f74b8665    1         1         0       131m

NAME                                         READY   AGE
statefulset.apps/confluentinc-cp-zookeeper   1/1     131m
statefulset.apps/confluentinc-cp-kafka       1/1     131m

Right now, these are my 2 issues:

For cp-schema-registry-server:

(io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2023-01-12 17:54:57,122] INFO Logging initialized @1102ms to org.eclipse.jetty.util.log.Slf4jLog (org.eclipse.jetty.util.log)
[2023-01-12 17:54:57,132] INFO Initial capacity 128, increased by 64, maximum capacity 2147483647. (io.confluent.rest.ApplicationServer)
[2023-01-12 17:54:57,273] INFO Adding listener: http://0.0.0.0:8081 (io.confluent.rest.ApplicationServer)
[2023-01-12 17:54:57,600] WARN Ignoring Kafka broker endpoint http://confluentinc-cp-kafka:9092 that does not match the setting for kafkastore.security.protocol=PLAINTEXT (io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig)
[2023-01-12 17:54:57,601] ERROR Server died unexpectedly:  (io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain)
org.apache.kafka.common.config.ConfigException: No supported Kafka endpoints are configured. Either kafkastore.bootstrap.servers must have at least one endpoint matching kafkastore.security.protocol or broker endpoints loaded from ZooKeeper must have at least one endpoint matching kafkastore.security.protocol.
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.endpointsToBootstrapServers(SchemaRegistryConfig.java:716)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryConfig.bootstrapBrokers(SchemaRegistryConfig.java:651)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.kafkaClusterId(KafkaSchemaRegistry.java:1280)
        at io.confluent.kafka.schemaregistry.storage.KafkaSchemaRegistry.<init>(KafkaSchemaRegistry.java:158)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.initSchemaRegistry(SchemaRegistryRestApplication.java:69)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryRestApplication.configureBaseApplication(SchemaRegistryRestApplication.java:88)
        at io.confluent.rest.Application.configureHandler(Application.java:255)
        at io.confluent.rest.ApplicationServer.doStart(ApplicationServer.java:227)
        at org.eclipse.jetty.util.component.AbstractLifeCycle.start(AbstractLifeCycle.java:72)
        at io.confluent.kafka.schemaregistry.rest.SchemaRegistryMain.main(SchemaRegistryMain.java:43)

For prometheus-jmx-exporter:

root@master:~/cp-helm-charts# kubectl logs pod/confluentinc-cp-kafka-connect-69f74b8665-jvnxh prometheus-jmx-exporter  -n kafka

Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
        java.net.ConnectException: Connection refused (Connection refused)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
        at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
        at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
        at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338)
        at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112)
        at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132)
        ... 21 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:211)
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
        ... 26 more

OneCricketeer commented 1 year ago

You can ignore Prometheus since the Registry server isn't started.

The error is here: "WARN Ignoring Kafka broker endpoint http://". Kafka isn't an HTTP service, so remove the protocol prefix from the broker endpoint.
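As a rough illustration of that fix, the sketch below strips an http:// or https:// prefix from each entry of a bootstrapServers value before it goes into values.yaml. The helper name normalize_bootstrap_servers is hypothetical and not part of the charts or of any Kafka client. It leaves Kafka listener prefixes such as PLAINTEXT:// untouched.

```python
import re

# Hypothetical helper, not part of cp-helm-charts or any Kafka client.
_HTTP_SCHEME = re.compile(r"^https?://", re.IGNORECASE)


def normalize_bootstrap_servers(value: str) -> str:
    """Strip http(s):// from each entry of a comma-separated host:port list."""
    entries = [entry.strip() for entry in value.split(",") if entry.strip()]
    return ",".join(_HTTP_SCHEME.sub("", entry) for entry in entries)


if __name__ == "__main__":
    print(normalize_bootstrap_servers("http://confluentinc-cp-kafka:9092"))
```

Run as a script, this prints confluentinc-cp-kafka:9092, which is the form the Kafka clients expect.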

sfl0r3nz05 commented 1 year ago

Thanks again @OneCricketeer.

After changing the broker endpoint to confluentinc-cp-kafka:9092, the cluster works properly. Only the Prometheus containers are still being restarted, due to the issue you mentioned.

root@master:~/cp-helm-charts# kubectl get all -n kafka
NAME                                                   READY   STATUS    RESTARTS      AGE
pod/confluentinc-cp-control-center-68b5f664d-wbg65     1/1     Running   0             6m10s
pod/confluentinc-cp-zookeeper-0                        2/2     Running   0             6m10s
pod/confluentinc-cp-kafka-0                            2/2     Running   0             6m10s
pod/confluentinc-cp-kafka-rest-7c5887d5fb-b7zzr        2/2     Running   0             6m10s
pod/confluentinc-cp-ksql-server-78487b764c-7mwt8       2/2     Running   1             6m10s
pod/confluentinc-cp-kafka-connect-5d68fc9947-xzzlr     2/2     Running   2 (79s ago)   6m10s
pod/confluentinc-cp-schema-registry-7747f7fd9c-wsl25   2/2     Running   2 (36s ago)   6m10s

NAME                                         TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)             AGE
service/confluentinc-cp-kafka-headless       ClusterIP   None             <none>        9092/TCP            6m10s
service/confluentinc-cp-zookeeper-headless   ClusterIP   None             <none>        2888/TCP,3888/TCP   6m10s
service/confluentinc-cp-kafka-rest           ClusterIP   10.152.183.169   <none>        8082/TCP,5556/TCP   6m10s
service/confluentinc-cp-zookeeper            ClusterIP   10.152.183.39    <none>        2181/TCP,5556/TCP   6m10s
service/confluentinc-cp-kafka-connect        ClusterIP   10.152.183.25    <none>        8083/TCP,5556/TCP   6m10s
service/confluentinc-cp-control-center       ClusterIP   10.152.183.23    <none>        9021/TCP            6m10s
service/confluentinc-cp-kafka                ClusterIP   10.152.183.28    <none>        9092/TCP,5556/TCP   6m10s
service/confluentinc-cp-ksql-server          ClusterIP   10.152.183.91    <none>        8088/TCP,5556/TCP   6m10s
service/confluentinc-cp-schema-registry      ClusterIP   10.152.183.247   <none>        8081/TCP,5556/TCP   6m10s

NAME                                              READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/confluentinc-cp-control-center    1/1     1            1           6m10s
deployment.apps/confluentinc-cp-kafka-rest        1/1     1            1           6m10s
deployment.apps/confluentinc-cp-ksql-server       1/1     1            1           6m10s
deployment.apps/confluentinc-cp-kafka-connect     1/1     1            1           6m10s
deployment.apps/confluentinc-cp-schema-registry   1/1     1            1           6m10s

NAME                                                         DESIRED   CURRENT   READY   AGE
replicaset.apps/confluentinc-cp-control-center-68b5f664d     1         1         1       6m10s
replicaset.apps/confluentinc-cp-kafka-rest-7c5887d5fb        1         1         1       6m10s
replicaset.apps/confluentinc-cp-ksql-server-78487b764c       1         1         1       6m10s
replicaset.apps/confluentinc-cp-kafka-connect-5d68fc9947     1         1         1       6m10s
replicaset.apps/confluentinc-cp-schema-registry-7747f7fd9c   1         1         1       6m10s

NAME                                         READY   AGE
statefulset.apps/confluentinc-cp-zookeeper   1/1     6m10s
statefulset.apps/confluentinc-cp-kafka       1/1     6m10s

This is the configuration provided in the values.yaml file:

## ------------------------------------------------------
## Zookeeper
## ------------------------------------------------------
cp-zookeeper:
  enabled: true
  servers: 1
  image: confluentinc/cp-zookeeper
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  persistence:
    enabled: false
    ## The size of the PersistentVolume to allocate to each Zookeeper Pod in the StatefulSet. For
    ## production servers this number should likely be much larger.
    ##
    ## Size for Data dir, where ZooKeeper will store the in-memory database snapshots.
    dataDirSize: 10Gi
    # dataDirStorageClass: ""
    ## Size for data log dir, which is a dedicated log device to be used, and helps avoid competition between logging and snaphots.
    dataLogDirSize: 10Gi
    # dataLogDirStorageClass: ""

  # TODO: find correct security context for user in this zk-image  
  securityContext: 
    runAsUser: 0

  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi

## ------------------------------------------------------
## Kafka
## ------------------------------------------------------
cp-kafka:
  enabled: true
  brokers: 1
  image: confluentinc/cp-enterprise-kafka
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  persistence:
    enabled: false
    #storageClass: ""
    size: 5Gi
    disksPerBroker: 1
  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi
  securityContext: 
    runAsUser: 0

## ------------------------------------------------------
## Schema Registry
## ------------------------------------------------------
cp-schema-registry:
  enabled: true
  image: confluentinc/cp-schema-registry
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi
  kafka:
    bootstrapServers: "confluentinc-cp-kafka:9092"

## ------------------------------------------------------
## REST Proxy
## ------------------------------------------------------
cp-kafka-rest:
  enabled: true
  image: confluentinc/cp-kafka-rest
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi
  cp-kafka:
    bootstrapServers: "confluentinc-cp-kafka:9092"

## ------------------------------------------------------
## Kafka Connect
## ------------------------------------------------------
cp-kafka-connect:
  enabled: true
  image: confluentinc/cp-kafka-connect
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi
  kafka:
    bootstrapServers: "confluentinc-cp-kafka:9092"

## ------------------------------------------------------
## KSQL Server
## ------------------------------------------------------
cp-ksql-server:
  enabled: true
  image: confluentinc/cp-ksqldb-server
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  ksql:
    headless: false
  kafka:
    bootstrapServers: "confluentinc-cp-kafka:9092"

## ------------------------------------------------------
## Control Center
## ------------------------------------------------------
cp-control-center:
  enabled: true
  image: confluentinc/cp-enterprise-control-center
  imageTag: 6.1.0
  ## Optionally specify an array of imagePullSecrets. Secrets must be manually created in the namespace.
  ## https://kubernetes.io/docs/concepts/containers/images/#specifying-imagepullsecrets-on-a-pod
  imagePullSecrets:
  #  - name: "regcred"
  heapOptions: "-Xms512M -Xmx512M"
  resources: {}
  ## If you do want to specify resources, uncomment the following lines, adjust them as necessary,
  ## and remove the curly braces after 'resources:'
  #  limits:
  #   cpu: 100m
  #   memory: 128Mi
  #  requests:
  #   cpu: 100m
  #   memory: 128Mi
  configurationOverrides:
    "replication.factor": "1"
  kafka:
    bootstrapServers: "confluentinc-cp-kafka:9092"
  cp-zookeeper:
  ## If the Zookeeper Chart is disabled a URL and port are required to connect
    url: "http://confluentinc-cp-zookeeper:2181"

Could you please recommend a way to enable Prometheus to activate monitoring?

OneCricketeer commented 1 year ago

enable Prometheus to activate monitoring?

There's a separate project from CoreOS, called prom-operator or kube-prometheus, that defines the ServiceMonitor resource type that you'd want.

sfl0r3nz05 commented 1 year ago

Thanks, @OneCricketeer.

In the same values.yaml I have declared Prometheus like this for each entity:

  ## Monitoring
  ## Kafka JMX Settings
  ## ref: https://docs.confluent.io/current/kafka/monitoring.html
  jmx:
    port: 5555

  ## Prometheus Exporter Configuration
  ## ref: https://prometheus.io/docs/instrumenting/exporters/
  prometheus:
    ## JMX Exporter Configuration
    ## ref: https://github.com/prometheus/jmx_exporter
    jmx:
      enabled: true
      image: solsson/kafka-prometheus-jmx-exporter@sha256
      imageTag: 6f82e2b0464f50da8104acd7363fb9b995001ddff77d248379f8788e78946143
      port: 5556

      ## Resources configuration for the JMX exporter container.
      ## See the `resources` documentation above for details.
      resources: {}

However, the "Connection refused" error persists for the Prometheus agents:

Caused by: java.net.ConnectException: Connection refused (Connection refused)
        at java.net.PlainSocketImpl.socketConnect(Native Method)
        at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
        at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
        at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
        at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
        at java.net.Socket.connect(Socket.java:589)
        at java.net.Socket.connect(Socket.java:538)
        at java.net.Socket.<init>(Socket.java:434)
        at java.net.Socket.<init>(Socket.java:211)
        at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
        at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
        at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
        ... 26 more

What are these agents trying to connect to? Prometheus itself is not enabled by default when deploying the whole cluster.

OneCricketeer commented 1 year ago

You've cut off the stacktrace. In your previous post, you have java.rmi.ConnectException: Connection refused to host.

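You don't need to deploy Prometheus itself to get the JMX exporter working. Once it's working, you will have Service definitions through which you can call the /metrics endpoint over HTTP on the exporter port of each pod. In the values you posted, that is 5556; 5555 is the JMX port the exporter itself connects to.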
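A quick way to check an exporter without Prometheus is to fetch its metrics page directly. The sketch below assumes the exporter's HTTP port is 5556 (the prometheus.jmx.port value and the Service port shown earlier in the thread) and that a port-forward to one of the Services is already running. The function names fetch_metrics and metric_names are hypothetical.

```python
import urllib.request


def fetch_metrics(url: str, timeout: float = 5.0) -> str:
    """Return the exporter's plain-text metrics page."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text: str) -> set:
    """Collect metric names from Prometheus text-format output."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and HELP/TYPE comments
        # The name ends at the first '{' (labels) or space (value).
        names.add(line.split("{", 1)[0].split(" ", 1)[0])
    return names


if __name__ == "__main__":
    # Assumes a port-forward is already running, for example:
    #   kubectl port-forward -n kafka svc/confluentinc-cp-kafka-connect 5556:5556
    try:
        text = fetch_metrics("http://localhost:5556/metrics")
    except OSError as exc:  # URLError is a subclass of OSError
        print(f"could not reach the exporter: {exc}")
    else:
        print(f"{len(metric_names(text))} distinct metric names exported")
```

An empty or unreachable page suggests the exporter still cannot reach the JMX port of the main container in the same pod.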

OneCricketeer commented 1 year ago

As I've written in other issues, these Helm Charts are no longer maintained.

If you want to set up monitoring on Confluent components with Prometheus and Grafana using CFK, that's available in this repo: https://github.com/confluentinc/confluent-kubernetes-examples/tree/master/monitoring/grafana-dashboard#monitoring