
[bitnami/kafka] Issue while enabling external connectivity for Kafka brokers by using Cluster IP option. (Brokers are exposed to different ports rather than assigned to a defined port) #25192

Open anuragwork1 opened 4 months ago

anuragwork1 commented 4 months ago

Name and Version

bitnami/kafka 3.7.0

What architecture are you using?

amd64

What steps will reproduce the bug?

Deployed Kafka with 3 brokers using the latest Bitnami chart (28.0.3), with the values below.

  values:
    global:
      storageClass: ceph-block
    listeners:
      client:
        containerPort: 9092
        protocol: PLAINTEXT
      controller:
        name: CONTROLLER
        containerPort: 9093
        protocol: PLAINTEXT
      interbroker:
        containerPort: 9094
        protocol: PLAINTEXT
      external:
        containerPort: 9095
        protocol: PLAINTEXT
    sasl:
      enabledMechanisms: PLAIN,SCRAM-SHA-256,SCRAM-SHA-512
      interBrokerMechanism: PLAIN
      controllerMechanism: PLAIN
    controller.persistence.size: 200Gi
    controller.logPersistence.size: 200Gi
    broker.persistence.size: 200Gi
    broker.logPersistence.size: 200Gi
    externalAccess:
      enabled: true
      autoDiscovery:
        enabled: false
      controller:
        service:
          type: ClusterIP
          domain: kafka.onprem.yusen-logistics.io
          ports:
            external: 9094
      broker:
        service:
          type: ClusterIP
          domain: kafka.onprem.yusen-logistics.io
          ports:
            external: 9094
    kraft:
      enabled: true
      clusterId: ""

What is the expected behavior?

Three Service objects were created, one per broker, as shown below. I therefore expected all three Kafka brokers to advertise the external listener on the same port (9094).

[Screenshot: the three per-broker external Service objects]

What do you see instead?

Per the values defined, the chart should create an external listener advertised on port 9094 on all 3 brokers, but instead the advertised listeners use different ports (9094, 9095, 9096).

[Screenshots: advertised external listener on ports 9094, 9095, and 9096 across the three brokers]

Because of this, when our application tries to connect to the Kafka brokers using the broker DNS records kafka.onprem.yusen-logistics.io:9094, kafka.onprem.yusen-logistics.io:9095, and kafka.onprem.yusen-logistics.io:9096, we get the error message below.

kafka.onprem.yusen-logistics.io:9096: Connect to ipv4#172.28.203.24:9096 failed: Unknown error (after 14746ms in state CONNECT)

Additional information

I have also configured an Nginx TCP proxy_pass for the brokers.

[Screenshot: Nginx TCP proxy configuration for the brokers]

KennedyNNNN commented 4 months ago

Hi @javsalgar, is there any update on this issue? Could you help expedite a resolution of https://github.com/bitnami/charts/issues/25192?

jotamartos commented 4 months ago

Hi,

I've been checking the configuration and saw that this is generated by the EXTERNAL_ACCESS_PORT_AUTOINCREMENT env var

https://github.com/bitnami/charts/blob/main/bitnami/kafka/templates/_helpers.tpl#L809

and the configure_external_access method in the configmap

https://github.com/bitnami/charts/blob/main/bitnami/kafka/templates/scripts-configmap.yaml#L150
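
For illustration, this is roughly the effect described above on the reporter's 3-broker ClusterIP setup with a base external port of 9094: each pod advertises the external listener on the base port plus its pod ordinal. (The pod names below are assumptions based on the default StatefulSet naming, not taken from the issue.)

# Illustrative sketch only: advertised external listener per pod when the
# port autoincrement is active (advertised port = configured external port + pod ordinal)
kafka-controller-0: EXTERNAL://kafka.onprem.yusen-logistics.io:9094
kafka-controller-1: EXTERNAL://kafka.onprem.yusen-logistics.io:9095
kafka-controller-2: EXTERNAL://kafka.onprem.yusen-logistics.io:9096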

Would you like to contribute? You can follow our contributing guidelines and the whole community will benefit from this change. Our team will be more than happy to review the changes.

In the meantime, I'm going to create a task on our side to evaluate the changes but I can't provide you with an ETA on when they will be ready.

zeeshan018 commented 3 months ago

@anuragwork1 @jotamartos

I'm facing the same issue, but I have only set up one Kafka broker. When I try to access it from outside the cluster, I get this error: "Disconnected node {-1}"

Please help if anyone has fixed this.

luischre commented 7 hours ago

While the proposed change might be feasible, I don't think it is the root cause here. Ever since external connectivity via ClusterIP was introduced (https://github.com/bitnami/charts/pull/11853), the behaviour has been like this: the external port is always incremented by 1 per broker.

I think the underlying problem is that the Kafka ports have changed in general. Back then the default external Kafka port was 9094; that is no longer the case. The external port is now 9095, but the corresponding part of the README was never updated, so following the example no longer works.
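
For reference, a rough sketch of the default port change as I understand it (key names follow the current listeners layout; please verify against the chart's own values.yaml):

# Older chart versions (what the README's ClusterIP example still assumes):
#   containerPorts:
#     external: 9094
# Current chart versions:
listeners:
  external:
    containerPort: 9095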

I suspect the ClusterIP example broke with this silent port change: https://github.com/bitnami/charts/commit/f1aec563c28f27f575849a756c33a7fb84266d40#diff-2c115953359b04aff3daf4aea03446107d3c67a799658a7585a5f4d9a490581cR239

In my case this surfaced when updating to release 23.0.7. Since I wanted to keep the external port on 9094, I adjusted the containerPorts instead. Please note that I am still a bit behind, so the exact keys might be different now. In our case it was not only the external port causing problems, but also the change of the internal port.

containerPorts:
  internal: 9093
  external: 9094
  controller: 9097

service:
  ports:
    internal: 9093
    external: 9094
    controller: 9097

Probably the easier way would be to follow the ClusterIP example and, wherever 9094 is mentioned, use 9095 (see the sketch after the quoted phrase below), but I haven't tried that yet. In any case I would argue that the examples in the README should be adjusted, and that the port change deserved a bigger disclaimer in the README than this innocent-looking phrase:

Updated the containerPorts and service.ports parameters to include the new controller port.
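
A minimal sketch of that adjustment, assuming the current default external containerPort of 9095 (domain and structure taken from the values in this issue; I haven't verified this end to end):

listeners:
  external:
    containerPort: 9095              # current chart default
externalAccess:
  enabled: true
  autoDiscovery:
    enabled: false
  controller:
    service:
      type: ClusterIP
      domain: kafka.onprem.yusen-logistics.io
      ports:
        external: 9095               # use 9095 wherever the README example says 9094
  broker:
    service:
      type: ClusterIP
      domain: kafka.onprem.yusen-logistics.io
      ports:
        external: 9095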

I assume this open issue is also related to the same change: https://github.com/bitnami/charts/issues/18419

Hopefully this might help some of you :)