strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Question] kafka connect worker accessing problem with name #3294

Closed tirelibirefe closed 4 years ago

tirelibirefe commented 4 years ago

Hello, I need your help again. I've just installed Kafka and Kafka Connect via Strimzi, but the Kafka Connect worker's REST API on port 8083 cannot be reached by either service name or IP. If I set rest.advertised.host.name in the Kafka Connect config, the operator logs:

WARN AbstractConfiguration:129 - Configuration option "rest.advertised.host.name" is forbidden and will be ignored

The problem is described below:

devadmin@vdi-mk2-ubn:~$ kubectl get pods -n kafka
NAME                                                    READY   STATUS    RESTARTS   AGE
custom-kafka-connect-cluster-connect-6595dd78fd-55r6z   1/1     Running   0          7m26s
kafka-beytepe-entity-operator-697fd977fd-7bwfn          3/3     Running   0          2d
kafka-beytepe-kafka-0                                   2/2     Running   1          2d
kafka-beytepe-kafka-1                                   2/2     Running   2          2d
kafka-beytepe-kafka-2                                   2/2     Running   0          2d
kafka-beytepe-kafka-exporter-7cb9697f7c-rx64w           1/1     Running   1          2d
kafka-beytepe-zookeeper-0                               1/1     Running   1          2d
kafka-beytepe-zookeeper-1                               1/1     Running   0          2d
kafka-beytepe-zookeeper-2                               1/1     Running   0          2d
prometheus-prometheus-0                                 3/3     Running   1          2d
strimzi-cluster-operator-54565f8c56-bgmdw               1/1     Running   0          2d
tmp-shell                                               1/1     Running   0          39m
devadmin@vdi-mk2-ubn:~$ kubectl get svc -n kafka
NAME                                       TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                               AGE
custom-kafka-connect-cluster-connect-api   ClusterIP   10.233.20.11    <none>        8083/TCP,9404/TCP                     7m42s
kafka-beytepe-kafka-bootstrap              ClusterIP   10.233.11.125   <none>        9091/TCP,9092/TCP,9093/TCP,9404/TCP   2d
kafka-beytepe-kafka-brokers                ClusterIP   None            <none>        9091/TCP,9092/TCP,9093/TCP            2d
kafka-beytepe-kafka-exporter               ClusterIP   10.233.19.150   <none>        9404/TCP                              2d
kafka-beytepe-zookeeper-client             ClusterIP   10.233.0.16     <none>        9404/TCP,2181/TCP                     2d
kafka-beytepe-zookeeper-nodes              ClusterIP   None            <none>        2181/TCP,2888/TCP,3888/TCP            2d
prometheus-operated                        ClusterIP   None            <none>        9090/TCP                              2d

devadmin@vdi-mk2-ubn:~$ kubectl exec -i kafka-beytepe-kafka-0 -n kafka -- curl -X GET http://custom-kafka-connect-cluster-connect-api:8083/connectors | jq
Defaulting container name to kafka.
Use 'kubectl describe pod/kafka-beytepe-kafka-0 -n kafka' to see all of the containers in this pod.
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
  0     0    0     0    0     0      0      0 --:--:--  0:02:10 --:--:--     0curl: (7) Failed connect to custom-kafka-connect-cluster-connect-api:8083; Connection timed out
command terminated with exit code 7

devadmin@vdi-mk2-ubn:~$ kubectl exec -it custom-kafka-connect-cluster-connect-6595dd78fd-55r6z -n kafka -- bash
[kafka@custom-kafka-connect-cluster-connect-6595dd78fd-55r6z kafka]$ curl -X GET http://custom-kafka-connect-cluster-connect-api:8083/connectors
["mssql-files-connector"]
[kafka@custom-kafka-connect-cluster-connect-6595dd78fd-55r6z kafka]$ exit
exit

devadmin@vdi-mk2-ubn:~$ kubectl exec -it kafka-beytepe-kafka-0 -n kafka -- bash
Defaulting container name to kafka.
Use 'kubectl describe pod/kafka-beytepe-kafka-0 -n kafka' to see all of the containers in this pod.
[kafka@kafka-beytepe-kafka-0 kafka]$ curl -X GET http://custom-kafka-connect-cluster-connect-api:8083/connectors
curl: (7) Failed connect to custom-kafka-connect-cluster-connect-api:8083; Connection timed out
[kafka@kafka-beytepe-kafka-0 kafka]$ curl -X GET http://10.233.20.11:8083/connectors
curl: (7) Failed connect to 10.233.20.11:8083; Connection timed out
[kafka@kafka-beytepe-kafka-0 kafka]$ exit
exit
command terminated with exit code 127

devadmin@vdi-mk2-ubn:~$ kubectl run tmp-shell -n kafka --rm -i --tty --image nicolaka/netshoot -- /bin/bash
If you don't see a command prompt, try pressing enter.
bash-5.0# ping custom-kafka-connect-cluster-connect-api
PING custom-kafka-connect-cluster-connect-api.kafka.svc.cluster.local (10.233.20.11) 56(84) bytes of data.
64 bytes from custom-kafka-connect-cluster-connect-api.kafka.svc.cluster.local (10.233.20.11): icmp_seq=1 ttl=64 time=0.048 ms
64 bytes from custom-kafka-connect-cluster-connect-api.kafka.svc.cluster.local (10.233.20.11): icmp_seq=2 ttl=64 time=0.093 ms
64 bytes from custom-kafka-connect-cluster-connect-api.kafka.svc.cluster.local (10.233.20.11): icmp_seq=3 ttl=64 time=0.087 ms
^C
--- custom-kafka-connect-cluster-connect-api.kafka.svc.cluster.local ping statistics ---
3 packets transmitted, 3 received, 0% packet loss, time 2018ms
rtt min/avg/max/mdev = 0.048/0.076/0.093/0.019 ms
bash-5.0# telnet custom-kafka-connect-cluster-connect-api 8083

telnet: can't connect to remote host (10.233.20.11): Operation timed out
bash-5.0#
bash-5.0#

I can't figure out what I'm doing wrong. Could you please advise?

scholzj commented 4 years ago

Do you have the Connector Operator enabled in your Kafka Connect? If so, did you add a network policy to allow access from your application?

tirelibirefe commented 4 years ago

I don't know what Connector Operator is... Sorry.

I use only Strimzi Cluster Operator just as before.

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: kafka-beytepe
spec:
  kafka:
    version: 2.5.0
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    readinessProbe:
      initialDelaySeconds: 15
      timeoutSeconds: 5
    livenessProbe:
...

and

apiVersion: kafka.strimzi.io/v1beta1
kind: KafkaConnect
metadata:
  name: custom-kafka-connect-cluster
  namespace: kafka
  labels:
    app: custom-kafka-connect-cluster
  annotations:
    strimzi.io/use-connector-resources: "true"
spec:
  version: 2.5.0
  image: harbor.dev1.kik.io/kafka/kc:v0.1
  replicas: 1
  bootstrapServers: kafka-beytepe-kafka-bootstrap:9093
  externalConfiguration:
...

scholzj commented 4 years ago

The Connector Operator is a process inside the Cluster Operator. It enables the use of the KafkaConnector resource to manage connectors in Connect. It is enabled by this annotation, which you have in your KafkaConnect resource:

  annotations:
    strimzi.io/use-connector-resources: "true"

As part of enabling it, some parts of the REST API should not be used (e.g. creating connectors, editing connectors, pausing connectors) since anything you do that way will be reverted by the operator. Because the connector operator needs access to the Connect API, it needs to set a network policy. If you want to use the connector operator and still also access the API, you will need to create your own policy to give your applications access to the REST API. If you want to use only the REST API, and not the KafkaConnector resources, you should just remove the annotation above.

For an example of what the network policy should look like, see https://github.com/strimzi/strimzi-kafka-operator/issues/3143. You will need to adjust it to your own names etc. Just make sure not to edit the existing policy created by the operator, and give your policy its own name.

tirelibirefe commented 4 years ago

...but my connector operator is enabled. My broker and connector pods and services are all in the same namespace, so they should not need a network policy to communicate with each other. #3143 is an example for ingress, and I am not trying to connect to Kafka Connect through an ingress.

scholzj commented 4 years ago

Ingress in the network policy means that it controls who can connect into the Kafka Connect pod, not where the Kafka Connect pod can connect to (that would be Egress). The other components also have network policies; for example, they would prevent you from accessing ZooKeeper as well (among other things, since ZooKeeper also has TLS authentication).

tirelibirefe commented 4 years ago

I've just noticed that the cluster operator / connector operator has already created a network policy, and it does not seem to contain any restriction:

devadmin@vdi-mk2-ubn:~$ kubectl get networkpolicy -n kafka
NAME                                     POD-SELECTOR                                                                                                                        AGE
custom-kafka-connect-cluster-connect     strimzi.io/cluster=custom-kafka-connect-cluster,strimzi.io/kind=KafkaConnect,strimzi.io/name=custom-kafka-connect-cluster-connect   159m
kafka-beytepe-network-policy-kafka       strimzi.io/name=kafka-beytepe-kafka                                                                                                 2d3h
kafka-beytepe-network-policy-zookeeper   strimzi.io/name=kafka-beytepe-zookeeper                                                                                             2d3h
devadmin@vdi-mk2-ubn:~$ kubectl describe networkpolicy custom-kafka-connect-cluster-connect -n kafka
Name:         custom-kafka-connect-cluster-connect
Namespace:    kafka
Created on:   2020-07-08 20:37:07 +0300 +03
Labels:       app=custom-kafka-connect-cluster
              app.kubernetes.io/instance=custom-kafka-connect-cluster
              app.kubernetes.io/managed-by=strimzi-cluster-operator
              app.kubernetes.io/name=kafka-connect
              app.kubernetes.io/part-of=strimzi-custom-kafka-connect-cluster
              strimzi.io/cluster=custom-kafka-connect-cluster
              strimzi.io/kind=KafkaConnect
              strimzi.io/name=strimzi
Annotations:  <none>
Spec:
  PodSelector:     strimzi.io/cluster=custom-kafka-connect-cluster,strimzi.io/kind=KafkaConnect,strimzi.io/name=custom-kafka-connect-cluster-connect
  Allowing ingress traffic:
    To Port: 8083/TCP
    From:
      PodSelector: strimzi.io/cluster=custom-kafka-connect-cluster,strimzi.io/kind=KafkaConnect,strimzi.io/name=custom-kafka-connect-cluster-connect
    From:
      NamespaceSelector: <none>
      PodSelector: strimzi.io/kind=cluster-operator
    ----------
    To Port: 9404/TCP
    From: <any> (traffic not restricted by source)
  Not affecting egress traffic
  Policy Types: Ingress
devadmin@vdi-mk2-ubn:~$ kubectl get pods -n kafka -l strimzi.io/cluster=custom-kafka-connect-cluster
NAME                                                    READY   STATUS    RESTARTS   AGE
custom-kafka-connect-cluster-connect-6595dd78fd-55r6z   1/1     Running   0          161m
devadmin@vdi-mk2-ubn:~$

scholzj commented 4 years ago

This is the restriction:

    To Port: 8083/TCP
    From:
      PodSelector: strimzi.io/cluster=custom-kafka-connect-cluster,strimzi.io/kind=KafkaConnect,strimzi.io/name=custom-kafka-connect-cluster-connect
    From:
      NamespaceSelector: <none>
      PodSelector: strimzi.io/kind=cluster-operator

This says that only pods matching the selector strimzi.io/kind=cluster-operator or strimzi.io/cluster=custom-kafka-connect-cluster,strimzi.io/kind=KafkaConnect,strimzi.io/name=custom-kafka-connect-cluster-connect can connect to port 8083. That is why you need to create your own network policy that also allows the pods which need access.
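
For illustration, an additional policy along those lines might look like the sketch below. The policy name custom-connect-api-access and the client label app: my-connect-client are hypothetical placeholders, not taken from this thread; the pod selector labels are copied from the kubectl describe output above. Network policies are additive, so this one would sit alongside the operator-managed policy rather than replace it.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  # Hypothetical name; it must differ from the operator-managed policy name
  name: custom-connect-api-access
  namespace: kafka
spec:
  # Selects the Kafka Connect pods (labels taken from the describe output)
  podSelector:
    matchLabels:
      strimzi.io/cluster: custom-kafka-connect-cluster
      strimzi.io/kind: KafkaConnect
      strimzi.io/name: custom-kafka-connect-cluster-connect
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Hypothetical label; replace with a label your client pods carry
        - podSelector:
            matchLabels:
              app: my-connect-client
      ports:
        # Kafka Connect REST API port
        - protocol: TCP
          port: 8083
```

A podSelector without a namespaceSelector only matches pods in the policy's own namespace, so clients in other namespaces would need a namespaceSelector as well.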

tirelibirefe commented 4 years ago

It worked! Thank you very much, @scholzj, you helped me a lot as usual! I really appreciate it.

hafizmujadidKhalid commented 1 year ago

Hey @scholzj! Sorry for bothering you so much these days as I keep getting stuck on different issues. Now I am running into a similar issue. When I deploy Kafka Connect, it fails with the following exception.

2023-03-29 19:34:46,297 INFO [Worker clientId=connect-1, groupId=my-connect] Successfully joined group with generation Generation{generationId=39658, memberId='connect-1-641a814a-0e5f-4389-b0c6-cbbcab847518', protocol='sessioned'} (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
2023-03-29 19:34:46,300 INFO [Worker clientId=connect-1, groupId=my-connect] Successfully synced group in generation Generation{generationId=39658, memberId='connect-1-641a814a-0e5f-4389-b0c6-cbbcab847518', protocol='sessioned'} (org.apache.kafka.connect.runtime.distributed.WorkerCoordinator) [DistributedHerder-connect-1-1]
2023-03-29 19:34:46,300 INFO [Worker clientId=connect-1, groupId=my-connect] Joined group at generation 39658 with protocol version 2 and got assignment: Assignment{error=0, leader='connect-1-8a3b58e7-217b-4414-9105-ab870c9937e6', leaderUrl='http://10.33.155.47:8083/', offset=1, connectorIds=[], taskIds=[s3-sink-connector-0], revokedConnectorIds=[], revokedTaskIds=[], delay=0} with rebalance delay: 0 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
2023-03-29 19:34:46,300 INFO [Worker clientId=connect-1, groupId=my-connect] Starting connectors and tasks using config offset 1 (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
2023-03-29 19:34:46,300 INFO [Worker clientId=connect-1, groupId=my-connect] Finished starting connectors and tasks (org.apache.kafka.connect.runtime.distributed.DistributedHerder) [DistributedHerder-connect-1-1]
2023-03-29 19:34:47,824 ERROR IO error forwarding REST request: (org.apache.kafka.connect.runtime.rest.RestClient) [qtp1239728853-20]
java.util.concurrent.ExecutionException: java.net.SocketTimeoutException: Connect Timeout

I even added a network policy with completely open ingress/egress rules. I am deploying on EKS inside a private subnet. Do you have any suggestions for me? Thanks.

scholzj commented 1 year ago

If you have multiple Kafka Connect clusters deployed, make sure each one uses its own topics and group ID. Otherwise they merge into one big cluster. That is all that comes to mind based on this snippet.
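
As a rough sketch of what that separation could look like, a second Connect cluster would carry its own group ID and its own three internal storage topics in spec.config. The resource name, bootstrap address, and topic names below are hypothetical placeholders, and the apiVersion is assumed to match a recent Strimzi release; only the four configuration keys are standard Kafka Connect distributed-worker settings.

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaConnect
metadata:
  # Hypothetical name for a second, independent Connect cluster
  name: my-second-connect
spec:
  replicas: 1
  # Hypothetical bootstrap address; use your own Kafka cluster's service
  bootstrapServers: my-cluster-kafka-bootstrap:9092
  config:
    # Must be unique per Connect cluster sharing the same Kafka cluster
    group.id: my-second-connect
    # Internal topics; each Connect cluster needs its own set
    offset.storage.topic: my-second-connect-offsets
    config.storage.topic: my-second-connect-configs
    status.storage.topic: my-second-connect-status
```

If two KafkaConnect resources share any of these values, their workers can join the same group and try to forward REST requests to each other, which is consistent with the "IO error forwarding REST request" timeout in the log above.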

hafizmujadidKhalid commented 1 year ago

Thanks, that was indeed the problem.