strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

Strimzi standalone topic operator not working - Openshift #2607

Closed soumochak83 closed 4 years ago

soumochak83 commented 4 years ago

I deployed the Strimzi Topic Operator on OpenShift using the following documentation: https://strimzi.io/docs/0.12.2/full.html#deploying-the-topic-operator-standalone-deploying. The Cluster Operator works fine, but the Topic Operator fails to run with the error shown below:

2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:java.library.path=/usr/java/packages/lib/amd64:/usr/lib64:/lib64:/lib:/usr/lib
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:java.io.tmpdir=/tmp
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:java.compiler=
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:os.name=Linux
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:os.arch=amd64
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:os.version=3.10.0-1062.12.1.el7.x86_64
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:user.name=?
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:user.home=?
2020-02-26 13:40:30 INFO ZooKeeper:100 - Client environment:user.dir=/opt/strimzi
2020-02-26 13:40:30 INFO ZooKeeper:442 - Initiating client connection, connectString=172.30.93.216:2181 sessionTimeout=500000 watcher=org.I0Itec.zkclient.ZkClient@74e374fc
2020-02-26 13:40:31 INFO ZkClient:936 - Waiting for keeper state SyncConnected
2020-02-26 13:40:31 INFO ClientCnxn:1025 - Opening socket connection to server soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181. Will not attempt to authenticate using SASL (unknown error)
2020-02-26 13:40:31 INFO ClientCnxn:879 - Socket connection established to soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181, initiating session
2020-02-26 13:40:31 WARN ClientCnxn:1164 - Session 0x0 for server soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer
at sun.nio.ch.FileDispatcherImpl.read0(Native Method) ~[?:1.8.0_242]
at sun.nio.ch.SocketDispatcher.read(SocketDispatcher.java:39) ~[?:1.8.0_242]
at sun.nio.ch.IOUtil.readIntoNativeBuffer(IOUtil.java:223) ~[?:1.8.0_242]
at sun.nio.ch.IOUtil.read(IOUtil.java:192) ~[?:1.8.0_242]
at sun.nio.ch.SocketChannelImpl.read(SocketChannelImpl.java:377) ~[?:1.8.0_242]
at org.apache.zookeeper.ClientCnxnSocketNIO.doIO(ClientCnxnSocketNIO.java:68) ~[org.apache.zookeeper.zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:366) ~[org.apache.zookeeper.zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141) [org.apache.zookeeper.zookeeper-3.4.14.jar:3.4.14-4c25d480e66aadd371de8bd2fd8da255ac140bcf]
2020-02-26 13:40:34 INFO ClientCnxn:1025 - Opening socket connection to server soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181. Will not attempt to authenticate using SASL (unknown error)
2020-02-26 13:40:34 INFO ClientCnxn:879 - Socket connection established to soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181, initiating session
2020-02-26 13:40:34 WARN ClientCnxn:1164 - Session 0x0 for server soumo-zookeeper-client.kube-system.svc.cluster.local/172.30.93.216:2181, unexpected error, closing socket connection and attempting reconnect
java.io.IOException: Connection reset by peer

scholzj commented 4 years ago

The standalone Topic Operator is not supposed to be used with a Kafka cluster deployed by the Cluster Operator. It is supposed to be used with a Kafka cluster that is not managed by Strimzi. If you want to use the Topic Operator with a Kafka cluster deployed by Strimzi, you have to deploy the Topic Operator as part of it. This is done in the entityOperator section of the Kafka custom resource:

# ...
entityOperator:
  topicOperator: {}
  userOperator: {}

See more in https://strimzi.io/docs/master/full.html#deploying-the-topic-operator-using-the-cluster-operator-deploying
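For context, here is a minimal sketch of where the entityOperator section sits inside a complete Kafka custom resource. It assumes the v1beta1 API and the listener layout used by Strimzi releases from the time of this thread; the file name, cluster name, and namespace are placeholders, not values from the thread.

```shell
# Sketch only: writes a minimal Kafka custom resource to a placeholder file name.
# Field layout assumes the kafka.strimzi.io/v1beta1 API of this era.
cat > kafka-with-entity-operator.yaml <<'EOF'
apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: my-cluster            # placeholder cluster name
spec:
  kafka:
    replicas: 3
    listeners:
      plain: {}
      tls: {}
    storage:
      type: ephemeral
  zookeeper:
    replicas: 3
    storage:
      type: ephemeral
  entityOperator:
    topicOperator: {}         # empty object: deploy with default settings
    userOperator: {}
EOF

# With cluster access, the resource would then be applied (placeholder namespace):
# kubectl apply -f kafka-with-entity-operator.yaml -n my-namespace
```

With this layout, the Cluster Operator creates the entity operator pod alongside the Kafka and ZooKeeper pods, so no separate standalone deployment is needed.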

soumochak83 commented 4 years ago

You're correct. Would you please share the config parameters that I need to put under topicOperator: {} inside the entityOperator section?

scholzj commented 4 years ago

You actually do not need to add anything. You can just leave it be as topicOperator: {}. Here is a list of all the available options you can use, but you can just leave them out: https://strimzi.io/docs/master/full.html#type-EntityTopicOperatorSpec-reference

soumochak83 commented 4 years ago

I deployed the Topic Operator using the Cluster Operator by applying the Kafka resource (examples/kafka/kafka-ephemeral.yaml) in a custom project (soumo). The Kafka and ZooKeeper containers come up fine, but the topic-operator container shows the error below:

[2020-02-26 20:15:17,179] WARN [s-ops-tool-0] The client is using resource type 'kafkatopics' with unstable version 'v1beta1'
[2020-02-26 20:15:17,416] WARN [2.30.0.1/...] Exec Failure: HTTP 404, Status: 404 - 404 page not found

java.net.ProtocolException: Expected HTTP 101 response but was '404 Not Found'
at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229) [com.squareup.okhttp3.okhttp-3.12.0.jar:?]
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196) [com.squareup.okhttp3.okhttp-3.12.0.jar:?]

NAME                                        READY   STATUS    RESTARTS   AGE
soumo-entity-operator-867d4765d-96lcw       2/3     Running   1          49s
soumo-kafka-0                               2/2     Running   0          1m
soumo-kafka-1                               2/2     Running   0          1m
soumo-kafka-2                               2/2     Running   0          1m
soumo-zookeeper-0                           2/2     Running   0          1m
soumo-zookeeper-1                           2/2     Running   0          1m
soumo-zookeeper-2                           2/2     Running   0          1m
strimzi-cluster-operator-6dc9c8cd5f-2jdjg   1/1     Running   0          7d

soumochak83 commented 4 years ago

However, in the kube-system project I see that Kafka, ZooKeeper and the entity operator are already running. Does that mean that if I deploy the Cluster Operator, the entity operator (including the Topic Operator and User Operator) gets deployed automatically?

[soumocha@inmbzp7171 kafka]$ sudo oc get pods -n kube-system
NAME                                           READY   STATUS    RESTARTS   AGE
master-api-inmbzp7171.in.dst.ibm.com           1/1     Running   9          20d
master-controllers-inmbzp7171.in.dst.ibm.com   1/1     Running   10         20d
master-etcd-inmbzp7171.in.dst.ibm.com          1/1     Running   9          20d
soumo-entity-operator-78d8c4c4d6-vh9nf         3/3     Running   0          7d
soumo-kafka-0                                  2/2     Running   0          7d
soumo-kafka-1                                  2/2     Running   0          7d
soumo-kafka-2                                  2/2     Running   0          7d
soumo-zookeeper-0                              2/2     Running   0          7d
soumo-zookeeper-1                              2/2     Running   0          7d
soumo-zookeeper-2                              2/2     Running   0          7d

scholzj commented 4 years ago

This:

[2020-02-26 20:15:17,179] WARN onUsageUtils:60 [s-ops-tool-0] The client is using resource type 'kafkatopics' with unstable version 'v1beta1'
[2020-02-26 20:15:17,416] WARN ctionManager:202 [2.30.0.1/...] Exec Failure: HTTP 404, Status: 404 - 404 page not found

java.net.ProtocolException: Expected HTTP 101 response but was '404 Not Found'
at okhttp3.internal.ws.RealWebSocket.checkResponse(RealWebSocket.java:229) [com.squareup.okhttp3.okhttp-3.12.0.jar:?]
at okhttp3.internal.ws.RealWebSocket$2.onResponse(RealWebSocket.java:196) [com.squareup.okhttp3.okhttp-3.12.0.jar:?]

suggests that the KafkaTopic CRD is not installed. Did you delete it by mistake when deleting the standalone Topic Operator? If you run kubectl get crd, can you see it there?

soumochak83 commented 4 years ago

I see that as below:

sudo kubectl get crd
NAME                                           CREATED AT
alertmanagers.monitoring.coreos.com            2020-02-06T13:06:44Z
apmservers.apm.k8s.elastic.co                  2020-02-17T14:17:01Z
bundlebindings.automationbroker.io             2020-02-06T13:10:44Z
bundleinstances.automationbroker.io            2020-02-06T13:10:45Z
bundles.automationbroker.io                    2020-02-06T13:10:52Z
catalogsources.operators.coreos.com            2020-02-18T11:02:33Z
clusterserviceversions.operators.coreos.com    2020-02-18T11:02:33Z
elasticsearches.elasticsearch.k8s.elastic.co   2020-02-17T14:17:01Z
installplans.operators.coreos.com              2020-02-18T11:02:33Z
kafkabridges.kafka.strimzi.io                  2020-02-19T11:54:41Z
kafkaconnectors.kafka.strimzi.io               2020-02-19T11:54:41Z
kafkaconnects.kafka.strimzi.io                 2020-02-19T11:54:41Z
kafkaconnects2is.kafka.strimzi.io              2020-02-19T11:54:41Z
kafkamirrormakers.kafka.strimzi.io             2020-02-19T11:54:41Z
kafkas.kafka.strimzi.io                        2020-02-19T11:54:41Z
kafkausers.kafka.strimzi.io                    2020-02-19T11:54:41Z
kibanas.kibana.k8s.elastic.co                  2020-02-17T14:17:01Z
operatorgroups.operators.coreos.com            2020-02-18T11:02:33Z
prometheuses.monitoring.coreos.com             2020-02-06T13:06:44Z
prometheusrules.monitoring.coreos.com          2020-02-06T13:06:44Z
servicemonitors.monitoring.coreos.com          2020-02-06T13:06:44Z
subscriptions.operators.coreos.com             2020-02-18T11:02:33Z

scholzj commented 4 years ago

Right, you are missing the KafkaTopic CRD. You need to install it. It is in the installation folder, in the file 043-Crd-kafkatopic.yaml. So just run kubectl apply -f on it.
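The check for a missing CRD can also be scripted. The sketch below uses a hypothetical helper, has_crd, that reads CRD names on stdin and reports whether a given one is present. The full CRD name kafkatopics.kafka.strimzi.io is an assumption inferred from the naming of the other Strimzi CRDs in the listing above.

```shell
# Hypothetical helper: reads CRD names on stdin, one per line, and
# succeeds only if the CRD named in $1 is among them.
has_crd() {
  # `kubectl get crd -o name` prefixes each name with a resource type and "/";
  # strip everything up to the last "/" before doing an exact, whole-line match.
  sed 's|.*/||' | grep -qxF -- "$1"
}

# Demo using three of the CRD names from the listing in this thread:
crds='kafkas.kafka.strimzi.io
kafkausers.kafka.strimzi.io
kafkaconnects.kafka.strimzi.io'

if printf '%s\n' "$crds" | has_crd kafkatopics.kafka.strimzi.io; then
  echo "KafkaTopic CRD present"
else
  echo "KafkaTopic CRD missing"
fi

# Against a live cluster (requires access), the same check would be:
# kubectl get crd -o name | has_crd kafkatopics.kafka.strimzi.io
```

The exact whole-line match avoids false positives from similarly named CRDs, such as kafkaconnects versus kafkaconnects2is.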

soumochak83 commented 4 years ago

Great, I no longer see any errors in the topic-operator container. My question is: will the entity operator container be created automatically when the Cluster Operator is deployed?

If yes, why do I need to deploy the Topic Operator again when I already have the entity operator container running?

scholzj commented 4 years ago

No, the entity operator container is deployed together with the Kafka custom resource. If you delete the custom resource, it will be deleted as well.

soumochak83 commented 4 years ago

A little confusing... Well, I have now cleaned up everything (Cluster Operator and Topic Operator) and am deploying the Cluster Operator from the beginning. I followed these steps:

  1. sudo sed -i 's/namespace: .*/namespace: soumo/' cluster-operator/*RoleBinding*.yaml
  2. sudo oc apply -f install/cluster-operator -n soumo
  3. sudo oc apply -f examples/templates/cluster-operator -n soumo
  4. sudo oc adm policy add-cluster-role-to-user strimzi-cluster-operator-namespaced --serviceaccount strimzi-cluster-operator -n soumo
  5. sudo oc adm policy add-cluster-role-to-user strimzi-entity-operator --serviceaccount strimzi-cluster-operator -n soumo
  6. sudo oc adm policy add-cluster-role-to-user strimzi-topic-operator --serviceaccount strimzi-cluster-operator -n soumo
  7. sudo oc apply -f install/cluster-operator -n soumo
  8. sudo oc apply -f examples/kafka/kafka-ephemeral.yaml -n soumo

After executing the 8 steps above, I see the following containers running:

[soumocha@inmbzp7171 strimzi-0.16.2]$ sudo oc get pods -n soumo
NAME                                        READY   STATUS    RESTARTS   AGE
soumo-entity-operator-867d4765d-4vl9s       3/3     Running   0          1m
soumo-kafka-0                               2/2     Running   0          1m
soumo-kafka-1                               2/2     Running   0          1m
soumo-kafka-2                               2/2     Running   0          1m
soumo-zookeeper-0                           2/2     Running   0          2m
soumo-zookeeper-1                           2/2     Running   0          2m
soumo-zookeeper-2                           2/2     Running   0          2m
strimzi-cluster-operator-6dc9c8cd5f-m8mzz   1/1     Running   0          5m

Now, should I re-run step 8 to deploy the Topic Operator, which is already deployed and running as seen above?

scholzj commented 4 years ago

Yes, and the entity operator was deployed as a result of step 8: oc apply -f examples/kafka/kafka-ephemeral.yaml.

soumochak83 commented 4 years ago

OK. So with that, can I conclude that my Topic Operator is ready now?

scholzj commented 4 years ago

Yes, it should be ready now.

soumochak83 commented 4 years ago

That was a great help in fixing my issue! Thanks a ton @scholzj

ohlen1 commented 5 months ago

The standalone Topic Operator is not supposed to be used with a Kafka cluster deployed by the Cluster Operator. It is supposed to be used with a Kafka cluster that is not managed by Strimzi. If you want to use the Topic Operator with a Kafka cluster deployed by Strimzi, you have to deploy the Topic Operator as part of it. This is done in the entityOperator section of the Kafka custom resource:

# ...
entityOperator:
  topicOperator: {}
  userOperator: {}

See more in https://strimzi.io/docs/master/full.html#deploying-the-topic-operator-using-the-cluster-operator-deploying

Hi,

Sorry for writing in a long-closed issue, but this is the first time I have heard that the standalone Topic and User Operators are not supported with a Kafka cluster created by Strimzi. If this is still true, I think it needs to be made clearer in the docs.

In our case, we were aiming to set up the Kafka cluster in K8S cluster A using the Strimzi Cluster Operator. In K8S cluster B, we then wanted to deploy the standalone Topic and User Operators, targeting the Kafka cluster in K8S cluster A. I don't see why the operators should care about, or even know, whether the target cluster is managed by Strimzi or not.

All the best!

scholzj commented 5 months ago

I suggest you start a new discussion and not comment on a closed issue that is over 4 years old.