confluentinc / cp-helm-charts

The Confluent Platform Helm charts enable you to deploy Confluent Platform services on Kubernetes for development, test, and proof of concept environments.
https://cnfl.io/getting-started-kafka-kubernetes
Apache License 2.0

a container name must be specified for pod my-confluent-cp-kafka-0, choose one of: [prometheus-jmx-exporter cp-kafka-broker] #451

Closed · little-snow-fox closed this issue 3 years ago

little-snow-fox commented 4 years ago

Execute:

helm install my-confluent confluentinc/cp-helm-charts  

Results:

default         my-confluent-cp-control-center-58b98c57d6-dvvr4    0/1     CrashLoopBackOff   4          3m12s
default         my-confluent-cp-kafka-0                            0/2     Pending            0          3m12s
default         my-confluent-cp-kafka-connect-7b7df9c96c-jd9ls     1/2     CrashLoopBackOff   4          3m12s
default         my-confluent-cp-kafka-rest-7f8f844548-kw8xf        2/2     Running            3          3m12s
default         my-confluent-cp-ksql-server-8468d798b5-jt4fn       1/2     CrashLoopBackOff   4          3m12s
default         my-confluent-cp-schema-registry-66496455db-lz5gj   1/2     CrashLoopBackOff   4          3m12s
default         my-confluent-cp-zookeeper-0                        0/2     Pending            0          3m12s

Log:

➜  helm git:(master) ✗ kubectl logs -f my-confluent-cp-kafka-rest-7f8f844548-kw8xf                                       
Error from server (BadRequest): a container name must be specified for pod my-confluent-cp-kafka-rest-7f8f844548-kw8xf, choose one of: [prometheus-jmx-exporter cp-kafka-rest-server]
➜  helm git:(master) ✗ kubectl logs -f my-confluent-cp-kafka-0                                           
Error from server (BadRequest): a container name must be specified for pod my-confluent-cp-kafka-0, choose one of: [prometheus-jmx-exporter cp-kafka-broker]
➜  helm git:(master) ✗ kubectl logs -f my-confluent-cp-zookeeper-0                       
Error from server (BadRequest): a container name must be specified for pod my-confluent-cp-zookeeper-0, choose one of: [prometheus-jmx-exporter cp-zookeeper-server]
➜  helm git:(master) ✗ kubectl logs -f my-confluent-cp-ksql-server-8468d798b5-jt4fn                           
Error from server (BadRequest): a container name must be specified for pod my-confluent-cp-ksql-server-8468d798b5-jt4fn, choose one of: [prometheus-jmx-exporter cp-ksql-server]

How can I make it work properly?

michaelswierszcz commented 4 years ago

kubectl logs my-confluent-cp-kafka-rest-7f8f844548-kw8xf [CONTAINER]

For example: kubectl logs my-confluent-cp-kafka-rest-7f8f844548-kw8xf prometheus-jmx-exporter or kubectl logs my-confluent-cp-kafka-rest-7f8f844548-kw8xf cp-kafka-rest-server
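
A minimal sketch of the same idea, using the pod and container names from this thread; the -c flag is equivalent to passing the container name as the positional argument:

# follow the logs of one container in a multi-container pod
kubectl logs -f my-confluent-cp-kafka-rest-7f8f844548-kw8xf -c cp-kafka-rest-server

# list the container names in a pod if you are unsure which one to pass
kubectl get pod my-confluent-cp-kafka-rest-7f8f844548-kw8xf -o jsonpath='{.spec.containers[*].name}'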

little-snow-fox commented 4 years ago

Log:

➜  kubectl logs -f my-confluent-cp-kafka-rest-7f8f844548-bjrpf   
Error from server (BadRequest): a container name must be specified for pod my-confluent-cp-kafka-rest-7f8f844548-bjrpf, choose one of: [prometheus-jmx-exporter cp-kafka-rest-server]

➜ kubectl logs -f my-confluent-cp-kafka-rest-7f8f844548-bjrpf prometheus-jmx-exporter  
VM settings:
    Max. Heap Size (Estimated): 26.67G
    Ergonomics Machine Class: server
    Using VM: OpenJDK 64-Bit Server VM

Jul 10, 2020 12:35:34 AM io.prometheus.jmx.JmxCollector collect
SEVERE: JMX scrape failed: java.io.IOException: Failed to retrieve RMIServer stub: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
    java.net.ConnectException: Connection refused (Connection refused)]
    at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:369)
    at javax.management.remote.JMXConnectorFactory.connect(JMXConnectorFactory.java:270)
    at io.prometheus.jmx.JmxScraper.doScrape(JmxScraper.java:94)
    at io.prometheus.jmx.JmxCollector.collect(JmxCollector.java:456)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.findNextElement(CollectorRegistry.java:183)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:216)
    at io.prometheus.client.CollectorRegistry$MetricFamilySamplesEnumeration.nextElement(CollectorRegistry.java:137)
    at io.prometheus.client.exporter.common.TextFormat.write004(TextFormat.java:22)
    at io.prometheus.client.exporter.HTTPServer$HTTPMetricHandler.handle(HTTPServer.java:59)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.AuthFilter.doFilter(AuthFilter.java:83)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:82)
    at sun.net.httpserver.ServerImpl$Exchange$LinkHandler.handle(ServerImpl.java:675)
    at com.sun.net.httpserver.Filter$Chain.doFilter(Filter.java:79)
    at sun.net.httpserver.ServerImpl$Exchange.run(ServerImpl.java:647)
    at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
    at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
    at java.lang.Thread.run(Thread.java:748)
Caused by: javax.naming.ServiceUnavailableException [Root exception is java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
    java.net.ConnectException: Connection refused (Connection refused)]
    at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:136)
    at com.sun.jndi.toolkit.url.GenericURLContext.lookup(GenericURLContext.java:205)
    at javax.naming.InitialContext.lookup(InitialContext.java:417)
    at javax.management.remote.rmi.RMIConnector.findRMIServerJNDI(RMIConnector.java:1955)
    at javax.management.remote.rmi.RMIConnector.findRMIServer(RMIConnector.java:1922)
    at javax.management.remote.rmi.RMIConnector.connect(RMIConnector.java:287)
    ... 17 more
Caused by: java.rmi.ConnectException: Connection refused to host: localhost; nested exception is: 
    java.net.ConnectException: Connection refused (Connection refused)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:619)
    at sun.rmi.transport.tcp.TCPChannel.createConnection(TCPChannel.java:216)
    at sun.rmi.transport.tcp.TCPChannel.newConnection(TCPChannel.java:202)
    at sun.rmi.server.UnicastRef.newCall(UnicastRef.java:338)
    at sun.rmi.registry.RegistryImpl_Stub.lookup(RegistryImpl_Stub.java:112)
    at com.sun.jndi.rmi.registry.RegistryContext.lookup(RegistryContext.java:132)
    ... 22 more
Caused by: java.net.ConnectException: Connection refused (Connection refused)
    at java.net.PlainSocketImpl.socketConnect(Native Method)
    at java.net.AbstractPlainSocketImpl.doConnect(AbstractPlainSocketImpl.java:350)
    at java.net.AbstractPlainSocketImpl.connectToAddress(AbstractPlainSocketImpl.java:206)
    at java.net.AbstractPlainSocketImpl.connect(AbstractPlainSocketImpl.java:188)
    at java.net.SocksSocketImpl.connect(SocksSocketImpl.java:392)
    at java.net.Socket.connect(Socket.java:589)
    at java.net.Socket.connect(Socket.java:538)
    at java.net.Socket.<init>(Socket.java:434)
    at java.net.Socket.<init>(Socket.java:211)
    at sun.rmi.transport.proxy.RMIDirectSocketFactory.createSocket(RMIDirectSocketFactory.java:40)
    at sun.rmi.transport.proxy.RMIMasterSocketFactory.createSocket(RMIMasterSocketFactory.java:148)
    at sun.rmi.transport.tcp.TCPEndpoint.newSocket(TCPEndpoint.java:613)
    ... 27 more

The server still has 10 GB of memory available.

root@deep-kubernetes:~# free -h
              total        used        free      shared  buff/cache   available
Mem:            51G         40G        633M         54M        9.6G         10G
Swap:            0B          0B          0B

wongelz commented 4 years ago

According to:

default         my-confluent-cp-zookeeper-0                        0/2     Pending            0          3m12s

ZooKeeper isn't running. Kafka depends on ZooKeeper, and everything else depends on Kafka, so the other pods will keep crashing until ZooKeeper and Kafka are up.
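
A Pending pod means the scheduler has not placed it yet, commonly because a PersistentVolumeClaim is unbound or no node has enough resources. A minimal way to see the actual reason, using the pod name from this thread:

# the Events section at the end explains why the pod is still Pending
kubectl describe pod my-confluent-cp-zookeeper-0

# claims stuck in Pending point to a storage problem rather than a Kafka problem
kubectl get pvc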

lmode commented 3 years ago

Hello, I have the same issue. However, I noticed that ZooKeeper and Kafka are down for the same reason: choose one of: [prometheus-jmx-exporter ...]. Any idea what the issue is? Thanks, regards.

little-snow-fox commented 3 years ago

I recommend that you check the PersistentVolume configuration.

OneCricketeer commented 3 years ago

down for the same reason: choose one of

That's not the reason. You cannot get logs for every container in the pod with that command; you need to choose one of the listed container names.

Recommend that you check PersistentVolume config.

Could you be more specific?
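
To make the PersistentVolume suggestion concrete, here is a minimal sketch of what such a check could look like, assuming the chart exposes persistence.enabled on the cp-kafka and cp-zookeeper subcharts (as current versions of cp-helm-charts do) and reusing the release name from this thread:

# a Pending kafka/zookeeper pod is often waiting on an unbound PersistentVolumeClaim
kubectl get pvc
kubectl get storageclass

# for development only: reinstall the release without persistent storage
helm upgrade --install my-confluent confluentinc/cp-helm-charts \
  --set cp-kafka.persistence.enabled=false \
  --set cp-zookeeper.persistence.enabled=false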