strimzi / strimzi-kafka-operator

Apache Kafka® running on Kubernetes
https://strimzi.io/
Apache License 2.0

[Question] Running Cruise Control UI #3117

Closed krystyw closed 4 years ago

krystyw commented 4 years ago

Hi,

I've started using Strimzi to deploy Kafka, and I decided to deploy Cruise Control along with it. Recently I wanted to use the Cruise Control UI (CCFE) as an easier (and prettier) way to check Kafka's status. Unfortunately, I can't figure out how to do that.

I saw your blog post about it, and in that version CCFE was deployed together with Cruise Control. Strimzi's version of the image doesn't include CCFE.

I've tried running CCFE locally on my machine and port-forwarding to the Cruise Control pod, but in that case CORS blocks the requests. The operator doesn't allow changing the Cruise Control config to permit CORS.

My question is: is there a way to use/deploy CCFE with strimzi's Cruise Control deployment?

scholzj commented 4 years ago

Our idea is for users to control Cruise Control through the KafkaRebalance resource, not through the UI or the REST API.

I'm not sure whether any of the people who worked on the Cruise Control support have tips or tricks for using it.

scholzj commented 4 years ago

CC: @ppatierno @tomncooper @kyguy

ppatierno commented 4 years ago

To be honest, I haven't tried it, but from what I can see in the CCFE GitHub repo, you can build a container image from a Dockerfile (https://github.com/linkedin/cruise-control-ui/wiki/CCFE-(Dev-Mode)---Docker) and use it in your own CCFE Deployment, running alongside the Strimzi Kafka cluster with CC. You would then configure CCFE to point to the <cluster-name>-cruise-control Kubernetes service, which exposes the CC REST API on port 9090. I am not a NodeJS expert, and the only place in the CCFE code where the target CC seems to be set is https://github.com/linkedin/cruise-control-ui/blob/master/config/index.js#L35 so that is where you should put the CC Kubernetes service. It's not a huge help, but at least it's a starting point for investigation :-)

ppatierno commented 4 years ago

Actually, I found the config.csv syntax page (https://github.com/linkedin/cruise-control-ui/wiki/config.csv-Syntax). If this config.csv is what configures CCFE, including the CC URL, you could put it in a ConfigMap and mount it as a volume in the CCFE Deployment. That would be the way to set and change the CCFE configuration when needed.

kyguy commented 4 years ago

Hey @krystyw!

If you are open to creating your own Kafka image you could also:

Add the following lines here [1]:

RUN curl -L https://github.com/linkedin/cruise-control-ui/releases/download/v0.3.4/cruise-control-ui-0.3.4.tar.gz -o /tmp/cruise-control-ui.tar.gz \
    && tar zxvf /tmp/cruise-control-ui.tar.gz -C ${CRUISE_CONTROL_HOME}

Build the Kafka image and push it to your K8s cluster (remember to reference your custom Kafka image from your Cluster Operator deployment)

Then in your Kafka resource:

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  ...
spec:
  cruiseControl:
    config:
      webserver.ui.diskpath: /opt/cruise-control/cruise-control-ui/dist/

Then port-forward the Cruise Control service:

kubectl port-forward svc/my-cluster-cruise-control 9090:9090

This will make the UI accessible via http://127.0.0.1:9090

[1] https://github.com/strimzi/strimzi-kafka-operator/blob/0.18.0/docker-images/kafka/Dockerfile#L43

krystyw commented 4 years ago

Thank you for the answers!

@scholzj Thank you for explaining the reasoning behind not including CCFE. Right now we are interested in easy status checking, not the rebalancing feature, which is why we thought about CCFE in the first place.

@ppatierno I think using a separate pod for CCFE would result in a CORS error from CC as well. I haven't tried it, but I'd guess it would behave the same as running CCFE locally. I'll try it though. I had already found the configuration file.

@kyguy I wanted to avoid building my own image, but I guess that would be a sure solution to the problem.

spaghettifunk commented 4 years ago

I just want to share the work I did based on @kyguy's answer. You can get your own image by building a multi-stage one on top of the Strimzi image. This is what we did:

# --------------- Builder stage ---------------
FROM centos:7 AS builder

ENV CC_VERSION=0.3.4
ENV CRUISE_CONTROL_HOME=/opt/cruise-control

RUN mkdir $CRUISE_CONTROL_HOME

# Install Cruise Control GUI Frontend
RUN curl -L https://github.com/linkedin/cruise-control-ui/releases/download/v${CC_VERSION}/cruise-control-ui-${CC_VERSION}.tar.gz \
    -o /tmp/cruise-control-ui.tar.gz && \
    tar zxvf /tmp/cruise-control-ui.tar.gz -C ${CRUISE_CONTROL_HOME}

# --------------- Final stage ---------------
FROM strimzi/kafka:0.18.0-kafka-2.5.0
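# The wildcard matches the extracted cruise-control-ui directory; COPY copies its
# contents (not the directory itself), so the UI lands in /opt/cruise-control/dist/
COPY --from=builder /opt/cruise-control/* /opt/cruise-control/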

We pushed this to our registry and then changed the Kafka object in two places:

apiVersion: kafka.strimzi.io/v1beta1
kind: Kafka
metadata:
  name: kafka
  namespace: kafka
spec:
  kafka:
    replicas: 3
    image: own-repository/kafka:2.5.0
    # ...
    # ... other settings here
    # ...
  cruiseControl:
    image: own-repository/kafka:2.5.0
    config:
      webserver.ui.diskpath: /opt/cruise-control/dist/

The last touch was to create a new Service that exposes a load balancer. We needed this because we wanted the UI to be available within the company. This is the result:

apiVersion: v1
kind: Service
metadata:
  annotations:
    service.beta.kubernetes.io/aws-load-balancer-internal: "0.0.0.0/0"
    service.beta.kubernetes.io/aws-load-balancer-ssl-ports: "443"
    service.beta.kubernetes.io/aws-load-balancer-ssl-cert: "arn:aws:acm:eu-west-1:xxxx:certificate/xxxxx"
    service.beta.kubernetes.io/aws-load-balancer-backend-protocol: "http"
    external-dns.alpha.kubernetes.io/hostname: kafka-cruise-control.my-domain.com.
    external-dns.alpha.kubernetes.io/ttl: "60"
  labels:
    strimzi.io/cluster: kafka
    strimzi.io/kind: Kafka
    strimzi.io/name: kafka-cruise-control
  name: cruise-control-lb
  namespace: kafka
spec:
  ports:
    - name: http-9090
      port: 443
      protocol: TCP
      targetPort: 9090
  type: LoadBalancer
  selector:
    strimzi.io/cluster: kafka
    strimzi.io/kind: Kafka
    strimzi.io/name: kafka-cruise-control

This Kafka operator is absolutely amazing! Thanks for the hard work 🚀

superbeer commented 4 years ago

Extending the comment above:

    config:
      webserver.ui.diskpath: /opt/cruise-control/dist/
      webserver.ui.urlprefix: /*

Ultrafenrir commented 3 years ago

Hello, is there any chance that you will provide the ability to deploy CCFE with the Strimzi CRD?

kyguy commented 3 years ago

Hey @Ultrafenrir, at this time we don't have any plans to deploy CCFE with the Strimzi CRD. However, with the CORS configuration now enabled, it should be a lot easier to run CCFE in a separate pod. From what I remember, a couple of users on the Slack channel were able to get this working without building custom Kafka images.
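
For readers wondering what the CORS configuration mentioned above could look like, here is a minimal sketch of the relevant fragment of the Kafka resource. The webserver.http.cors.* keys are Cruise Control settings; the origin value is only a placeholder assumption and should be narrowed to wherever CCFE is actually served from.

```yaml
# Sketch only: fragment of a Kafka resource, all other fields omitted.
spec:
  cruiseControl:
    config:
      # Let browsers call the Cruise Control REST API from another origin.
      webserver.http.cors.enabled: true
      # Placeholder assumption: "*" allows any origin; prefer the real CCFE URL.
      webserver.http.cors.origin: "*"
      # Response headers the browser is allowed to read.
      webserver.http.cors.exposeheaders: "User-Task-ID,Content-Type"
```

Whether these keys are accepted depends on the Strimzi version in use, so it is worth checking the documentation for your release before relying on this.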

Ultrafenrir commented 3 years ago

Maybe someone else has faced this issue too: I cannot expose the Cruise Control port 9090 via a Kubernetes Ingress or Service. Everything works perfectly when I use port-forward, but as soon as I try a ClusterIP/NodePort Service or an Ingress, all connections fail. I can see that the endpoint is present, and if I try to reach it from inside the cruise-control pod everything is fine. But the Ingress, for example, cannot reach the cruise-control pod IP address on port 9090. Any help is much appreciated.

kyguy commented 3 years ago

Hey @Ultrafenrir, by default, Cruise Control can only be accessed externally by the Strimzi operator. You will need to create another NetworkPolicy that allows Cruise Control to be accessed by other components/services. That should do the trick!
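
As a rough illustration of such an additional NetworkPolicy, the sketch below allows traffic to port 9090 of the Cruise Control pod. It assumes a cluster named my-cluster in the kafka namespace and a client pod labelled app: cruise-control-ui; the policy name and that client label are hypothetical, and only the strimzi.io/name label pattern is taken from the Service selector shown earlier in this thread.

```yaml
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: cruise-control-ui-access        # hypothetical name
  namespace: kafka                      # assumed namespace of the Kafka cluster
spec:
  # Apply the policy to the Cruise Control pod of the cluster "my-cluster".
  podSelector:
    matchLabels:
      strimzi.io/name: my-cluster-cruise-control
  policyTypes:
    - Ingress
  ingress:
    - from:
        # Hypothetical label of the pod that needs access (for example CCFE).
        - podSelector:
            matchLabels:
              app: cruise-control-ui
      ports:
        - protocol: TCP
          port: 9090
```

NetworkPolicies are additive, so this rule is applied alongside the one generated by the operator. If the client is an ingress controller in another namespace, the podSelector under from would need to be replaced or combined with a namespaceSelector matching that namespace.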

xgengsjc2021 commented 2 years ago

@kyguy Hi, I tried to build my own Kafka image based on your suggestion, but it failed because it cannot find the strimzi/base image.

docker build -t kafka:1.0 .
[+] Building 1.9s (4/4) FINISHED
 => [internal] load build definition from Dockerfile  0.1s
 => => transferring dockerfile: 5.12kB  0.0s
 => [internal] load .dockerignore  0.1s
 => => transferring context: 2B  0.0s
 => ERROR [internal] load metadata for docker.io/strimzi/base:latest  1.7s
 => [auth] strimzi/base:pull token for registry-1.docker.io  0.0s
------
 > [internal] load metadata for docker.io/strimzi/base:latest:
------
failed to solve with frontend dockerfile.v0: failed to create LLB definition: pull access denied, repository does not exist or may require authorization: server message: insufficient_scope: authorization failed

The Dockerfile I am using is this one: https://github.com/strimzi/strimzi-kafka-operator/blob/main/docker-images/kafka-based/kafka/Dockerfile

kyguy commented 2 years ago

Hi @xgengsjc2021, it will be easier to build on top of an existing Kafka image using the example Dockerfile provided by spaghettifunk in the comment above [1]. I don't think the strimzi/base image is available in any public registry; it would need to be built locally to be used.

[1] https://github.com/strimzi/strimzi-kafka-operator/issues/3117#issuecomment-638688246

xgengsjc2021 commented 2 years ago

@kyguy I ran into another problem when I tried port-forward:

This page isn’t working
127.0.0.1 sent an invalid response.
ERR_INVALID_HTTP_RESPONSE

I created the Docker image with the Cruise Control UI without problems. The Kafka cluster with ZooKeeper is also fine. For the network policy, I also added a new one that allows all ingress traffic to the pod my-kafka-crd-cruise-control.

  ingress:
    - ports:
        - protocol: TCP
          port: 9090

At this moment, I just want to use the existing service (via port-forward) to access the UI, so I kept the original service unchanged and did not create an Ingress or LB.

kubectl -n kafka get svc
NAME                          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)    AGE
my-kafka-crd-cruise-control   ClusterIP   172.20.146.213   <none>        9090/TCP   3h47m

Then I ran port-forward and got the error above.

From the log of the pod my-kafka-crd-cruise-control-5f59978676-cm7hj:

2022-05-25 18:26:15 INFO  CruiseControlPublicAccessLogger:62 - 127.0.0.1 - user [25/May/2022:18:26:15 +0000] "GET /kafkacruisecontrol/state HTTP/1.1" 200 1555
2022-05-25 18:26:19 INFO  UserTaskManager:305 - Expiring the session associated with SessionKey{_httpSession=Session@119285b{id=node011supqjrrmif21djp9oc6eohc7357,x=node011supqjrrmif21djp9oc6eohc7357.node0,req=0,res=true},_requestUrl=GET /kafkacruisecontrol/state,_queryParams={}}.
2022-05-25 18:26:19 INFO  UserTaskManager:305 - Expiring the session associated with SessionKey{_httpSession=Session@53b609c2{id=node0kadw6ebnpbhm18xs31ta50a1b356,x=node0kadw6ebnpbhm18xs31ta50a1b356.node0,req=0,res=true},_requestUrl=GET /kafkacruisecontrol/state,_queryParams={}}.
2022-05-25 18:26:19 INFO  UserTaskManager:353 - UserTask 5f803f83-f02e-47a1-a2d3-337e9e531e5e is completed and removed from active tasks list
2022-05-25 18:26:19 INFO  UserTaskManager:353 - UserTask c93f5c15-0187-4acc-8916-cec0fc13a411 is completed and removed from active tasks list
2022-05-25 18:26:19 INFO  operationLogger:740 - Task [5f803f83-f02e-47a1-a2d3-337e9e531e5e] calculation finishes, result:
MonitorState: {state: RUNNING(20.000% trained), NumValidWindows: (1/1) (100.000%), NumValidPartitions: 117/117 (100.000%), flawedPartitions: 0}
ExecutorState: {state: NO_TASK_IN_PROGRESS}
AnalyzerState: {isProposalReady: true, readyGoals: [NetworkInboundUsageDistributionGoal, PreferredLeaderElectionGoal, CpuUsageDistributionGoal, PotentialNwOutGoal, LeaderReplicaDistributionGoal, NetworkInboundCapacityGoal, LeaderBytesInDistributionGoal, DiskCapacityGoal, ReplicaDistributionGoal, RackAwareGoal, MinTopicLeadersPerBrokerGoal, TopicReplicaDistributionGoal, NetworkOutboundCapacityGoal, CpuCapacityGoal, DiskUsageDistributionGoal, NetworkOutboundUsageDistributionGoal, ReplicaCapacityGoal]}
AnomalyDetectorState: {selfHealingEnabled:[], selfHealingDisabled:[DISK_FAILURE, BROKER_FAILURE, GOAL_VIOLATION, METRIC_ANOMALY, TOPIC_ANOMALY, MAINTENANCE_EVENT], selfHealingEnabledRatio:{DISK_FAILURE=0.0, BROKER_FAILURE=0.0, GOAL_VIOLATION=0.0, METRIC_ANOMALY=0.0, TOPIC_ANOMALY=0.0, MAINTENANCE_EVENT=0.0}, recentGoalViolations:[], recentBrokerFailures:[], recentMetricAnomalies:[], recentDiskFailures:[], recentTopicAnomalies:[], recentMaintenanceEvents:[], metrics:{meanTimeBetweenAnomalies:{GOAL_VIOLATION:0.00 milliseconds, BROKER_FAILURE:0.00 milliseconds, METRIC_ANOMALY:0.00 milliseconds, DISK_FAILURE:0.00 milliseconds, TOPIC_ANOMALY:0.00 milliseconds}, meanTimeToStartFix:0.00 milliseconds, numSelfHealingStarted:0, numSelfHealingFailedToStart:0, ongoingAnomalyDuration=0.00 milliseconds}, ongoingSelfHealingAnomaly:None, balancednessScore:100.000}

2022-05-25 18:26:19 INFO  operationLogger:740 - Task [c93f5c15-0187-4acc-8916-cec0fc13a411] calculation finishes, result:
MonitorState: {state: RUNNING(20.000% trained), NumValidWindows: (1/1) (100.000%), NumValidPartitions: 117/117 (100.000%), flawedPartitions: 0}

kyguy commented 2 years ago

Hi @xgengsjc2021, hit me up on the Slack channel [1] and we can hack this together!

[1] https://cloud-native.slack.com/ssb/

kyguy commented 1 year ago

FYI: I have written up a blog post on the topic to help people get started [1]

[1] https://strimzi.io/blog/2023/01/11/hacking-for-cruise-control-ui/

domenicbove commented 9 months ago

Hi @kyguy just read your blog! Thanks for that!

Any chance strimzi will just bundle the cruisecontrol ui with the kafka image? It looks like ZooKeeper and Cruise Control (without the UI) are already in there. This would simplify my operations!

kyguy commented 9 months ago

Any chance strimzi will just bundle the cruisecontrol ui with the kafka image?

Hi @domenicbove, there are no plans for that at this time. As of right now, we don't have the capacity or expertise to maintain it within the Strimzi project. Maybe someone could provide a separate image and install files following METHOD 2 of the blog post [1].

[1] https://strimzi.io/blog/2023/01/11/hacking-for-cruise-control-ui/

andreyolv commented 5 months ago

I get the same error as @xgengsjc2021 when I port-forward:

This page isn’t working
127.0.0.1 sent an invalid response.
ERR_INVALID_HTTP_RESPONSE

I followed method 1 from the blog post and, when deploying the custom Cruise Control pod, I set generateNetworkPolicy: false in the Helm chart values.

Any guidance?

goolzerg commented 3 months ago

Same issue as @andreyolv.