robusta-dev / robusta

Better Prometheus alerts for Kubernetes - smart grouping, AI enrichment, and automatic remediation
https://home.robusta.dev/
MIT License
2.61k stars, 254 forks

Kafka sink doesn't work #801

Open zhangiicc opened 1 year ago

zhangiicc commented 1 year ago

Describe the bug: the Kafka sink doesn't work.

To Reproduce Steps to reproduce the behavior:

cat generated_values.yaml
globalConfig:
  signing_key: xxx
  account_id: xxx
sinksConfig:
- kafka_sink:
    name: kafka_sink
    kafka_url: "10.227.102.28:9092,10.227.102.29:9092,10.227.102.30:9092"
    topic: "topic-xxx"
- slack_sink:
    name: main_slack_sink
    slack_channel: robusta
    api_key: xxxx
enablePlatformPlaybooks: true
runner:
  sendAdditionalTelemetry: false
rsa:
   xxxxxx

helm install robusta robusta/robusta -f ./generated_values.yaml -n robusta --create-namespace --set clusterName=jq-test

kubectl get po -n robusta
NAME                                READY   STATUS    RESTARTS   AGE
robusta-forwarder-999548cdb-gkck8   1/1     Running   0          44m
robusta-runner-56d664ffdf-ctwxt     1/1     Running   0          44m

Expected behavior: logs of robusta-runner

2023-03-30 06:46:33.090 INFO     logger initialized using INFO log level
2023-03-30 06:46:33.090 INFO     Creating hikaru monkey patches
2023-03-30 06:46:33.090 INFO     Creating yaml monkey patch
2023-03-30 06:46:33.090 INFO     Creating kubernetes ContainerImage monkey patch
2023-03-30 06:46:33.091 INFO     watching dir /etc/robusta/playbooks/ for custom playbooks changes
2023-03-30 06:46:33.094 INFO     watching dir /etc/robusta/config/active_playbooks.yaml for custom playbooks changes
2023-03-30 06:46:33.095 INFO     Reloading playbook packages due to change on initialization
2023-03-30 06:46:33.096 INFO     loading config /etc/robusta/config/active_playbooks.yaml
2023-03-30 06:46:33.241 INFO     Unknown cluster discovered.
2023-03-30 06:46:33.243 INFO     No custom playbooks defined at /etc/robusta/playbooks/storage
2023-03-30 06:46:33.243 INFO     Importing actions package robusta.core.playbooks.internal
2023-03-30 06:46:33.244 INFO     importing actions from robusta.core.playbooks.internal.discovery_events
2023-03-30 06:46:35.840 INFO     Importing actions package robusta_playbooks
2023-03-30 06:46:35.841 INFO     importing actions from robusta_playbooks.alerts_integration
2023-03-30 06:46:35.953 INFO     importing actions from robusta_playbooks.argo_cd
2023-03-30 06:46:35.991 INFO     importing actions from robusta_playbooks.autoscaler
2023-03-30 06:46:36.021 INFO     importing actions from robusta_playbooks.babysitter
2023-03-30 06:46:36.043 INFO     importing actions from robusta_playbooks.bash_enrichments
2023-03-30 06:46:36.047 INFO     importing actions from robusta_playbooks.chaos_engineering
2023-03-30 06:46:36.049 INFO     importing actions from robusta_playbooks.common_actions
2023-03-30 06:46:36.078 INFO     importing actions from robusta_playbooks.configuration_ab_testing
2023-03-30 06:46:36.109 INFO     importing actions from robusta_playbooks.cpu_throttling
2023-03-30 06:46:36.110 INFO     importing actions from robusta_playbooks.daemonsets
2023-03-30 06:46:36.111 INFO     importing actions from robusta_playbooks.deployment_enrichments
2023-03-30 06:46:36.112 INFO     importing actions from robusta_playbooks.deployment_status_report
2023-03-30 06:46:36.134 INFO     importing actions from robusta_playbooks.disk_benchmark
2023-03-30 06:46:36.154 INFO     importing actions from robusta_playbooks.event_enrichments
2023-03-30 06:46:36.170 INFO     importing actions from robusta_playbooks.git_change_audit
2023-03-30 06:46:36.199 INFO     importing actions from robusta_playbooks.grafana_enrichment
2023-03-30 06:46:36.200 INFO     importing actions from robusta_playbooks.http_actions
2023-03-30 06:46:36.223 INFO     importing actions from robusta_playbooks.image_pull_backoff_enricher
2023-03-30 06:46:36.224 INFO     importing actions from robusta_playbooks.java_pod_troubleshooting
2023-03-30 06:46:36.241 INFO     importing actions from robusta_playbooks.job_actions
2023-03-30 06:46:36.284 INFO     importing actions from robusta_playbooks.job_restart_on_oomkilled_community
2023-03-30 06:46:36.314 INFO     importing actions from robusta_playbooks.k8s_resource_enrichments
2023-03-30 06:46:36.339 INFO     importing actions from robusta_playbooks.networking
2023-03-30 06:46:36.355 INFO     importing actions from robusta_playbooks.node_cpu_analysis
2023-03-30 06:46:36.356 INFO     importing actions from robusta_playbooks.node_disk_analysis
2023-03-30 06:46:36.365 INFO     importing actions from robusta_playbooks.node_enrichments
2023-03-30 06:46:36.367 INFO     importing actions from robusta_playbooks.oom_killer
2023-03-30 06:46:36.384 INFO     importing actions from robusta_playbooks.overcommit_enrichments
2023-03-30 06:46:36.385 INFO     importing actions from robusta_playbooks.persistent_data
2023-03-30 06:46:36.388 INFO     importing actions from robusta_playbooks.persistent_volume_actions
2023-03-30 06:46:36.389 INFO     importing actions from robusta_playbooks.pod_actions
2023-03-30 06:46:36.389 INFO     importing actions from robusta_playbooks.pod_enrichments
2023-03-30 06:46:36.390 INFO     importing actions from robusta_playbooks.pod_troubleshooting
2023-03-30 06:46:36.476 INFO     importing actions from robusta_playbooks.prometheus_enrichments
2023-03-30 06:46:36.477 INFO     importing actions from robusta_playbooks.prometheus_simulation
2023-03-30 06:46:36.516 INFO     importing actions from robusta_playbooks.pvc_snapshots
2023-03-30 06:46:36.535 INFO     importing actions from robusta_playbooks.restart_loop_reporter
2023-03-30 06:46:36.567 INFO     importing actions from robusta_playbooks.silence
2023-03-30 06:46:36.741 INFO     importing actions from robusta_playbooks.simple_examples
2023-03-30 06:46:36.755 INFO     importing actions from robusta_playbooks.stress_tests
2023-03-30 06:46:36.773 INFO     importing actions from robusta_playbooks.targetdown_enrichment
2023-03-30 06:46:36.774 INFO     Adding <class 'robusta.core.sinks.webhook.webhook_sink_params.WebhookSinkConfigWrapper'> sink named webhook_sink
2023-03-30 06:46:36.774 INFO     Adding <class 'robusta.core.sinks.kafka.kafka_sink_params.KafkaSinkConfigWrapper'> sink named kafka_sink
2023-03-30 06:46:36.776 INFO     <BrokerConnection node_id=bootstrap-1 host=10.227.102.28:9092 <connecting> [IPv4 ('10.227.102.28', 9092)]>: connecting to 10.227.102.28:9092 [('10.227.102.28', 9092) IPv4]
2023-03-30 06:46:36.776 INFO     Probing node bootstrap-1 broker version
2023-03-30 06:46:36.777 INFO     <BrokerConnection node_id=bootstrap-1 host=10.227.102.28:9092 <connecting> [IPv4 ('10.227.102.28', 9092)]>: Connection complete.
2023-03-30 06:46:36.887 INFO     Broker version identified as 2.3.0
2023-03-30 06:46:36.887 INFO     Set configuration api_version=(2, 3, 0) to skip auto check_version requests on startup
2023-03-30 06:46:36.894 INFO     Adding <class 'robusta.core.sinks.slack.slack_sink_params.SlackSinkConfigWrapper'> sink named main_slack_sink
2023-03-30 06:46:37.299 INFO     Adding <class 'robusta.core.sinks.robusta.robusta_sink_params.RobustaSinkConfigWrapper'> sink named robusta_ui_sink
2023-03-30 06:46:37.635 INFO     Supabase dal login
2023-03-30 06:46:38.786 INFO     cluster status {'account_id': '260b720b-7324-4f5f-9b79-fd4d98326f50', 'cluster_id': 'jq-test', 'version': '0.10.14', 'light_actions': 30, 'ttl_hours': 4380, 'stats': {'deployments': 10, 'statefulsets': 2, 'daemonsets': 3, 'replicasets': 10, 'pods': 16, 'nodes': 1, 'jobs': 0, 'provider': 'Unknown'}, 'updated_at': 'now()'}
2023-03-30 06:46:39.779 INFO     Initializing TopServiceResolver
2023-03-30 06:46:42.011 INFO     Cluster discovery initialized
2023-03-30 06:46:42.092 INFO     created jobs states configmap scheduled-jobs robusta
2023-03-30 06:46:42.118 INFO     Loading RSA keys from /etc/robusta/auth
2023-03-30 06:46:42.122 INFO     Loaded private key file /etc/robusta/auth/prv
2023-03-30 06:46:42.123 INFO     Loaded public key file /etc/robusta/auth/pub
2023-03-30 06:46:42.123 INFO     starting relay receiver
2023-03-30 06:46:42.125 INFO     Initialized task queue: 20 workers. Max size 500
2023-03-30 06:46:42.194 INFO     Initialized task queue: 20 workers. Max size 500
 * Serving Flask app 'robusta.runner.web'
 * Debug mode: off
2023-03-30 06:46:42.596 INFO     connecting to server as account_id=260b720b-7324-4f5f-9b79-fd4d98326f50; cluster_name=jq-test
2023-03-30 06:46:43.051 INFO     Cluster already has historical data, No history pulled.
2023-03-30 06:46:43.201 INFO     cluster status {'account_id': '260b720b-7324-4f5f-9b79-fd4d98326f50', 'cluster_id': 'jq-test', 'version': '0.10.14', 'light_actions': 30, 'ttl_hours': 4380, 'stats': {'deployments': 10, 'statefulsets': 2, 'daemonsets': 3, 'replicasets': 10, 'pods': 16, 'nodes': 1, 'jobs': 0, 'provider': 'Unknown'}, 'updated_at': 'now()'}
2023-03-30 06:46:43.671 ERROR    could not fetch logs from container: crashpod. logs were None
2023-03-30 06:46:43.851 WARNING  Kafka sink unsupported enrichment types. Currently only diff/json enrichment is supported
2023-03-30 06:46:44.496 INFO     Initializing services cache
2023-03-30 06:46:45.415 INFO     Initializing nodes cache
2023-03-30 06:46:47.244 INFO     Initializing jobs cache
2023-03-30 06:46:48.175 INFO     Initializing namespaces cache
2023-03-30 06:51:36.959 INFO     <BrokerConnection node_id=2 host=10.227.102.30:9092 <connecting> [IPv4 ('10.227.102.30', 9092)]>: connecting to 10.227.102.30:9092 [('10.227.102.30', 9092) IPv4]
2023-03-30 06:51:36.971 INFO     <BrokerConnection node_id=2 host=10.227.102.30:9092 <connecting> [IPv4 ('10.227.102.30', 9092)]>: Connection complete.
2023-03-30 06:51:36.972 INFO     <BrokerConnection node_id=bootstrap-1 host=10.227.102.28:9092 <connected> [IPv4 ('10.227.102.28', 9092)]>: Closing connection.

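The runner log above is mostly INFO-level startup output, so the two lines that matter here (the ERROR about `crashpod` and the WARNING from the Kafka sink) are easy to miss. A minimal sketch for pulling out the non-INFO lines, assuming the `timestamp LEVEL message` layout shown in this log; the name `non_info_lines` is hypothetical and not part of Robusta:

```python
import re
from typing import Iterable, List, Tuple

# Matches lines such as:
# 2023-03-30 06:46:43.851 WARNING  Kafka sink unsupported enrichment types. ...
# The layout is assumed from the log excerpt in this issue.
LOG_LINE = re.compile(
    r"^(\d{4}-\d{2}-\d{2} \d{2}:\d{2}:\d{2}\.\d{3}) (\w+)\s+(.*)$"
)


def non_info_lines(lines: Iterable[str]) -> List[Tuple[str, str, str]]:
    """Return (timestamp, level, message) for every parsed line that is not INFO.

    Lines that do not match the assumed layout (for example the Flask
    banner lines starting with " * ") are skipped.
    """
    result = []
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match and match.group(2) != "INFO":
            result.append(match.groups())
    return result
```

The log text could be obtained with `kubectl logs` on the runner pod and passed in line by line. On the excerpt above this would surface the Kafka sink warning that only diff/json enrichments are supported.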
Screenshots: slack_sink is working [image]

Additional context

kubectl version
Client Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:28:09Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
Server Version: version.Info{Major:"1", Minor:"20", GitVersion:"v1.20.2", GitCommit:"faecb196815e248d3ecfb03c680a4507229c2a56", GitTreeState:"clean", BuildDate:"2021-01-13T13:20:00Z", GoVersion:"go1.15.5", Compiler:"gc", Platform:"linux/amd64"}
arikalon1 commented 1 year ago

Thanks for reporting it, @zhangiicc.

Can you please describe what kind of data you would like to get on Kafka, and how you would use it?

zhangiicc commented 1 year ago

I want to get messages containing:

pod name 
pod namespace
Source
waiting reason
termination reason
Container logs

but the Kafka sink doesn't work.

arikalon1 commented 1 year ago

Thank you @zhangiicc

The Kafka sink was only used to send custom JSON notifications. We'll add the option to send any notification to Kafka, as with the other sinks.

Would you like to open a PR for this change?
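
For anyone picking this up, here is a rough sketch of the kind of JSON message requested earlier in this thread. It is not Robusta's implementation: the `PodCrashNotification` class, its field names, and the `publish` helper are all hypothetical, and the Kafka producer is passed in as a plain callable so the snippet runs without a broker.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable, Optional


@dataclass
class PodCrashNotification:
    """Hypothetical payload with the fields requested in this thread."""
    pod_name: str
    pod_namespace: str
    source: str
    waiting_reason: Optional[str] = None
    termination_reason: Optional[str] = None
    container_logs: Optional[str] = None


def to_kafka_value(notification: PodCrashNotification) -> bytes:
    """Serialize the notification as UTF-8 encoded JSON bytes."""
    return json.dumps(asdict(notification), sort_keys=True).encode("utf-8")


def publish(
    send: Callable[[str, bytes], object],
    topic: str,
    notification: PodCrashNotification,
) -> bytes:
    """Serialize the notification and hand it to `send(topic, value)`.

    `send` is an assumption of this sketch: any callable that accepts a
    topic name and a bytes value, e.g. a thin wrapper around a producer.
    """
    value = to_kafka_value(notification)
    send(topic, value)
    return value
```

With the kafka-python client, `send` could be something like `lambda topic, value: producer.send(topic, value=value)`, where `producer` is a `KafkaProducer` created with the same brokers as `kafka_url` in the values file above. Keeping the producer behind a callable is only a convenience for testing the payload shape without a cluster.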