Azure / azure-event-hubs-for-kafka

Azure Event Hubs for Apache Kafka Ecosystems
https://docs.microsoft.com/azure/event-hubs/event-hubs-for-kafka-ecosystem-overview

filebeat with event-hub-kafka output, publish fails: client has run out of available brokers to talk to. #158

Open ZzhKlaus opened 3 years ago

ZzhKlaus commented 3 years ago

Description

I am using Filebeat to stream a log file to Azure Event Hubs via the Kafka output. The connection to the host can be established, but when Filebeat tries to publish, I get the error "Kafka publish failed with: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)". I am sure the connection string, Event Hub namespace, and topic names are correct, because when I test with a Python script the Event Hub receives the messages. With Filebeat I only see requests but no messages arrive. Is it an authentication issue? I'm not sure, and this problem has puzzled me for several days.
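For reference, the mapping from an Event Hubs connection string to the Kafka-style settings Filebeat needs can be sketched as below. The helper name and the sample connection string are illustrative, not from any Azure SDK or Filebeat:

```python
def kafka_settings_from_connection_string(conn_str):
    # Sketch: Event Hubs connection strings look like
    # Endpoint=sb://<namespace>.servicebus.windows.net/;SharedAccessKeyName=...;SharedAccessKey=...
    parts = dict(p.split("=", 1) for p in conn_str.split(";") if p)
    host = parts["Endpoint"].removeprefix("sb://").rstrip("/")
    return {
        "hosts": [f"{host}:9093"],        # Event Hubs' Kafka endpoint is port 9093
        "username": "$ConnectionString",  # literal string, not a placeholder
        "password": conn_str,             # the full connection string is the password
    }

settings = kafka_settings_from_connection_string(
    "Endpoint=sb://ehnamespace.servicebus.windows.net/;"
    "SharedAccessKeyName=RootManageSharedAccessKey;SharedAccessKey=abc123"
)
print(settings["hosts"])  # ['ehnamespace.servicebus.windows.net:9093']
```

These are the same three values (`hosts`, `username`, `password`) that appear in the `output.kafka` section of the filebeat.yml below.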

How to reproduce

The filebeat.yml file looks like:

filebeat.inputs:
- type: log
  enabled: true
  paths:
    - C:\DataPipeline\DataOut\*.ndjson
  json.key_under_root: true
  json.add_error_key: true
setup.template.settings:
  index.number_of_shards: 1
output.kafka:
  topic: "ehpwt"
  required_acks: 1
  client_id: filebeat
  version: '1.0.0'
  hosts:
    - "ehnamespace.servicebus.windows.net:9093"
  username: "$ConnectionString"
  password: "(The connection string - secure)"
  ssl.enabled: true
  compression: none
  max_message_bytes: 1000000
  partition.round_robin:
    reachable_only: true
logging.level: debug
logging.to_files: true
logging.files:
  path: C:\log\filebeat
  name: filebeat
  keepfiles: 7
  permissions: 0644

and my error log shows:

2021-06-16T21:21:26.870+0200 INFO [publisher_pipeline_output] pipeline/output.go:143 Connecting to kafka(ehnspwt.servicebus.windows.net:9093)
2021-06-16T21:21:26.882+0200 DEBUG [kafka] kafka/client.go:100 connect: [ehnspwt.servicebus.windows.net:9093]
2021-06-16T21:21:26.886+0200 INFO [publisher_pipeline_output] pipeline/output.go:151 Connection to kafka(ehnspwt.servicebus.windows.net:9093) established
2021-06-16T21:21:26.891+0200 DEBUG [harvester] log/log.go:107 End of file reached: C:\DataPipeline\DataOut\test01 - Copy (4).ndjson; Backoff now.
2021-06-16T21:21:27.692+0200 DEBUG [kafka] kafka/client.go:371 finished kafka batch
2021-06-16T21:21:27.693+0200 DEBUG [kafka] kafka/client.go:385 Kafka publish failed with: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)
2021-06-16T21:21:27.693+0200 INFO [publisher] pipeline/retry.go:219 retryer: send unwait signal to consumer
2021-06-16T21:21:27.701+0200 INFO [publisher] pipeline/retry.go:223 done
2021-06-16T21:21:28.496+0200 DEBUG [kafka] kafka/client.go:371 finished kafka batch
2021-06-16T21:21:28.496+0200 DEBUG [kafka] kafka/client.go:385 Kafka publish failed with: kafka: client has run out of available brokers to talk to (Is your cluster reachable?)

Has it worked previously?

Checklist

**IMPORTANT**: We will close issues where the checklist has not been completed or where adequate information has not been provided. Please provide the relevant information for the following items:

- [ ] SDK (include version info): `filebeat-7.12.0-windows-x86_64`
- [ ] Sample you're having trouble with: `attached config code`
- [x] If using Apache Kafka Java clients or a framework that uses Apache Kafka Java clients, version: ``
- [ ] Kafka client configuration: `` (*do not include your connection string or SAS Key*)
- [ ] Namespace and EventHub/topic name
- [ ] Consumer or producer failure ``
- [ ] Timestamps *in UTC* `<2021-06-16T21:21:27.693+0200>`
- [ ] group.id or client.id ``
- [ ] Logs provided (with debug-level logging enabled if possible, e.g. log4j.rootLogger=DEBUG) or exception call stack
- [ ] Standalone repro ``
- [ ] Operating system: ``
- [ ] Critical issue

If this is a question on basic functionality, please verify the following:

- [ ] Port 9093 should not be blocked by firewall ("broker cannot be found" errors)
- [x] Pinging FQDN should return cluster DNS resolution (e.g. `$ ping namespace.servicebus.windows.net` returns ~ `ns-eh2-prod-am3-516.cloudapp.net [13.69.64.0]`)
- [x] Namespace should be either Standard or Dedicated tier, not Basic (TopicAuthorization errors)

Updates: I found that the `proxy_url: 'http://127.0.0.1:3128'` setting I had added under output.kafka is not valid; it is not documented in the official docs (https://www.elastic.co/guide/en/beats/filebeat/current/kafka-output.html), but I do need to use a proxy on my network. I tried the same configuration on another laptop that connects directly to the internet without a proxy, and it works, so this is a proxy problem. Where can I set something equivalent to `proxy_url: 'http://127.0.0.1:3128'` for Filebeat's kafka output?
ernani commented 3 years ago

You might need to use a Squid proxy, or some other proxy within your network, to bridge such connections. Once that is in place, you can set the HTTP_PROXY or HTTPS_PROXY environment variables for the user Filebeat runs as, pointing them at your Squid/proxy address.
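One caveat worth checking: Filebeat's kafka output speaks the raw Kafka protocol over TCP, so whether the `HTTP_PROXY`/`HTTPS_PROXY` variables are honored depends on the client; a Squid-style proxy bridges such traffic via the HTTP CONNECT method, with the Kafka TLS session running inside the tunnel. A minimal sketch of that handshake, assuming the proxy permits CONNECT to port 9093 (the function names are hypothetical; only the host names come from this thread):

```python
import socket

def build_connect_request(host, port):
    # The HTTP CONNECT request a client sends to a Squid-style proxy
    # to ask for a raw TCP tunnel to host:port.
    return (
        f"CONNECT {host}:{port} HTTP/1.1\r\n"
        f"Host: {host}:{port}\r\n"
        "\r\n"
    ).encode("ascii")

def open_tunnel(proxy_host, proxy_port, dest_host, dest_port):
    # Untested against a live proxy; assumes it allows CONNECT to 9093.
    s = socket.create_connection((proxy_host, proxy_port))
    s.sendall(build_connect_request(dest_host, dest_port))
    status = s.recv(4096)
    if b" 200" not in status.split(b"\r\n", 1)[0]:
        raise OSError(f"proxy refused tunnel: {status!r}")
    return s  # hand this socket to a TLS/Kafka client

print(build_connect_request("ehnamespace.servicebus.windows.net", 9093).decode())
```

If the proxy refuses the CONNECT (many only allow ports 80/443 by default), Squid's `acl SSL_ports` configuration would need to include 9093 before any client-side setting can help.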