fgeller / kt

Kafka command line tool that likes JSON
MIT License

unable to consume topic, dns error #104

Closed ThisIsMissEm closed 4 years ago

ThisIsMissEm commented 4 years ago

Hi! When running kt topic the command connects successfully and prints out the topics on my kafka cluster, but when doing kt consume -topic my-topic I get the following error:

sarama client configuration &sarama.Config{Net:struct { MaxOpenRequests int; DialTimeout time.Duration; ReadTimeout time.Duration; WriteTimeout time.Duration; TLS struct { Enable bool; Config *tls.Config }; SASL struct { Enable bool; Handshake bool; User string; Password string }; KeepAlive time.Duration }{MaxOpenRequests:5, DialTimeout:30000000000, ReadTimeout:30000000000, WriteTimeout:30000000000, TLS:struct { Enable bool; Config *tls.Config }{Enable:false, Config:(*tls.Config)(nil)}, SASL:struct { Enable bool; Handshake bool; User string; Password string }{Enable:false, Handshake:true, User:"", Password:""}, KeepAlive:0}, Metadata:struct { Retry struct { Max int; Backoff time.Duration }; RefreshFrequency time.Duration }{Retry:struct { Max int; Backoff time.Duration }{Max:3, Backoff:250000000}, RefreshFrequency:600000000000}, Producer:struct { MaxMessageBytes int; RequiredAcks sarama.RequiredAcks; Timeout time.Duration; Compression sarama.CompressionCodec; Partitioner sarama.PartitionerConstructor; Return struct { Successes bool; Errors bool }; Flush struct { Bytes int; Messages int; Frequency time.Duration; MaxMessages int }; Retry struct { Max int; Backoff time.Duration } }{MaxMessageBytes:1000000, RequiredAcks:1, Timeout:10000000000, Compression:0, Partitioner:(sarama.PartitionerConstructor)(0x1256d60), Return:struct { Successes bool; Errors bool }{Successes:false, Errors:true}, Flush:struct { Bytes int; Messages int; Frequency time.Duration; MaxMessages int }{Bytes:0, Messages:0, Frequency:0, MaxMessages:0}, Retry:struct { Max int; Backoff time.Duration }{Max:3, Backoff:100000000}}, Consumer:struct { Retry struct { Backoff time.Duration }; Fetch struct { Min int32; Default int32; Max int32 }; MaxWaitTime time.Duration; MaxProcessingTime time.Duration; Return struct { Errors bool }; Offsets struct { CommitInterval time.Duration; Initial int64; Retention time.Duration } }{Retry:struct { Backoff time.Duration }{Backoff:2000000000}, Fetch:struct { Min int32; 
Default int32; Max int32 }{Min:1, Default:32768, Max:0}, MaxWaitTime:250000000, MaxProcessingTime:100000000, Return:struct { Errors bool }{Errors:false}, Offsets:struct { CommitInterval time.Duration; Initial int64; Retention time.Duration }{CommitInterval:1000000000, Initial:-1, Retention:0}}, ClientID:"kt-consume-emeliasmith", ChannelBufferSize:256, Version:sarama.KafkaVersion{version:[4]uint{0x0, 0xa, 0x0, 0x0}}, MetricRegistry:(*metrics.StandardRegistry)(0xc42004c890)}
2019/08/08 13:11:17 Initializing new client
2019/08/08 13:11:17 client/metadata fetching metadata for all topics from broker localhost:9092
2019/08/08 13:11:17 Connected to broker at localhost:9092 (unregistered)
2019/08/08 13:11:17 client/brokers registered new broker #1 at kafka.company.local:9092
2019/08/08 13:11:17 Successfully initialized new client
2019/08/08 13:11:17 Failed to connect to broker kafka.company.local:9092: dial tcp: lookup kafka.company.local on 192.168.2.1:53: no such host
2019/08/08 13:11:17 client/metadata fetching metadata for [my-topic] from broker localhost:9092
2019/08/08 13:11:17 Failed to connect to broker kafka.company.local:9092: dial tcp: lookup kafka.company.local on 192.168.2.1:53: no such host
Failed to read start offset for partition 0 err=dial tcp: lookup kafka.company.local on 192.168.2.1:53: no such host

I've no idea what's causing this; if I can list topics, surely I should be able to consume them? My cluster has no authentication requirements.
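The logs above show the standard Kafka bootstrap sequence: the client fetches metadata from localhost:9092 (which succeeds, so kt topic works), the metadata names the broker by its advertised address kafka.company.local:9092, and consuming then has to dial that address, which fails to resolve. A rough sketch of the resolution step, assuming the advertised address comes back as host:port (brokerHost is a hypothetical helper, not part of kt):

```go
package main

import (
	"fmt"
	"net"
)

// brokerHost extracts the hostname from a broker's advertised host:port
// address as returned in the cluster metadata response.
func brokerHost(addr string) (string, error) {
	host, _, err := net.SplitHostPort(addr)
	return host, err
}

func main() {
	// "localhost:9092" is only the bootstrap address; after the metadata
	// exchange, all further connections go to the advertised address below.
	advertised := "kafka.company.local:9092"
	host, err := brokerHost(advertised)
	if err != nil {
		fmt.Println("bad address:", err)
		return
	}
	// If this lookup fails on the client machine, listing topics via the
	// bootstrap broker still works, but consuming cannot.
	if _, err := net.LookupHost(host); err != nil {
		fmt.Printf("cannot resolve advertised broker %s: %v\n", host, err)
	}
}
```

This is why "I can list topics but not consume" is not contradictory: the two operations dial different addresses.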

fgeller commented 4 years ago

hi @ThisIsMissEm -- sorry for the delay, i've only recently resumed work on my github projects. reading through the logs it looks like you can connect to the broker on your localhost, but not to the "remote" one at 'kafka.company.local' - correct? if you're trying to connect to a bigger cluster it might be worth checking with your kafka admin. from what i can tell the issue is a networking one - e.g. the advertised hostname or ip may be incorrect. i hope you've found a solution by now! i'm closing this as i don't think it's a kt issue, but please feel free to re-open if i can be of help!
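For readers hitting the same wall: if the cluster is yours to configure, the usual broker-side fix is to advertise an address that clients can actually resolve. A sketch of the relevant server.properties settings (the hostname here is illustrative, not from this issue):

```properties
# What the broker binds to locally
listeners=PLAINTEXT://0.0.0.0:9092
# What the broker tells clients to connect to after the metadata
# exchange -- this must be resolvable from every client machine
advertised.listeners=PLAINTEXT://kafka.example.com:9092
```

If the advertised hostname is only resolvable inside the cluster's network (as kafka.company.local appears to be here), clients outside that network can bootstrap but not consume.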

peterjanes commented 1 year ago

Sorry to revive an old issue, but I'm having a similar problem and I think I've found the root cause. In -verbose mode I see an attempt to look up the hostname on my home router (192.168.1.1) but not on the one(s) provided by the VPN (172.16.130.109, 172.16.130.110); if I use the IP address of the broker(s) then everything works fine.

The net package docs say the pure-Go resolver "sends DNS requests directly to the servers listed in /etc/resolv.conf", so if that file doesn't list the VPN servers, they are never consulted. There is an alternative cgo-based resolver, but because kt is built with the netgo build tag, it can't be selected at runtime ("The decision can also be forced while building the Go source tree by setting the netgo or netcgo build tag"), and I've confirmed that myself:

$ GODEBUG=netdns=cgo+2 kt consume --brokers kafka.example.tech:9092 -topic foo -offsets newest -verbose
...
go package net: built with netgo build tag; using Go's DNS resolver
go package net: hostLookupOrder(kafka.example.tech) = files,dns

Unless netgo can be removed somehow (I don't know the intricacies of Go builds or why it was chosen), I think these are the only workarounds:
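Separately from the workarounds above: for code that hits this same limitation, Go's net.Resolver can be pointed at a specific DNS server explicitly, bypassing /etc/resolv.conf entirely. kt does not expose anything like this, so the following is only a sketch of what a patched build could do; the DNS server address is the VPN resolver mentioned in the comment above, and the hostname is the one from the verbose output:

```go
package main

import (
	"context"
	"fmt"
	"net"
	"time"
)

// vpnResolver returns a resolver that sends every DNS query to the given
// server instead of the servers listed in /etc/resolv.conf.
func vpnResolver(server string) *net.Resolver {
	return &net.Resolver{
		PreferGo: true, // use the pure-Go resolver so the Dial hook is honored
		Dial: func(ctx context.Context, network, _ string) (net.Conn, error) {
			d := net.Dialer{Timeout: 2 * time.Second}
			// Ignore the address the resolver chose; always dial our server.
			return d.DialContext(ctx, network, server)
		},
	}
}

func main() {
	r := vpnResolver("172.16.130.109:53") // VPN DNS server from the comment above
	ctx, cancel := context.WithTimeout(context.Background(), 3*time.Second)
	defer cancel()
	addrs, err := r.LookupHost(ctx, "kafka.example.tech")
	if err != nil {
		fmt.Println("lookup failed:", err)
		return
	}
	fmt.Println(addrs)
}
```

A dialer wired this way would let a Kafka client resolve broker hostnames through the VPN's DNS even when the system resolver configuration doesn't include those servers.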