fluent / helm-charts

Helm Charts for Fluentd and Fluent Bit
Apache License 2.0

err 12 timeout while contacting dns servers #264

Open AshutoshNirkhe opened 1 year ago

AshutoshNirkhe commented 1 year ago

This is basically the same as issue #4050, but it is still happening in v1.8.15 (at least when using Azure Log Analytics as the output). I am opening a separate issue as suggested in this comment from PettitWesley: https://github.com/fluent/fluent-bit/issues/4050#issuecomment-1113661151

[2022/10/06 09:20:31] [ warn] [net] getaddrinfo(host='2eee2d30-f495-4b6c-ab77-5e73f5400bad.ods.opinsights.azure.com', err=12): Timeout while contacting DNS servers
[2022/10/06 09:20:31] [ warn] [net] getaddrinfo(host='2eee2d30-f495-4b6c-ab77-5e73f5400bad.ods.opinsights.azure.com', err=12): Timeout while contacting DNS servers
[2022/10/06 09:20:31] [ warn] [net] getaddrinfo(host='2eee2d30-f495-4b6c-ab77-5e73f5400bad.ods.opinsights.azure.com', err=12): Timeout while contacting DNS servers
[2022/10/06 09:20:31] [ warn] [net] getaddrinfo(host='2eee2d30-f495-4b6c-ab77-5e73f5400bad.ods.opinsights.azure.com', err=12): Timeout while contacting DNS servers
[2022/10/06 09:20:31] [ warn] [net] getaddrinfo(host='2eee2d30-f495-4b6c-ab77-5e73f5400bad.ods.opinsights.azure.com', err=12): Timeout while contacting DNS servers
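
For anyone triaging similar output, here is a minimal sketch that tallies these warnings per host and error code, so it is easier to see whether one endpoint or one error dominates. The regular expression is an assumption based only on the sample lines above, not on any documented Fluent Bit log format, and `tally_dns_warnings` is a hypothetical helper name.

```python
import re
from collections import Counter

# Matches the "[net] getaddrinfo(...)" warnings quoted in this thread.
# Assumption: the pattern is inferred from the sample lines only.
GETADDRINFO_RE = re.compile(
    r"getaddrinfo\(host='(?P<host>[^']+)',\s*err=(?P<err>\d+)\):\s*(?P<msg>.+)$"
)


def tally_dns_warnings(lines):
    """Count getaddrinfo warnings per (host, error code, message)."""
    counts = Counter()
    for line in lines:
        match = GETADDRINFO_RE.search(line.strip())
        if match:
            key = (match["host"], int(match["err"]), match["msg"])
            counts[key] += 1
    return counts
```

Feeding it the lines of a saved pod log (for example `tally_dns_warnings(open("fluent-bit.log"))`, where the file name is a placeholder) returns a `Counter` whose `most_common()` lists the noisiest host and error combinations first.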

PettitWesley commented 1 year ago

We had seen sporadic reports of DNS issues in the 1.8 series; with the latest 1.9 series images we have not seen any more reports. Please use the AWS distro, version 2.29.0: https://github.com/aws/aws-for-fluent-bit/releases

divbell commented 1 year ago

I can confirm this is still occurring with 2.30.0.

[2023/01/26 14:46:50] [ warn] [net] getaddrinfo(host='logs.us-east-1.amazonaws.com', err=12): Timeout while contacting DNS servers

PettitWesley commented 1 year ago

Please try: https://github.com/aws/aws-for-fluent-bit/blob/mainline/troubleshooting/debugging.md#dns-resolution-issues

oblak-be commented 1 month ago

For the past few days we have been experiencing a similar issue, but only with Fluent Bit:

[ warn] [net] getaddrinfo(host='blabla.oblak.be', err=11): Could not contact DNS servers

When I run the same DNS query from the host or from another pod in the same cluster, even when explicitly querying CoreDNS, it works like a charm.
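
A check like the one described above can be sketched as follows. It uses Python's standard `socket.getaddrinfo`, which goes through the operating system's resolver and may not follow the same code path as Fluent Bit's own lookups, so a success here only shows that DNS is reachable from that host or pod. The `resolve` helper and the default port of 443 are assumptions for illustration.

```python
import socket
import time


def resolve(host, port=443):
    """Resolve host via the system resolver.

    Returns (sorted unique addresses, elapsed seconds).
    Raises RuntimeError if the lookup fails.
    """
    start = time.monotonic()
    try:
        infos = socket.getaddrinfo(host, port, proto=socket.IPPROTO_TCP)
    except socket.gaierror as exc:
        raise RuntimeError(f"lookup of {host!r} failed: {exc}") from exc
    elapsed = time.monotonic() - start
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] is the address string for both IPv4 and IPv6.
    addresses = sorted({info[4][0] for info in infos})
    return addresses, elapsed
```

Calling `resolve("blabla.oblak.be")` from inside the affected pod and from a neighbouring pod, then comparing the addresses and timings, would at least show whether the two see the same DNS behaviour.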

Note that the error code (11) differs from the original poster's (12), but I didn't want to open an extra, unnecessary issue.

I tested with Fluent Bit versions 1.8.4, 1.9 and 2.0, and they all have the same issue.

I still suspect we made a config mistake somewhere, but as of now I have no clue where to look further.