fluent / fluent-bit

Fast and Lightweight Logs and Metrics processor for Linux, BSD, OSX and Windows
https://fluentbit.io
Apache License 2.0

Fluent Bit starting with version 2.2.1 resolves to the wrong IP address #8951

Closed salacr closed 2 weeks ago

salacr commented 2 months ago

Bug Report

Describe the bug I have deployed fluent-bit in Docker Swarm (Docker version 20.10.16, build aa7e414) on Debian GNU/Linux 10 (buster). I tried versions 2.2.1, 2.2.2, and 3.0.6, and had to downgrade to 2.2.0. My output section looks like this:

    outputs:
        - name:  stdout
          match: '*'
        - name:  loki
          match: '*'
          host:  loki

But fluent-bit isn't able to send the data to loki because of this error:

[2024/06/12 22:23:44] [error] [upstream] connection #127 to tcp://PUBLIC-IP-OF-THE-SERVER:3100 timed out after 10 seconds (connection timeout)
[2024/06/12 22:23:44] [debug] [net] TCP connection timed out: loki:3100
[2024/06/12 22:23:44] [debug] [net] could not connect to loki:3100

As we can see, it tries to connect to the server's public IP instead of the internal IP assigned to the loki service (which is running in the same swarm stack). When I deploy the debug image and try curl, I get:

I have no name!@d4051900c31b:/$ curl -v loki
*   Trying 192.210.1.85:80...

As we can see, curl correctly resolves the loki hostname to the proper IP address. I also tried deploying into a simple docker-compose project, and there it works correctly. Maybe it's bound to the swarm deployment?
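To make the comparison with curl repeatable, a minimal sketch along these lines could print what the system resolver returns for the hostname. It assumes Python is available in a container attached to the same swarm network; the helper name `resolve_ipv4` is hypothetical. Note that it goes through the operating system's `getaddrinfo`, not through c-ares, so it only shows the address the OS-level lookup returns, which can then be compared with the address in the fluent-bit error log.

```python
import socket
import sys


def resolve_ipv4(host, port):
    """Return the unique IPv4 addresses the system resolver reports for host."""
    infos = socket.getaddrinfo(
        host, port, family=socket.AF_INET, type=socket.SOCK_STREAM
    )
    addresses = []
    for info in infos:
        # Each entry is (family, type, proto, canonname, sockaddr);
        # for AF_INET, sockaddr is (ip, port).
        ip = info[4][0]
        if ip not in addresses:
            addresses.append(ip)
    return addresses


if __name__ == "__main__":
    # Hostname and port default to the values from this report.
    host = sys.argv[1] if len(sys.argv) > 1 else "loki"
    port = int(sys.argv[2]) if len(sys.argv) > 2 else 3100
    print(host, "->", resolve_ipv4(host, port))
```

If the address printed here matches the one curl uses but differs from the one in the `[upstream]` error line, that points at fluent-bit's own resolver rather than at the swarm DNS.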

To Reproduce Deploy fluent-bit in a swarm deployment and try to connect to loki in the same stack.

Expected behavior fluent-bit should resolve the correct IP address and send data to loki.

Your Environment

salacr commented 2 months ago

I can see that there was a change in c-ares in 2.2.1, so it's most likely related to that.

edsiper commented 2 months ago

We are upgrading c-ares to v1.30.0: https://github.com/fluent/fluent-bit/pull/8953

edsiper commented 2 months ago

@salacr would you mind checking whether the latest changes in Git master fix the issue?

salacr commented 2 months ago

Sure, is there a Docker image or artifact that I can try (I looked on Docker Hub and didn't find any)? Or must I build it myself?

EDIT: I see that it should be pretty easy to build a docker image so I will give it a shot

salacr commented 2 months ago

OK, I have found that you are publishing the master version on ghcr. Right now it's failing because of:

[2024/06/14 07:24:24] [error] [config] YAML error found in file "/fluent-bit/etc/fluent-bit.yaml", line 22, column 27: unexpected event 6 in state 19.
[2024/06/14 07:24:24] [error] configuration file contains errors, aborting.

Version 3.0.7 is able to start with the same config. I will try to find out what the issue is...

salacr commented 2 months ago

It seems there is some other issue, as even this simple YAML file:

pipeline:
  inputs:
    - name: dummy
      dummy: '{"message": "custom dummy"}'
  outputs:
    - name: stdout
      match: '*'

produces an error:

docker run -it test
Fluent Bit v1.9.4 | NIGHTLY_BUILD=master - DO NOT USE IN PRODUCTION!
* Copyright (C) 2015-2022 The Fluent Bit Authors
* Fluent Bit is a CNCF sub-project under the umbrella of Fluentd
* https://fluentbit.io

[2024/06/14 05:38:00] [error] [config] YAML error found in file "/fluent-bit/etc/fluent-bit.yaml", line 3, column 12: unexpected event 6 in state 19.
[2024/06/14 05:38:00] [error] configuration file contains errors, aborting.

salacr commented 2 months ago

But when I use the legacy .conf format, it works! So I guess that we can close this issue as solved. Do you need a separate issue for the problems with the YAML config?
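For reference, a classic-format equivalent of the minimal YAML pipeline shown earlier might look roughly like the following. This is a sketch, not the exact file used in the test; the file name `fluent-bit.conf` is an assumption.

```
# fluent-bit.conf (assumed file name), classic-format equivalent
# of the minimal dummy -> stdout YAML pipeline
[INPUT]
    Name   dummy
    Dummy  {"message": "custom dummy"}

[OUTPUT]
    Name   stdout
    Match  *
```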

edsiper commented 2 weeks ago

Thanks, let's use a new ticket for the separate issue.