grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0
23.72k stars 3.42k forks source link

Unrecognized input header error with Docker containers #5950

Closed lukaszczerpak closed 1 year ago

lukaszczerpak commented 2 years ago

Describe the bug Getting Unrecognized input header: 50 error on simple docker container. It seems to be similar to https://github.com/grafana/loki/issues/5856

To Reproduce Steps to reproduce the behavior:

  1. Started Loki 2.5.0
  2. Started Promtail 2.5.0 to tail docker container logs
  3. Run a new container that echoes something to stdout:
    $ docker run --rm \
      --log-driver local --log-opt max-size=10m \
      alpine echo hello world
  4. Observe errors in journald/syslog.

Expected behavior Receive logs from Docker and push to Loki service.

Environment:

Promtail config:

...
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
        filters:
          - name: name
            values: [loggingtesting]
    relabel_configs:
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'

Error:

Apr 18 20:36:24 server promtail[462732]: level=warn ts=2022-04-18T18:36:24.283252527Z caller=target.go:120 target=docker/d127e5083600a4618b773825eac65f8b260f3605ba0ed33b43cabb66a60a39ed msg="could not transfer logs" written=0 container=d127e5083600a4618b773825eac65f8b260f3605ba0ed33b43cabb66a60a39ed err="Unrecognized input header: 50"

I am happy to do more testing or provide with any details needed.

sven-n commented 2 years ago

I've got the same issue running on Windows with Docker Desktop. After an update to the current version, the issue was gone :)

Edit: Celebrated too early, the issue is back :-(

dorinand commented 2 years ago

Hi, did you solve it somehow? I am running Loki 2.5.0 and Promtail 2.5.0. Currently I am running 3 nodes of docker swarm and register the same issue:

2022-06-27T23:46:23.545893275Z level=warn ts=2022-06-27T23:46:23.545745804Z caller=target.go:120 target=docker/9eea7bb6af3159872dd6c114c8d68b3d1227709c2132c48ed73f2f9aeefe7f3c msg="could not transfer logs" written=0 container=9eea7bb6af3159872dd6c114c8d68b3d1227709c2132c48ed73f2f9aeefe7f3c err="Unrecognized input header: 50"

From Loki,based on my query |= "Unrecognized input header: 50" I can see there is exactly 10 errors every minute: image And from ad hoc statistics I can see, there is exactly 5 containers causing this issue with exact same percentage number of the error count: image

timchenko-a commented 2 years ago

Faced the same issue, but using Logging Driver: json-file As a workaround switched promtail to read logs directly:

    static_configs:
      - targets:
          - localhost
        labels:
          __path__: /var/lib/docker/containers/*/*-json.log
dorinand commented 2 years ago

@timchenko-a do you mean configuration of docker driver? I already have this configuration:

{
  "storage-driver": "overlay2",
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "5m",
    "max-file": "5"
  }
}

From container LogPath, I can determine that all logs are valid JSON. Verified with:

cat /var/lib/docker/containers/882dd058826c73ecd391dfc2e989b593a9c2b5202d85a7009dd02ce42bfddfd7/882dd058826c73ecd391dfc2e989b593a9c2b5202d85a7009dd02ce42bfddfd7-json.log | jq .
shidarin commented 2 years ago

Did anyone end up with a fix or workaround to this? Hitting a fair number of my containers.

I'd take my hand at it, but I'm a complete Go noob and while I can see the problem and the description in #5856 makes sense, I'm unsure how to go about hacking this up so that an ASCII 0,1, 2 are also acceptable.

jeschkies commented 2 years ago

:wave: this seems to be a duplicate of #5856 indeed. ~What is the version of you Docker daemon?~ The issue is with the Docker client used. Let me see if there's a fix upstream.

@lukaszczerpak are you running the Docker container with TTY by any chance?

shidarin commented 2 years ago

For me the docker version is a mix of 20.10.17 & 20.10.7

are you running the Docker container with TTY by any chance?

Not Luka, but this is the ticket. For me there's a 1:1 correlation between promtail "50" failures and containers with TTY. Containers with console set to None work with promtail.

dorinand commented 2 years ago

Hi @jeschkies , as I wrote in previous comment, I am encountering the same issue. I check it and my problematic applications is running with tty: true . I tested it locally and it really cause this issue. Is there any workaround?

shidarin commented 2 years ago

Is there any workaround?

Most containers do not need tty or interactive mode- it's there for demo purposes during container setup. The only containers that need it are those that absolutely must think they're running live in a terminal. Simply remove tty: true from your docker compose and test the container. If it works, you're good to go.

jeschkies commented 2 years ago

@shidarin and @dorinand thanks for clarifying. We nailed it down now to Docker client ContainerLogs not mixing well with stdcopy.StdCopy from the Golang Docker package.

codengine commented 2 years ago

I've encountered the same issue as described here. The application causing the issue specifically is "mailcow", to be more precise the containers of "mailcow-unbound" and "mailcow-dovecot" which both have tty=true.

When I set tty=false the warnings don't occur anymore but the application also does not work correctly. A solution would be more then welcome.

glermaidt commented 2 years ago

Did you install the Promtail Docker plugin? That seemed to fix it for me. docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions

Nkmol commented 1 year ago

Has any priority been given to this issue? As a NodeJS company this currently completely blocks us from using Loki through SD.

Is docker_sd seen as experimental? Should we target the files instead? We initially picked SD due to the issues described in #2361 (and because its simplicity in set up).

@jeschkies maybe you can share some guidance on this part?

youngpabl0 commented 1 year ago

++ i trying add 3 lines in stdcopy.go but it going in looping container logs, i switched to logstash client to make container logs nicely visible in loki lines (vendor/github.com/docker/docker/pkg/stdcopy.go[120-122]:

        if buf[stdWriterFdIndex] >= 48 {
            buf[stdWriterFdIndex] -= 48 // hack for ASCII symbols in stdWriterFdIndex
        }

please tell how to fix it, i will really appriciate it, thank u so much! logs after stdcopy modifications:

Nov 30 17:59:56 debian promtail[6299]: level=debug ts=2022-11-30T14:59:56.770820777Z caller=target.go:221 target=docker/781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d msg="starting process loop" container=781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d
Nov 30 17:59:56 debian promtail[6299]: level=info ts=2022-11-30T14:59:56.770875813Z caller=target_group.go:95 msg="added Docker target" containerID=781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d
Nov 30 18:00:06 debian promtail[6299]: level=debug ts=2022-11-30T15:00:06.772387263Z caller=target_group.go:76 msg="container target already exists" container=781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d
Nov 30 18:00:06 debian promtail[6299]: level=debug ts=2022-11-30T15:00:06.77241605Z caller=target.go:226 target=docker/781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d msg="attempted to start process loop but it's already running" container=781595b08265aaff51e71cfb3587b645f740d0325c047c7caf3753eface8731d
dorinand commented 1 year ago

@youngpabl0 Hi, did you tried solution mentioned above - removing tty: true from containers?

youngpabl0 commented 1 year ago

@dorinand hi, i cant remove tty: true, the reason is some containers with cpp needs tty: true for properly working, thank u) loki-docker-driver install doesnt fix this issue too

youngpabl0 commented 1 year ago

7829 - guyz check it, work properly :)

thanks for help @scripter-v 😄

jeschkies commented 1 year ago

Well, @youngpabl0 and @scripter-v I didn't address this one because I wasn't sure how do fix the issue without much work but this is really elegant! Thanks!

I approved the PR it just needs some fix for the linter.

Is docker_sd seen as experimental? Should we target the files instead?

@Nkmol targeting the log files is always the safest approach as these have been persisted. However, sometimes you don't want to and cannot give promtail permissions to read the container log files. This is where docker_sd comes into play, IMHO. docker_sd should not be experimental. We do not use it internally, though.

erichiller commented 1 year ago

Is there any workaround?

Most containers do not need tty or interactive mode- it's there for demo purposes during container setup. The only containers that need it are those that absolutely must think they're running live in a terminal. Simply remove tty: true from your docker compose and test the container. If it works, you're good to go.

In my case I am using docker_sd_configs to monitor a loki container without using docker-compose, so I am unsure how to remove tty:true in this case.

I don't want to use

    static_configs:
      - targets:
          - localhost
        labels:
          __path__: /var/lib/docker/containers/*/*-json.log

because the loki container is the only container I couldn't figure out how to have the logs written to a file rather than to the console. So I only want to get the loki container logs and nothing else on the docker host, I can't really filter via path because the docker container id can change.

jeschkies commented 1 year ago

@erichiller I'm sorry. I thought we've landed the fix. The workaround should not be required anymore. Let me check the PR.

Nkmol commented 1 year ago

@jeschkies @MasslessParticle Would it be an idea to backport this to release-2.7.x?