grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0
23.4k stars 3.39k forks source link

docker-driver:latest can't send container's logs to Loki. #6112

Closed DmitryShonin closed 2 years ago

DmitryShonin commented 2 years ago

Describe the bug Docker compose setup with log driver loki

logging:
  driver: "loki"
      options:
        loki-url: "http://loki:3100/loki/api/v1/push"
        loki-retries: "8"
        loki-batch-size: "600"

not able to send logs to Loki however both located in the same docker network on the same host.

To Reproduce Steps to reproduce the behavior:

  1. Started Loki (grafana/loki:main amd64)
  2. Started Promtail (NO NEED) to tail '...'
  3. Query: (NO NEED)

Expected behavior Logs from container sending to loki instance

Environment:

Screenshots, Promtail config, or terminal output

journalctl -xu docker.service

May 06 09:25:47 <hostname> dockerd[3292252]: time="2022-05-06T09:25:47Z" level=info msg="level=warn ts=2022-05-06T09:25:47.10806212Z caller=client.go:349 container_id=469032707cb0a41032c5518fe28d99d1a95f8f26df2a8b3dcc70ddaa0c6e15ad component=client host=loki:3100 msg=\"error sending batch, will retry\" status=-1 error=\"Post \\\"http://loki:3100/loki/api/v1/push\\\": dial tcp: lookup loki on 127.0.0.53:53: server misbehaving\"" plugin=f40de8c090ec1e24797af18a8145f39d652f19fe423fa54a5e9f4da709aef76f

docker plugin inspect loki

[
    {
        "Config": {
            "Args": {
                "Description": "",
                "Name": "",
                "Settable": null,
                "Value": null
            },
            "Description": "Loki Logging Driver",
            "DockerVersion": "17.09.0-ce",
            "Documentation": "https://github.com/grafana/loki",
            "Entrypoint": [
                "/bin/docker-driver"
            ],
            "Env": [
                {
                    "Description": "Set log level to output for plugin logs",
                    "Name": "LOG_LEVEL",
                    "Settable": [
                        "value"
                    ],
                    "Value": "info"
                },
                {
                    "Description": "Activate pprof debugging endpoint for the given port.",
                    "Name": "PPROF_PORT",
                    "Settable": [
                        "value"
                    ],
                    "Value": ""
                }
            ],
            "Interface": {
                "Socket": "loki.sock",
                "Types": [
                    "docker.logdriver/1.0"
                ]
            },
            "IpcHost": false,
            "Linux": {
                "AllowAllDevices": false,
                "Capabilities": null,
                "Devices": null
            },
            "Mounts": null,
            "Network": {
                "Type": "host"
            },
            "PidHost": false,
            "PropagatedMount": "",
            "User": {},
            "WorkDir": "",
            "rootfs": {
                "diff_ids": [
                    "sha256:42bf630bc73cde126fcfed7c15a9da41f7879a3b28531828c8bb36b4f6925477"
                ],
                "type": "layers"
            }
        },
        "Enabled": true,
        "Id": "f40de8c090ec1e24797af18a8145f39d652f19fe423fa54a5e9f4da709aef76f",
        "Name": "loki:latest",
        "PluginReference": "docker.io/grafana/loki-docker-driver:latest",
        "Settings": {
            "Args": [],
            "Devices": [],
            "Env": [
                "LOG_LEVEL=info",
                "PPROF_PORT="
            ],
            "Mounts": []
        }
    }
]

docker-compose.yml


version: '3.8'

services:
  ui:
    image: provectuslabs/kafka-ui:master
    hostname: ui
    container_name: ui-test
    depends_on:
      - broker
    networks:
      - default_test_net
    ports:
      - "8888:8080"
    restart: always
    environment:
      KAFKA_CLUSTERS_0_NAME: local
      KAFKA_CLUSTERS_0_BOOTSTRAPSERVERS: broker:29092
      KAFKA_CLUSTERS_0_ZOOKEEPER: zookeeper:2181

  zookeeper0:
    image: confluentinc/cp-zookeeper:6.0.0
    hostname: zookeeper
    container_name: zookeeper0
    networks:
      - default_test_net
    ports:
      - "2181:2181"
    environment:
      ZOOKEEPER_CLIENT_PORT: 2181
      ZOOKEEPER_TICK_TIME: 2000

  broker:
    image: confluentinc/cp-kafka:6.0.0
    hostname: broker
    container_name: broker0
    restart: always
    depends_on:
      - schema-registry
      - zookeeper0
    networks:
      - default_test_net
    ports:
      - "29092:29092"
      - "9092:9092"
    environment:
      KAFKA_BROKER_ID: 1
      KAFKA_ZOOKEEPER_CONNECT: 'zookeeper:2181'
      KAFKA_LISTENER_SECURITY_PROTOCOL_MAP: PLAINTEXT:PLAINTEXT,PLAINTEXT_HOST:PLAINTEXT
      KAFKA_ADVERTISED_LISTENERS: PLAINTEXT://broker:29092,PLAINTEXT_HOST://localhost:9092
      KAFKA_OFFSETS_TOPIC_REPLICATION_FACTOR: 1
      KAFKA_GROUP_INITIAL_REBALANCE_DELAY_MS: 1800
      KAFKA_TOOLS_LOG4J_LOGLEVEL: ERROR

  schema-registry:
    image: confluentinc/cp-schema-registry:6.0.0
    hostname: schema-registry
    container_name: schema-registry0
    depends_on:
      - zookeeper0
    networks:
      - default_test_net
    ports:
      - "8081:8081"
    environment:
      SCHEMA_REGISTRY_HOST_NAME: schema-registry
      SCHEMA_REGISTRY_KAFKASTORE_BOOTSTRAP_SERVERS: 'broker:29092'
      SCHEMA_REGISTRY_LOG4J_ROOT_LOGLEVEL: WARN
    logging:
      driver: "loki"
      options:
        loki-url: "http://loki:3100/loki/api/v1/push"
        loki-retries: "8"
        loki-batch-size: "600"

networks:
  default_test_net:
    external: true

docker network inspect default_test_net


[
    {
        "Name": "default_test_net",
        "Id": "2682edaa039b4a344b893df093850f44802a15b84c16062474f897351c292a86",
        "Created": "2022-04-21T15:36:15.297401977Z",
        "Scope": "local",
        "Driver": "bridge",
        "EnableIPv6": false,
        "IPAM": {
            "Driver": "default",
            "Options": null,
            "Config": [
                {
                    "Subnet": "172.19.0.0/16",
                    "Gateway": "172.19.0.1"
                }
            ]
        },
        "Internal": false,
        "Attachable": false,
        "Ingress": false,
        "ConfigFrom": {
            "Network": ""
        },
        "ConfigOnly": false,
        "Containers": {
            "44a3403182ca7070240ad415042213fbd32a6024f2d756b822dd178475f93067": {
                "Name": "cadvisor",
                "EndpointID": "bf195cfb97248172ed1f7c99a673eb98aadb62acd65a44b7e19cf3e092f860b1",
                "MacAddress": "02:42:ac:13:00:07",
                "IPv4Address": "172.19.0.7/16",
                "IPv6Address": ""
            },
            "450af3ff31c1410a2e0dad3c9d37b598da729b9469faeae028fb6308eb92c3ad": {
                "Name": "grafana",
                "EndpointID": "cbbe8c1615eb5108ac8b8816bf3c8a67d8124666b1979bedbf007d39796dd36b",
                "MacAddress": "02:42:ac:13:00:09",
                "IPv4Address": "172.19.0.9/16",
                "IPv6Address": ""
            },
            "469032707cb0a41032c5518fe28d99d1a95f8f26df2a8b3dcc70ddaa0c6e15ad": {
                "Name": "schema-registry0",
                "EndpointID": "a7172119b0240faef239d094df156b6e71f9893ed4adde3d7b63a876ab75bf93",
                "MacAddress": "02:42:ac:13:00:03",
                "IPv4Address": "172.19.0.3/16",
                "IPv6Address": ""
            },
            "4b9e0eb84b38a072e9bf88cd2d3cdcd2d40b7daa5b752c1eed187986c47543d0": {
                "Name": "alertmanager",
                "EndpointID": "cd22cae03e55228c1bfd19519b867bf99187bb22e62153e00fcfd22c1bb2c96a",
                "MacAddress": "02:42:ac:13:00:0a",
                "IPv4Address": "172.19.0.10/16",
                "IPv6Address": ""
            },
            "6b437c69afaa699cf86b55e1e122720080ad6f0c878922af00bf9b1db9a9e5cf": {
                "Name": "prometheus",
                "EndpointID": "c37c4b5445b3f2e383f2b916f9cd227f0113b5f67f6bd385b94efcd233540d8a",
                "MacAddress": "02:42:ac:13:00:08",
                "IPv4Address": "172.19.0.8/16",
                "IPv6Address": ""
            },
            "70eef9f20442f61746a54921d9588385f0986662dad9d54ddb7d445f264b339a": {
                "Name": "broker0",
                "EndpointID": "a0c1e891b348cba7255f68e4e944a3d7033afa5539051891cf6e54e873316e9d",
                "MacAddress": "02:42:ac:13:00:06",
                "IPv4Address": "172.19.0.6/16",
                "IPv6Address": ""
            },
            "82f5f5ac50be8594eecacb4946dcd68e36809cb06100abfae2dcc2c67905d2fd": {
                "Name": "caddy",
                "EndpointID": "b4084684a2ed34921d6fa8fb3d6cf54830563d9b83f15bf3e050c055a9e9810c",
                "MacAddress": "02:42:ac:13:00:0b",
                "IPv4Address": "172.19.0.11/16",
                "IPv6Address": ""
            },
            "891b1972ddee50baff39377543e699f78e4cfb36d71d269568c0c194ee719ae3": {
                "Name": "zookeeper0",
                "EndpointID": "953afa9bca3ba8dc9cc9d0c9e1b99dd4e0308170ad78eda9c5eeb362bf6acb42",
                "MacAddress": "02:42:ac:13:00:02",
                "IPv4Address": "172.19.0.2/16",
                "IPv6Address": ""
            },
            "9f114f6d1de14f4a9c066b37b64402087315581e03516e497b7ff638523891b9": {
                "Name": "ui",
                "EndpointID": "9a6cbec441b5cefdb3ec41106bddd42d56f8e7da9b17112e849596e5c342d456",
                "MacAddress": "02:42:ac:13:00:05",
                "IPv4Address": "172.19.0.5/16",
                "IPv6Address": ""
            },
            "ca47b2cb7a31867df6a91ff1598e8e94149575af80ef89d3a1fea6a3500db6b4": {
                "Name": "loki",
                "EndpointID": "6871e3883b789a724a3d43663ebc823abf3ea4f796620e1c13f528ac1a6333d5",
                "MacAddress": "02:42:ac:13:00:04",
                "IPv4Address": "172.19.0.4/16",
                "IPv6Address": ""
            }
        },
        "Options": {},
        "Labels": {}
    }
]

docker exec -it schema-registry0 /bin/bash -c "ping loki"

PING loki (172.19.0.4) 56(84) bytes of data. 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=1 ttl=64 time=0.140 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=2 ttl=64 time=0.096 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=3 ttl=64 time=0.086 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=4 ttl=64 time=0.080 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=5 ttl=64 time=0.083 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=6 ttl=64 time=0.083 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=7 ttl=64 time=0.091 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=8 ttl=64 time=0.098 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=9 ttl=64 time=0.080 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=10 ttl=64 time=0.079 ms 64 bytes from loki.default_test_net (172.19.0.4): icmp_seq=11 ttl=64 time=0.080 ms


> docker exec -it schema-registry0 /bin/bash -c "curl -vIL http://loki:3100/loki/api/v1/push"

Accept: /

< HTTP/1.1 405 Method Not Allowed HTTP/1.1 405 Method Not Allowed < Date: Fri, 06 May 2022 10:04:34 GMT Date: Fri, 06 May 2022 10:04:34 GMT <

  • Connection #0 to host loki left intact

Not sure if this is kind of loki's-docker-driver bug itself but not docker daemon, but anyways any help / suggestions would be great. Thank you in advance. Screenshot 2022-05-06 at 13 10 40

dannykopping commented 2 years ago

@DmitryShonin thank you for the detailed report.

This does not look like a bug in the docker-driver, but rather an issue on your actual Loki server.

May 06 09:25:47 <hostname> dockerd[3292252]: time="2022-05-06T09:25:47Z" level=info msg="level=warn ts=2022-05-06T09:25:47.10806212Z caller=client.go:349 container_id=469032707cb0a41032c5518fe28d99d1a95f8f26df2a8b3dcc70ddaa0c6e15ad component=client host=loki:3100 msg=\"error sending batch, will retry\" status=-1 error=\"Post \\\"http://loki:3100/loki/api/v1/push\\\": dial tcp: lookup loki on 127.0.0.53:53: server misbehaving\"" plugin=f40de8c090ec1e24797af18a8145f39d652f19fe423fa54a5e9f4da709aef76f

This server misbehaving error message is likely the source of the trouble. This is happening likely in Docker or lower in your OS's resolver. You should investigate why that is the case.

https://stackoverflow.com/questions/28332845/docker-network-issue-server-misbehaving

This might give you a headstart.

I'm closing this since I see no evidence of an actual bug in Loki, but please reopen if you find evidence to the contrary.

DmitryShonin commented 2 years ago

Hi, @dannykopping thank you for clarification. Although worth to notice that same setup but source container is on the separate machine reaching Loki instance via 192.168.xxx.xxx with no issues.