DataBiosphere / azul

Metadata indexer and query service used for AnVIL, HCA, LungMAP, and CGP
Apache License 2.0
7 stars 2 forks source link

Log messages from `docker exec` are not prefixed with container name #6015

Closed dsotirho-ucsc closed 6 months ago

dsotirho-ucsc commented 8 months ago
fields @timestamp, @message
| filter @logStream = '/var/log/messages'
| filter @message like 'docker: ' and @message not like 'marking blob'
| sort @timestamp asc
| limit 2

[
    {
        "@timestamp": "2024-03-02 14:00:03.347",
        "@message": "Mar  2 14:00:03 ip-172-71-0-215 docker: ok: down: registry: 1s, normally up"
    },
    {
        "@timestamp": "2024-03-02 14:00:03.347",
        "@message": "Mar  2 14:00:03 ip-172-71-0-215 docker: Running garbage-collect using configuration [\"/opt/gitlab/embedded/bin/registry\", \"garbage-collect\", \"/var/opt/gitlab/registry/config.yml\", \"-m\"], this might take a while..."
    }
]
fields @timestamp, @message
| filter @logStream = '/var/log/messages'
| filter @message like 'docker: Total reclaimed'
| sort @timestamp desc
| limit 1

[
    {
        "@timestamp": "2024-03-02 12:00:13.113",
        "@message": "Mar  2 12:00:08 ip-172-71-0-215 docker: Total reclaimed space: 2.897GB"
    }
]
dsotirho-ucsc commented 8 months ago

Spike to research existing documentation / Internet.

dsotirho-ucsc commented 8 months ago

From a feature request to add logging options for docker exec: https://www.github.com/moby/moby/issues/8662

say, I have a container with syslog logging driver. So everything written to stdout from that container is logged with syslog. If I now use docker exec e.g. from cron to run a periodic update task, it's output is not logged with syslog.

In the mean time you can try this little hack....

echo hello > /proc/1/fd/1

Redirect your output into PID 1's (the docker container) file descriptor for STDOUT

From a Medium.com post: Docker Exec Output Woes

When you type: docker exec -it <container> <command>

You are attaching your terminal session to that processes’ running context. The process itself is already running — and its output is bound to your logging stack via its own STDOUT channel… now if you run a NEW process (like any command) its output is bound to your terminal — because you need to see it!

This is the reason why container logs go to one place, and exec command output go to another.

This is mostly by-design, and a Linux purist might tell you you’re doing things wrong if you need to capture these outputs. That may be so… but we’d still like a workaround!

What to do Here’s a naive solution. From inside the container, run:

echo hello > /proc/1/fd/1

“hello” will now show up in the logging system configured for the container.

dsotirho-ucsc commented 8 months ago

Assignee to test suggested workaround.

dsotirho-ucsc commented 7 months ago

Tried the workaround by manually modifying the garbage collection service file /etc/systemd/system/registry-garbage-collect.service.

< ExecStart=/usr/bin/docker exec gitlab /opt/gitlab/bin/gitlab-ctl registry-garbage-collect -m
> ExecStart=/usr/bin/docker exec gitlab sh -c '/opt/gitlab/bin/gitlab-ctl registry-garbage-collect -m > /proc/1/fd/1'

The first log shown below is from a run of the service without the workaround, and the second is with the workaround in place.

https://us-east-1.console.aws.amazon.com/cloudwatch/home?region=us-east-1#logsV2:logs-insights$3FqueryDetail$3D~(end~0~start~-86400~timeType~'RELATIVE~unit~'seconds~editorString~'fields*20*40timestamp*2c*20*40message*0a*7c*20filter*20*40logStream*20*3d*20*27*2fvar*2flog*2fmessages*27*0a*7c*20filter*20*40message*20like*20*2fRunning*20garbage-collect*2f*0a*7c*20sort*20*40timestamp*20asc*0a*7c*20limit*201000~queryId~'58767287b31b48d-759b28a-4c628bf-bc17a9c-e439378acf173c550e752b~source~(~'*2faws*2fcwagent*2fazul-gitlab~'azul-trail-dev))

[
    {
        "@timestamp": "2024-03-20 23:08:23.730",
        "@message": "Mar 20 23:08:23 ip-172-21-0-99 docker: Running garbage-collect using configuration [\"/opt/gitlab/embedded/bin/registry\", \"garbage-collect\", \"/var/opt/gitlab/registry/config.yml\", \"-m\"], this might take a while..."
    },
    {
        "@timestamp": "2024-03-20 23:09:59.249",
        "@message": "Mar 20 23:09:59 ip-172-21-0-99 docker: gitlab: Running garbage-collect using configuration [\"/opt/gitlab/embedded/bin/registry\", \"garbage-collect\", \"/var/opt/gitlab/registry/config.yml\", \"-m\"], this might take a while..."
    }
]
hannes-ucsc commented 7 months ago

For demo, attempt to reproduce. Also non-exec logs are still prefixed as before.