moby / moby

The Moby Project - a collaborative project for the container ecosystem to assemble container-based systems
https://mobyproject.org/
Apache License 2.0
68.66k stars 18.65k forks source link

output buffering or timestamp granularity issue (log lines in the wrong order) #31706

Open tianon opened 7 years ago

tianon commented 7 years ago

Description

When outputting to stdout and stderr in quick succession without a TTY, log/output lines are consistently returned in the wrong order.

Steps to reproduce the issue:

I started by reproducing with perl, but realized I could do so with just awk, which simplifies the testing and makes the explicit flushing more obvious (and makes it easier to embed in a test later, if we can figure out what's causing it and fix it):

$ docker run --rm debian:jessie awk '
    BEGIN {
        print "1 (out)" > "/dev/stdout";
        fflush("/dev/stdout");

        print "2 (err)" > "/dev/stderr";
        fflush("/dev/stderr");

        print "3 (out)" > "/dev/stdout";
        fflush("/dev/stdout");

        print "4 (err)" > "/dev/stderr";
        fflush("/dev/stderr");

        exit;
    }'

Describe the results you received:

1 (out)
3 (out)
2 (err)
4 (err)

Describe the results you expected:

1 (out)
2 (err)
3 (out)
4 (err)

(which is exactly what I get if I add --tty, but then Docker's only getting a single stream of all output rather than separate stdout and stderr, and I can't specify --tty on RUN in a Dockerfile, which is the place I see this issue in real life and it causes actual pain by lumping errors for a long RUN line together at the beginning or end without all the normal output which is the context for what the error means)

Additional information you deem important (e.g. issue happens only occasionally):

This is definitely similar to https://github.com/docker/docker/issues/26986 (possibly the same), but I have good reason to believe it's not due to buffering outside Docker (which is why that issue was closed), and appears instead to be due to Docker itself doing something odd (either doing its own buffering, or not storing timestamps with enough granularity in order to accurately recreate the proper line ordering). This is the reason why my example awk uses fflush explicitly (to make sure there's no output buffering on the awk/container side).

Output of docker version:

$ docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.4
 Git commit:   60ccb22
 Built:        Mon Mar  6 16:06:31 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.4
 Git commit:   60ccb22
 Built:        Mon Mar  6 16:06:31 2017
 OS/Arch:      linux/amd64
 Experimental: false
$ docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.5
 Git commit:   60ccb2265
 Built:        Thu Feb 23 10:58:26 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.5
 Git commit:   60ccb2265
 Built:        Thu Feb 23 10:58:26 2017
 OS/Arch:      linux/amd64
 Experimental: false

Output of docker info:

$ docker info
Containers: 20
 Running: 18
 Paused: 0
 Stopped: 2
Images: 4048
Server Version: 17.03.0-ce
Storage Driver: aufs
 Root Dir: /var/lib/docker/aufs
 Backing Filesystem: extfs
 Dirs: 3426
 Dirperm1 Supported: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: active
 NodeID: 5nqmhj66diohvhvp91dsuvex9
 Is Manager: true
 ClusterID: b2ruotz7bsmx8congfwdtiff2
 Managers: 1
 Nodes: 1
 Orchestration:
  Task History Retention Limit: 5
 Raft:
  Snapshot Interval: 10000
  Number of Old Snapshots to Retain: 0
  Heartbeat Tick: 1
  Election Tick: 3
 Dispatcher:
  Heartbeat Period: 5 seconds
 CA Configuration:
  Expiry Duration: 3 months
 Node Address: 10.255.136.50
 Manager Addresses:
  10.255.136.50:2377
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: v0.2.5 (expected: 977c511eda0925a723debdc94d09459af49d082a)
runc version: c91b5be (expected: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70)
init version: N/A (expected: 949e6facb77383876aeff8a6944dde66b3089574)
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.4.52-gentoo
Operating System: Gentoo/Linux
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.4 GiB
Name: nameless
ID: XTOG:WDQV:MC46:UH5F:CQXH:VTNK:CDR6:7BEE:JDRJ:4U2D:LUQM:EYN2
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Username: tianon
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false
$ docker info
Containers: 9
 Running: 9
 Paused: 0
 Stopped: 0
Images: 137
Server Version: 17.03.0-ce
Storage Driver: overlay2
 Backing Filesystem: extfs
 Supports d_type: true
 Native Overlay Diff: true
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins: 
 Volume: local
 Network: bridge host macvlan null overlay
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Init Binary: docker-init
containerd version: 977c511eda0925a723debdc94d09459af49d082a
runc version: a01dafd48bc1c7cc12bdb01206f9fea7dd6feb70
init version: 949e6fa
Security Options:
 seccomp
  Profile: default
Kernel Version: 4.9.0-1-amd64
Operating System: Debian GNU/Linux 9 (stretch)
OSType: linux
Architecture: x86_64
CPUs: 8
Total Memory: 31.45 GiB
Name: viper
ID: 3GRQ:ZZAL:3Y3M:LKO5:AYFL:EO5G:UE2G:7MWM:G7C7:ZM46:WI5L:A6GZ
Docker Root Dir: /mnt/docker/debian-var-lib-docker
Debug Mode (client): false
Debug Mode (server): false
Username: tianon
Registry: https://index.docker.io/v1/
Experimental: false
Insecure Registries:
 127.0.0.0/8
Live Restore Enabled: false

Additional environment details (AWS, VirtualBox, physical, etc.):

Sometimes the output is even more exciting:

2 (err)1 (out)

3 (out)
4 (err)
tianon commented 7 years ago

Ok, here's an additional data point that's interesting; I can reproduce down to Docker 1.0 :sob:

$ ./output-buffering.sh
Docker 17.03: unexpected result: 2
4
1
3
Docker 1.13: unexpected result: 1
3
2
4
Docker 1.12: unexpected result: 2
1
3
4
Docker 1.11: unexpected result: 1
3
2
4
Docker 1.10: unexpected result: 2
4
1
3
Docker 1.9: unexpected result: 1
3
2
4
Docker 1.8: unexpected result: 1
3
2
4
Docker 1.7: unexpected result: 1
3
2
4
Docker 1.6: unexpected result: 1
2
4
3
Docker 1.5: unexpected result: 1
3
2
4
Docker 1.4: unexpected result: 1
3
2
4
Docker 1.3: unexpected result: 1
23

4
Docker 1.2: unexpected result: 1
2
4
3
Docker 1.1: unexpected result: 2
4
1
3
Docker 1.0: unexpected result: 1
3
2
4
tianon commented 7 years ago

(That was using essentially host-docker save busybox:latest | dind-docker load, followed by capturing the output of dind-docker run --rm busybox awk 'BEGIN { print "1" > "/dev/stdout"; fflush("/dev/stdout"); print "2" > "/dev/stderr"; fflush("/dev/stderr"); print "3" > "/dev/stdout"; fflush("/dev/stdout"); print "4" > "/dev/stderr"; fflush("/dev/stderr"); exit }' 2>&1)

thaJeztah commented 7 years ago

ping @dperny this is what I was talking about last week :smile: (thanks @tianon, good timing, haha)

dperny commented 7 years ago

Was on PTO last week, added this to my queue to review.

pjknkda commented 7 years ago

I've got this error frequently when I use python interpreter on docker, which >>> is printed as STDERR.

[pjknkda@anne]~% docker run -i --rm python:3.6 python -i
Python 3.6.0 (default, Feb 28 2017, 22:20:46) 
[GCC 4.9.2] on linux
Type "help", "copyright", "credits" or "license" for more information.
>>> a = 'asdf'
>>> a
'asdf'
>>> a
>>> 'asdf'

If I de-multiplex STDOUT and STDERR with pseudo-tty option (-t) or run python -i 2>&1, the issue does not occur.

pjknkda commented 7 years ago

Is there any update on this issue?

thaJeztah commented 7 years ago

no, I think @dperny is currently assigned on other tasks, but contributions are welcome

tianon commented 6 years ago

Just to keep the thread alive, I can still reproduce on a fresh 18.03 instance. :heart:

thaJeztah commented 6 years ago

Let’s try resurrecting @dperny :innocent:

tianon commented 4 years ago

For whatever it's worth, I can now also reproduce on containerd 1.4.0 directly: :disappointed:

root@846a3a0c541f:/# ctr version
Client:
  Version:  v1.4.0
  Revision: 09814d48d50816305a8e6c1a4ae3e2bcc4ba725a
  Go version: go1.13.15

Server:
  Version:  v1.4.0
  Revision: 09814d48d50816305a8e6c1a4ae3e2bcc4ba725a
  UUID: 969f7235-0233-45cf-b268-b061f7438fff

root@846a3a0c541f:/# ctr run --rm docker.io/library/busybox:latest test awk 'BEGIN { print "1 (out)" > "/dev/stdout"; fflush("/dev/stdout"); print "2 (err)" > "/dev/stderr"; fflush("/dev/stderr"); print "3 (out)" > "/dev/stdout"; fflush("/dev/stdout"); print "4 (err)" > "/dev/stderr"; fflush("/dev/stderr"); exit }' 2>&1
1 (out)
3 (out)
2 (err)
4 (err)

(the output of that should be 1-4 in ascending order :disappointed:)

cpuguy83 commented 4 years ago

It seems like, at least in the case of docker, this is not completely unexpected since this is coming down a single stream, copied from two separate goroutines for each stdio stream.