gliderlabs / logspout

Log routing for Docker container logs

CPU usage on dockerd 100% #246

Status: Open. runningman84 opened this issue 7 years ago

runningman84 commented 7 years ago

If I start logspout-logstash using this docker-compose service, the CPU usage of dockerd spikes to 100% and stays there...

  logspout:
    image: bekt/logspout-logstash
    mem_limit: 256m
    restart: on-failure
    depends_on:
      - logstash
    environment:
      ROUTE_URIS: logstash://logstash:5000
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock

before starting logspout

 7015 root      20   0 1423204  80956  29440 S 106.3  1.3   1:25.72 dockerd       

after starting logspout

 7015 root      20   0 1423204  80972  29440 S   0.3  1.3   1:58.57 dockerd  

garceri commented 7 years ago

Experiencing the same issue here:

$logspout --version
v3.2-dev-custom

$docker version
Client:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 23:26:11 2016
 OS/Arch:      linux/amd64

Server:
 Version:      1.12.3
 API version:  1.24
 Go version:   go1.6.3
 Git commit:   6b644ec
 Built:        Wed Oct 26 23:26:11 2016
 OS/Arch:      linux/amd64

[root@ip-XXXXXXX rancher]# docker info
Containers: 56
 Running: 44
 Paused: 0
 Stopped: 12
Images: 89
Server Version: 1.12.3
Storage Driver: overlay
 Backing Filesystem: extfs
Logging Driver: json-file
Cgroup Driver: cgroupfs
Plugins:
 Volume: local rancher-nfs
 Network: null bridge overlay host
Swarm: inactive
Runtimes: runc
Default Runtime: runc
Security Options: seccomp
Kernel Version: 4.4.24-rancher
Operating System: RancherOS v0.7.1
OSType: linux
Architecture: x86_64
CPUs: 2
Total Memory: 14.94 GiB
Name: ip-XXXXXXXXXXX.ec2.internal
ID: DVSV:NLW5:VD5U:UOYY:FIOB:4RTH:7NGQ:GIBO:KKTR:VPYO:J6A2:4SZE
Docker Root Dir: /var/lib/docker
Debug Mode (client): false
Debug Mode (server): false
Registry: https://index.docker.io/v1/
Insecure Registries:
 127.0.0.0/8

Logspout container logs with DEBUG enabled show a constant stream of:

2017/01/26 17:37:44 pump.pumpLogs(): 495069bafc93 stopped
2017/01/26 17:37:45 pump.pumpLogs(): 495069bafc93 started
2017/01/26 17:37:45 pump.pumpLogs(): 25635c95f804 stopped
2017/01/26 17:37:45 pump.pumpLogs(): 25635c95f804 started
2017/01/26 17:37:46 pump.pumpLogs(): 495069bafc93 stopped
2017/01/26 17:37:46 pump.pumpLogs(): 495069bafc93 started
2017/01/26 17:37:46 pump.pumpLogs(): a27d4bfb3a41 stopped
2017/01/26 17:37:46 pump.pumpLogs(): a27d4bfb3a41 started
2017/01/26 17:37:46 pump.pumpLogs(): 32e734dcf238 stopped
2017/01/26 17:37:46 pump.pumpLogs(): 32e734dcf238 started
2017/01/26 17:37:47 pump.pumpLogs(): 495069bafc93 stopped
2017/01/26 17:37:47 pump.pumpLogs(): 495069bafc93 started
2017/01/26 17:37:48 pump.pumpLogs(): 495069bafc93 stopped
2017/01/26 17:37:48 pump.pumpLogs(): a27d4bfb3a41 stopped
2017/01/26 17:37:48 pump.pumpLogs(): 495069bafc93 started
2017/01/26 17:37:48 pump.pumpLogs(): a27d4bfb3a41 started
mattatcha commented 7 years ago

@pythianarceri Are you able to provide a docker-compose.yml I can use to reproduce the issue?

garceri commented 7 years ago

We use a custom-built image compiled with the logstash output module:

logspout:
    image: pythiant9shared/logspout-logstash
    environment:
      DEBUG: '1'
      ROUTE_URIS: logstash://logstash:5000
    stdin_open: true
    external_links:
    - logstash/logstash-collector:logstash
    volumes:
    - /var/run/docker.sock:/var/run/docker.sock
    tty: true
    labels:
      io.rancher.scheduler.global: 'true'
      io.rancher.container.hostname_override: container_name
garceri commented 7 years ago

Any luck getting this fixed, or at least identifying the issue? Is it normal for the pumps to be started/stopped all the time? We are running under RancherOS 1.3.3, just in case that matters.

mattdodge commented 7 years ago

Same thing here. I end up seeing a lot of these messages in /var/log/docker.log

time="2017-03-13T21:36:26.491710944Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value"
time="2017-03-13T21:36:26.514911587Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value"
time="2017-03-13T21:36:26.547530212Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value"

It's happened a couple times. Sometimes that "invalid character" is different but the gist of the log message is the same. We see dockerd up around 100% CPU and a constant stream of those errors.

byronwilliams commented 7 years ago

Hi all,

We've managed to reproduce the same issue that @pythianarceri reported on Jan 26. To add some detail: running docker logs -f {ContainerID} does not actually follow the logs; it exits straight away. That would be consistent with pumpLogs starting and stopping because the HTTP connection is being closed for some reason.

This is reproducible on the latest master commit (7c22df4e9d780959b266d7a088ff8012171edbfb) as well as 513b74c70a31c20294507f3fb21c34e52b2df48e and 8b98313954b3283d7203d1c3984b5b69ae307e0b.

Our container runs a few processes inside, all of which have PYTHONUNBUFFERED=1.

Whilst I consider this to be a bug within our container (it is not outputting to stdout and stderr correctly), it also makes our Docker nodes perform really badly when the containers are scheduled.

Could we add some kind of counter when the pump starts, so that if it restarts {n} times within a second it backs off?
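
As an aside, a quick way to confirm the exits-straight-away behaviour on a suspect container is a loop like the one below. This is only an illustrative shell sketch; CONTAINER_ID is a placeholder.

# If the follow stream keeps closing within a second or two, that container
# is the one driving the pump stop/start loop.
for i in 1 2 3 4 5; do
  start=$(date +%s)
  docker logs --follow CONTAINER_ID > /dev/null 2>&1
  echo "attempt $i: follow stream closed after $(( $(date +%s) - start ))s"
done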

# docker version
Client:
 Version:      17.03.0-ce
 API version:  1.26
 Go version:   go1.7.4
 Git commit:   60ccb22
 Built:        Mon Mar 13 18:07:51 UTC 2017
 OS/Arch:      linux/arm

Server:
 Version:      17.03.0-ce
 API version:  1.26 (minimum version 1.12)
 Go version:   go1.7.4
 Git commit:   60ccb22
 Built:        Mon Mar 13 18:07:51 UTC 2017
 OS/Arch:      linux/arm
 Experimental: false
woutor commented 7 years ago

Same issue here. Reverted back to docker image v3.1 and logging and CPU usage is back to normal.

sebgl commented 7 years ago

Same issue, we also reverted to docker image v3.1.

We are running Docker version 17.04.0-ce, build 4845c56, and deploying our logspout container with the new swarm mode ('docker stack deploy'). We're not using logstash, but forward logs to a remote syslog directly.

Here is the content of our docker-compose.yml file:

version: '3'

services:
  logspout:
    image: gliderlabs/logspout
    environment:
      SYSLOG_STRUCTURED_DATA: "- $SYSLOG_CONF"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    ports:
      - 10888:80
    command: syslog+tls://$LOGS_ENDPOINT
    deploy:
      mode: global
gaieges commented 7 years ago

I have the same problem, using the most recent master image (ef55cfcceda7) and version v3.3-dev on both docker 17.04.0-ce and 1.13.1.

It's fairly difficult to reproduce, and seems to occur only after a period of time - I'm not sure whether it's a result of complex log messages (or potentially unexpected charsets), or just the volume of messages shipped off.

During the time that it's at high CPU usage, I have also seen a number of weird messages like the ones mattdodge described.

@woutor @sebgl after you reverted back to version 3.1 - did you see the problem pop up again?

woutor commented 7 years ago

@gaieges: Nope, with 3.1 the CPU usage remains normal.

gdurr commented 7 years ago

Same issue here with Docker 17.05.0-ce and logspout 3.3. We can't go back to 3.1 because we need the label filtering feature.

avoinea commented 7 years ago

Same issue here. Reverted back to docker image v3.1 and logging and CPU usage is back to normal.

michaelshobbs commented 7 years ago

I'm using v3.3 in production with papertrail. Is there anybody experiencing this issue that is NOT using logstash? I'm thinking the problem might be in that module...

gaieges commented 7 years ago

@michaelshobbs I don't use the logstash module, I just use the tcp+syslog output.

michaelshobbs commented 7 years ago

Interesting....anybody have any thoughts on this one?

SqiSch commented 7 years ago

Same issue here, but I cannot confirm the high CPU load. Only a lot of log lines in /var/log/syslog, and the pump.pumpLogs() messages when debugging is enabled.

I am also using the logstash adapter.

.....
165+02:00" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"
Jun 16 11:55:18 xyz docker[11404]: time="2017-06-16T11:55:18.307515574+02:00" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"
Jun 16 11:55:18 xyz docker[11404]: time="2017-06-16T11:55:18.432245150+02:00" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"
....
mrbobbytables commented 7 years ago

We encountered the same issue today, randomly, on one of our hosts. Unfortunately I do not have any more info than what others have reported. Docker 17.03.1-ce, RancherOS v1.0.2, logspout configured to send to syslog+tcp.

sfrique commented 7 years ago

Hello folks,

I was having the same problem:

Jun 28 14:07:20 xxx docker[7686]: time="2017-06-28T14:07:20.892703572Z" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"
Jun 28 14:07:20 xxx docker[7686]: time="2017-06-28T14:07:20.894972987Z" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"

It turned out to be a corrupt json-file.log. It happened when a disk filled up, so the json-file.log was truncated mid-write. I had something like this in my json-file.log after the disk space issue was fixed and Docker was running again:

{"log":"    2017/06/28 07:46:00 [INFO] consul: adding WAN ser{"log":"    2017/06/28 07:46:00 [INFO] consul: adding WAN server xxx.dc1 (Addr: 10.240.0.21:8300) (DC: dc1)\n","stream":"stdout","time":"2017-06-28T07:46:00.5738544Z"}

So it isn't valid JSON and it breaks docker logs; somehow it makes logspout enter a loop and makes Docker print those lines.

If you run docker logs $container on the container with that corrupt json-file.log, you will get one line of the error.

So look for corrupted/badly formatted json-file.log files.
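
A minimal sketch of that check, assuming jq is installed and you have a root shell on the host (the paths are the json-file driver defaults); it prints the file and line number of every log line that does not parse as JSON:

for f in /var/lib/docker/containers/*/*-json.log; do
  n=0
  while IFS= read -r line; do
    n=$((n + 1))
    # A line that fails to parse here is a line that will break `docker logs`.
    printf '%s\n' "$line" | jq . > /dev/null 2>&1 || echo "$f: line $n is not valid JSON"
  done < "$f"
done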

I just removed that line and everything was working normally again.

Hope that helps.

gaieges commented 7 years ago

On @sfrique 's note - I wonder if using journald log driver (if possible) instead of json-file would help this?

alhafoudh commented 7 years ago

@gaieges I tried with journald, same problem

pauln commented 7 years ago

We're seeing the same thing - 100% CPU usage when running logspout (pumping to logstash). Enabling logspout's debug logs showed rapid repetition of log entries such as:

2017/07/20 23:57:20 pump.pumpLogs(): fb6033455fd0 stopped
2017/07/20 23:57:20 pump.pumpLogs(): fb6033455fd0 started

Initially, we were seeing this on most of our hosts, with two container IDs showing up on each. The containers in question were Rancher's metadata and DNS containers. We upgraded Rancher from 1.6.2 to 1.6.5 and upgraded all the system services, and logspout now seems to run happily on most of our hosts. However, we're still getting the stopped/started log spam (and corresponding CPU load) on our DB hosts - with only a single container ID flapping: mongodb.

garceri commented 7 years ago

The scenario described by @pauln is exactly what we are experiencing (we haven't had a chance to upgrade to 1.6.5 yet). Out of curiosity, @pauln, what Docker version are you running on your hosts?

pauln commented 7 years ago

@garceri, we're currently running Docker 17.03 CE, though we're looking at upgrading to 17.06 CE in the near future.

robcza commented 7 years ago

My thoughts on this issue:

2017/07/26 07:09:48 pump.pumpLogs(): 21c90bb7ac5c stopped
2017/07/26 07:09:48 pump.pumpLogs(): 21c90bb7ac5c started
2017/07/26 07:09:48 pump.pumpLogs(): 21c90bb7ac5c stopped
2017/07/26 07:09:48 pump.pumpLogs(): 21c90bb7ac5c started
michaelshobbs commented 7 years ago

@robcza thanks for the additional details. Can you provide a full debug log, from start to stop, of a logspout container that has entered this state? I'm looking for any other clues here.

To your question about difference between versions, we tag each release and thus you can browse the history by comparing tags. Additionally, we have a CHANGELOG.md with convenient links to compare each version.

ko-christ commented 7 years ago

Same for me on docker 17.03.1-ce using this image bekt/logspout-logstash:latest.

I have managed to get a full debug log since logspout container creation. There is a constant start / stop stream for c21bde2e91e7.
I had this issue on other machines as well with various services.

# docker logs elk-logspout 
2017/07/28 10:50:57 setting allowTTY to: false
# logspout v3.3-dev-custom by gliderlabs
# adapters: raw udp tcp logstash
# options : debug:True persist:/mnt/routes
# jobs    : http[]:80 pump routes
# routes  :
#   ADAPTER ADDRESS     CONTAINERS  SOURCES OPTIONS
#   logstash+tcpprod00:1111             map[]
2017/07/28 10:50:57 pump.Run(): using inactivity timeout:  0s
2017/07/28 10:50:57 pump.pumpLogs(): 4dd58c94144a ignored: environ ignore
2017/07/28 10:50:57 pump.pumpLogs(): 741c807703d4 started
2017/07/28 10:50:57 pump.pumpLogs(): 03c007defa61 started
2017/07/28 10:50:57 pump.pumpLogs(): 0398b9b5542e started
2017/07/28 10:50:57 pump.pumpLogs(): 0b1b04fd17a9 started
2017/07/28 10:50:57 pump.pumpLogs(): 18b4f33a6b1d started
2017/07/28 10:50:57 pump.pumpLogs(): 18cd5d3b88cb started
2017/07/28 10:50:57 pump.pumpLogs(): dfebc3118ce3 started
2017/07/28 10:50:57 pump.pumpLogs(): e12708a548c4 started
2017/07/28 10:50:57 pump.pumpLogs(): 78423dfc9805 started
2017/07/28 10:50:57 pump.pumpLogs(): 872c498a0c2f started
2017/07/28 10:50:57 pump.pumpLogs(): 2f78048ee356 started
2017/07/28 10:50:57 pump.pumpLogs(): bcedcc0cd5d2 started
2017/07/28 10:50:57 pump.pumpLogs(): 0b8e0146d61c started
2017/07/28 10:50:57 pump.pumpLogs(): a1c7966d4ad3 started
2017/07/28 10:50:57 pump.pumpLogs(): d49788286a32 started
2017/07/28 10:50:57 pump.pumpLogs(): 41c5e05d0019 started
2017/07/28 10:50:57 pump.pumpLogs(): ce7030af1846 started
2017/07/28 10:50:57 pump.pumpLogs(): 78cff52b2cc8 started
2017/07/28 10:50:57 pump.pumpLogs(): c5b69f63a3e8 started
2017/07/28 10:50:57 pump.pumpLogs(): 52ab8eb7e542 started
2017/07/28 10:50:57 pump.pumpLogs(): f26b009798f4 started
2017/07/28 10:50:57 pump.pumpLogs(): 7b1f7d90859a started
2017/07/28 10:50:57 pump.pumpLogs(): a5b864e479b7 started
2017/07/28 10:50:57 pump.pumpLogs(): 08581c7c3829 started
2017/07/28 10:50:57 pump.pumpLogs(): 9e4497a0dcaf started
2017/07/28 10:50:57 pump.pumpLogs(): 227ca66d7f83 started
2017/07/28 10:50:57 pump.pumpLogs(): 6f6b327b7b9e started
2017/07/28 10:50:57 pump.pumpLogs(): 846ba10f0ee7 started
2017/07/28 10:50:57 pump.pumpLogs(): e672bc2d0fe2 started
2017/07/28 10:50:57 pump.pumpLogs(): f1d48ee0211f started
2017/07/28 10:50:57 pump.pumpLogs(): f8fdca2aff57 started
2017/07/28 10:50:57 pump.pumpLogs(): 5856ed01b7ff started
2017/07/28 10:50:57 pump.pumpLogs(): 57e77510b9de started
2017/07/28 10:50:57 pump.pumpLogs(): a05333be7c0f started
2017/07/28 10:50:57 pump.pumpLogs(): dcab95f7ca2f started
2017/07/28 10:50:57 pump.pumpLogs(): f28130fa353a started
2017/07/28 10:50:57 pump.pumpLogs(): 464d4d0b24bd started
2017/07/28 10:50:57 pump.pumpLogs(): 139fccfc578f started
2017/07/28 10:50:57 pump.pumpLogs(): 341f311f43c0 started
2017/07/28 10:50:57 pump.pumpLogs(): f08a73de23ab started
2017/07/28 10:50:57 pump.pumpLogs(): 7e88c606f51b started
2017/07/28 10:50:57 pump.pumpLogs(): c21bde2e91e7 started
2017/07/28 10:50:57 pump.pumpLogs(): 52fe6826666a started
2017/07/28 10:51:02 pump.pumpLogs(): c21bde2e91e7 stopped
2017/07/28 10:51:02 pump.pumpLogs(): c21bde2e91e7 started
2017/07/28 10:51:06 pump.pumpLogs(): c21bde2e91e7 stopped
2017/07/28 10:51:06 pump.pumpLogs(): c21bde2e91e7 started
2017/07/28 10:51:09 pump.pumpLogs(): c21bde2e91e7 stopped
2017/07/28 10:51:09 pump.pumpLogs(): c21bde2e91e7 started
2017/07/28 10:51:11 pump.pumpLogs(): c21bde2e91e7 stopped
2017/07/28 10:51:11 pump.pumpLogs(): c21bde2e91e7 started
2017/07/28 10:51:14 pump.pumpLogs(): c21bde2e91e7 stopped
2017/07/28 10:51:14 pump.pumpLogs(): c21bde2e91e7 started
michaelshobbs commented 7 years ago

Thanks @ko-christ for the extra detail! Next time this happens it would be great to see if there's anything odd about the output of the container with the "thrashing" log pump (i.e. docker logs c21bde2e91e7 in this particular case).

Same request to others that have noticed the stop/start thrash symptom.

shahar-davidson commented 7 years ago

Hi all,

We've just encountered this issue recently (logspout starts and CPU jumps to ~100%). Here are my insights after investigating it for a couple of hours.

In our environment, this was caused by malformed JSON output in the container logs (when using json-file logging). What apparently happened is that the machine ran out of space (we had a clear indication of that) and log lines could not be written properly, so we got JSON log lines that were cut in the middle. When that happens, docker will print out the following whenever someone tries to fetch the logs (even a simple docker logs command will reproduce this):

"Error streaming logs: invalid character 'l' after object key:value pair"

Apparently, docker will print this error and stop returning log lines, even if there are many more lines after the corrupt JSON line. On our machine we had about a dozen services logging constantly, and this JSON line corruption was found in most of them. It seems that docker encounters this and emits an error (https://github.com/moby/moby/blob/8af4db6f002ac907b6ef8610b237879dfcaa5b7a/daemon/logs.go#L109), yet logspout immediately starts requesting logs again, and docker again bombards the docker service logs with the above error message (tens per second).

Possible workarounds for such an issue (I didn't verify these options yet):

  1. Setting the BACKLOG env var and restarting services
  2. Stopping the service and moving/truncating the existing log files so that they would start off clean (see the sketch just below).
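
A rough sketch of option 2 for a single container, assuming its ID is already known and losing its existing log content is acceptable (CONTAINER_ID is a placeholder):

docker stop CONTAINER_ID
# Truncate the json-file log in place so it starts off clean.
truncate -s 0 /var/lib/docker/containers/CONTAINER_ID*/CONTAINER_ID*-json.log
docker start CONTAINER_ID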

So this actually seems to be a docker issue. The best way to deal with it, IMHO, is to have docker detect such cases, roll the log file, and put the corrupt log file aside (while emitting a meaningful warning about it).

One thing that puzzles me is why reverting to v3.1 "fixes" this issue (I didn't try reverting back to v3.1 yet, I'm using v3.3).

pythiancastaneda commented 7 years ago

Hello, I am having the same issue. Docker log is flooded with messages like this:

time="2017-08-30T22:30:31.922624624Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 
time="2017-08-30T22:30:31.977376555Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 
time="2017-08-30T22:30:32.172320949Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 
time="2017-08-30T22:30:32.570219033Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 
time="2017-08-30T22:30:32.662642497Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 
time="2017-08-30T22:30:32.860254549Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 

And the docker logs for bekt/logspout-logstash:master are:

2017/08/30 22:18:33 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:33 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:34 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:34 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:34 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:34 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:34 pump.Run() event: 42ad09d40141 network:connect
2017/08/30 22:18:34 pump.Run() event: orient-backu volume:mount
2017/08/30 22:18:34 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:35 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:35 pump.Run() event: orient-datab volume:mount
2017/08/30 22:18:35 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:35 pump.Run() event: orient-log volume:mount
2017/08/30 22:18:35 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:35 pump.Run() event: rancher-cni volume:mount
2017/08/30 22:18:35 pump.Run() event: efe51a4c5ed0 volume:mount
2017/08/30 22:18:36 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:36 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:36 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:36 pump.Run() event: 58a80e107fc1 start
2017/08/30 22:18:36 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:36 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:36 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:36 pump.pumpLogs(): 58a80e107fc1 started
2017/08/30 22:18:36 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:36 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:36 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:36 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:37 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:37 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:37 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:37 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:37 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:37 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:37 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:37 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:38 pump.pumpLogs(): be68d546188d stopped
2017/08/30 22:18:38 pump.pumpLogs(): be68d546188d started
2017/08/30 22:18:38 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:38 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:38 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:38 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:38 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:38 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:39 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:39 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:39 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:39 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:39 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:39 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:39 pump.pumpLogs(): f6fddb18f835 started
2017/08/30 22:18:39 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:40 pump.pumpLogs(): 9ab5e8cc0c36 stopped
2017/08/30 22:18:40 pump.pumpLogs(): 9972da45bf76 stopped
2017/08/30 22:18:40 pump.pumpLogs(): 9972da45bf76 started
2017/08/30 22:18:40 pump.pumpLogs(): 9ab5e8cc0c36 started
2017/08/30 22:18:40 pump.pumpLogs(): f6fddb18f835 stopped
2017/08/30 22:18:40 pump.pumpLogs(): f6fddb18f835 started

I have trimmed some of the docker json logs and restarted the containers, but it is not helping; once the logspout container starts, it makes dockerd eat almost all the CPU.

Thanks in advance!

michaelshobbs commented 7 years ago

@mattatcha thoughts on this given new info?

sfrique commented 7 years ago

@pythiancastaneda To figure out which container is giving you trouble, in one terminal run:

# tail -f /var/log/syslog | grep docker

and in a second terminal, for each container you have:

# docker logs <container>

That way you will see in the logs:

time="2017-08-30T22:30:32.860254549Z" level=error msg="Error streaming logs: invalid character '\\x00' looking for beginning of value" 

And then you can remove the log file or the offending line.

Hope that helps.

I recently had the same problem; I did the steps above and it fixed the issue.
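
The same check as a single loop, in case that is easier. This is illustrative only; it reads each container's full log, so it can be slow, and anything docker logs writes to stderr is a hint that the container's json-file log is broken:

for c in $(docker ps -aq); do
  # Discard the container's own output and keep only the client's stderr.
  errs=$(docker logs "$c" 2>&1 > /dev/null)
  [ -n "$errs" ] && echo "container $c: $errs"
done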

leriel commented 7 years ago

Thanks for the update @stevecalvert! In my case the issue was the same: corrupt container logs. For me, 3.2.2 was also logging an additional error. Posting it here just in case:

logspout_1  | 2017/09/15 17:42:43 pump.pumpLogs(): 0e1c957edf2d started
logspout_1  | 2017/09/15 17:42:43 pump.pumpLogs(): 0e1c957edf2d stopped with error: Unrecognized input header: 3
logspout_1  | 2017/09/15 17:42:43 pump.pumpLogs(): 0e1c957edf2d started
logspout_1  | 2017/09/15 17:42:43 pump.pumpLogs(): 0e1c957edf2d stopped with error: Unrecognized input header: 3

The only way for me to fix that issue, like for others, was to remove and recreate the container.

konvergence commented 7 years ago

@leriel I get the same issue with logspout 3.2.2 and gelf plugins.

But I'm not sure I understand the workaround...

Which container do you recreate? Logspout, or the containers with corrupted json-file logs?

Is there any way to correct the logspout code?

konvergence commented 7 years ago

Don't know if there is any relationship, but I get this message if I try to tail one of the *-json.log files reported by logspout in debug mode:

tail: inotify cannot be used, reverting to polling: Too many open files

This seems to be a default limit on Ubuntu? Could that explain some of the corrupted json-file logs? See https://nefaria.com/2014/08/tail-inotify-resources-exhausted/
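
If it is the inotify limits that tail is running into, they can be inspected and raised with sysctl. This is purely illustrative and the values below are examples, not recommendations:

# Show the current limits, then raise them (add to /etc/sysctl.conf to persist).
sysctl fs.inotify.max_user_instances fs.inotify.max_user_watches
sudo sysctl -w fs.inotify.max_user_instances=512
sudo sysctl -w fs.inotify.max_user_watches=524288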

konvergence commented 7 years ago

Another piece of info: when I purge the json-file logs of every container that logspout reports as "stopped" and then restart logspout, CPU usage on dockerd stays flat.

Here is my code:

# Find every container whose pump logspout reports as "stopped", then truncate
# its json-file log so it starts off clean.
for containerid in $(docker logs $(docker ps -a -q --filter "label=io.rancher.stack_service.name=graylog2/logspout") 2>&1 | grep stopped | cut -d " " -f 4 | sort -u); do
     ls -l /var/lib/docker/containers/${containerid}*/${containerid}*-json.log
     # Truncate the log (bash expands the glob in the redirection as long as it matches a single file).
     > /var/lib/docker/containers/${containerid}*/${containerid}*-json.log
done
konvergence commented 7 years ago

One more piece of information: I use Docker engine 17.03.2-ce. Last week we had a power outage on the nodes. In my case, I think that was the cause of the corrupted json-file logs (Error streaming logs: invalid character 'l' after object key:value pair). Do you know whether this issue can be solved by a newer docker-ce release?

konvergence commented 7 years ago

Here is some news, though I don't know if there is a relationship.

I get all container IDs reported as stopped by logspout:

docker logs $(docker ps -a -q --filter "label=io.rancher.stack_service.name=graylog/logspout") 2>&1 | grep stopped | cut -d " " -f 4 | sort -u

c062de60016b
c4d6b098968c
f8f18bba6203

I indeed found '\x00' in each corrupted file, and the files are big:


for containerid in $(docker ps -a -q); do
    # Print any json-file log lines containing NUL bytes, then the container ID
    # and the size of the offending file.
    grep -Pa '\x00' /var/lib/docker/containers/${containerid}*/${containerid}*-json.log
    if [ $? -eq 0 ]; then
        echo ${containerid}
        ls -lh /var/lib/docker/containers/${containerid}*/${containerid}*-json.log
    fi
done

{"log":"Creating volume folders...\n","stream":"stdout","time":"2017-09-16T15:50:34.379859731Z"}
c4d6b098968c
-rw-r----- 1 root root 564M Sep 21 12:11 /var/lib/docker/containers/c4d6b098968c475a82ed7ff34cc4e50344f097155f10d0d9d07528437b4417e7/c4d6b098968c475a82ed7ff34cc4e50344f097155f10d0d9d07528437b4417e7-json.log
{"log":"Creating volume folders...\n","stream":"stdout","time":"2017-09-16T15:50:33.076131673Z"}
c062de60016b
-rw-r----- 1 root root 478M Sep 21 12:11 /var/lib/docker/containers/c062de60016bd13faeebceb0301c121b35a0dfbc620048584a3d06c3d0449761/c062de60016bd13faeebceb0301c121b35a0dfbc620048584a3d06c3d0449761-json.log
{"log":"Creating volume folders...\n","stream":"stdout","time":"2017-09-16T15:50:47.996912867Z"}
f8f18bba6203
-rw-r----- 1 root root 806M Sep 21 12:11 /var/lib/docker/containers/f8f18bba62032e188dd75f76b9cc061936e804995289d7406b89d43e02d7cc36/f8f18bba62032e188dd75f76b9cc061936e804995289d7406b89d43e02d7cc36-json.log
pierreozoux commented 7 years ago

We experienced the same issue yesterday. We don't use logspout, but this is the closest issue we found, with this kind of error log:

Jun 28 14:07:20 xxx docker[7686]: time="2017-06-28T14:07:20.892703572Z" level=error msg="Error streaming logs: invalid character 'l' after object key:value pair"

And the invalid character can be anything in our case.

We are still investigating, but would like to see if we can find the issue together. With our current information, it looks like one of our containers is outputting some kind of JSON under specific conditions, and dockerd doesn't like it at all. We didn't find any related moby issue though :/

We'll let you know of our findings.

Our setup is different though: docker 1.12 + k8s 1.7.5, with the scalyr agent ingesting logs from dockerd.

mjablecnik commented 7 years ago

Hello, I have:

$ docker version
Client:
 Version:      17.09.0-ce
 API version:  1.32
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:42:09 2017
 OS/Arch:      linux/amd64

Server:
 Version:      17.09.0-ce
 API version:  1.32 (minimum version 1.12)
 Go version:   go1.8.3
 Git commit:   afdb6d4
 Built:        Tue Sep 26 22:40:48 2017
 OS/Arch:      linux/amd64
 Experimental: false

System: Debian 9.1

logspout setup in Rancher: screenshot

and in syslog dockerd errors:

Oct 24 10:03:39 web4-vultr dockerd[31377]: time="2017-10-24T10:03:39.819294736+01:00" level=error msg="Error streaming logs: invalid character '\x00' looking for beginning of value" container=db62afc089ec method="(*Daemon).ContainerLogs" module=daemon
Oct 24 10:03:41 web4-vultr dockerd[31377]: time="2017-10-24T10:03:41.950449384+01:00" level=error msg="Error streaming logs: invalid character '\x00' looking for beginning of value" container=db62afc089ec method="(*Daemon).ContainerLogs" module=daemon
Oct 24 10:03:42 web4-vultr dockerd[31377]: time="2017-10-24T10:03:42.717452402+01:00" level=error msg="Error streaming logs: invalid character '\x00' looking for beginning of value" container=6488521fdc8f method="(*Daemon).ContainerLogs" module=daemon

and CPU usage is very high

I don't know why, but downgrading logspout to version v3.1 fixes it.

rhuddleston commented 7 years ago

We're on AWS and see this issue regularly. We have ~100 EC2 instances, and when AWS suddenly retires or shuts down an instance, the server restarts but some of the Docker json log files end up corrupt.

I can see where the corruption is and what line by doing the following:

# Print each file name, then let jq report the line and column of any parse error in it.
for FILE in /var/lib/docker/containers/*/*-json.log; do echo "$FILE"; jq . "$FILE" > /dev/null; done

I used to just shut down the container that corresponded to the corrupt log(s), manually remove the corrupted (half-written) JSON line, then restart the container, and all was well. Though with newer versions of Docker that no longer fixes the issue; I have to remove all the containers and re-deploy them to fix it now. I think this is really a Docker issue: if Docker detects a half-written file, it needs to handle it correctly. Basically 100% of the time AWS restarts a node unexpectedly, this happens for us. Typically it's faster to completely replace the node than to spend a bunch of time trying to fix this by hand.
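
For reference, the manual fix described above looks roughly like this. It is only a sketch: stop the container first and keep a backup, where LOGFILE is the json log path and N the line number jq complained about (both placeholders):

cp "$LOGFILE" "$LOGFILE.bak"
# Delete the half-written line N in place, then start the container again.
sed -i "${N}d" "$LOGFILE"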

ghost commented 6 years ago

We have the same issue, but found the offending log files using the grep below.

grep -P '\x00' /var/lib/docker/containers/**/*json.log

After manually removing the NUL ("\x00") characters from the log, the issue went away. It seems to be a Docker issue (docker/for-linux#140) related to unexpected host/daemon restarts.
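
A sketch of that cleanup for a single file, with the container stopped and a backup kept (LOGFILE is a placeholder):

cp "$LOGFILE" "$LOGFILE.bak"
# Strip the NUL bytes that the unexpected restart left behind.
tr -d '\000' < "$LOGFILE.bak" > "$LOGFILE"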

ppiccolo commented 6 years ago

Me too, same issue... any suggestions?

shahar-davidson commented 6 years ago

We resolved to use #321 as a workaround for the time being (note that BACKLOG should not be set, or should be set to true).

mahnunchik commented 6 years ago

I've run into 100% CPU usage and this error:

Mar 23 16:48:52 burger dockerd[569]: time="2018-03-23T16:48:52.971499270+02:00" level=error msg="Error streaming logs: EOF" container=b1dc4286ba72 method="(*Daemon).ContainerLogs" module=daemon

on the cadvisor container.

My config:

version: '2.3'

services:

  logspout:
    image: gliderlabs/logspout:v3.2.4
    container_name: logspout
    command: syslog+tcp://logs-01.loggly.com:514
    restart: always
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock
    environment:
      TAIL: 10
arunesh90 commented 6 years ago

Any update on this? Just had it happen on a server which became pretty unresponsive because of it

kamaev commented 6 years ago

+upvoting rollback to 3.1

I've just force pushed ca675f3030a87f41536415114800470c0c74cd0d and manually added:

- glide.lock
- glide.yaml
- Dockerfile
- build.sh
- and also /adapters/multiline/multiline.go to use multiline

from the latest master to build an image. The new container works well.

biguphpc commented 3 years ago

I just wanted to confirm this issue with the latest CE version of Docker (19.03.12) and logspout (syslog+tls). We've had production servers crashing because of this. Dockerd was using 1600% CPU and then became unresponsive on a 32 CPU server. I noticed a very high number of goroutines showing in docker info with debug mode enabled. Stopping the logspout container and waiting a few hours resolved the issue, and the dockerd CPU was back to normal (along with the number of goroutines). I tried workarounds like TAIL=0 without success. Reverting to v3.1 solves the issue for now.
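
One way to watch that goroutine growth, assuming the daemon is running in debug mode so that docker info exposes the counter (illustrative only):

# The Goroutines field only appears when the daemon has debug enabled.
watch -n 5 "docker info 2>/dev/null | grep -i goroutines"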

We had about 30 containers running on that machine; logging was heavy but not excessive. It's easily reproducible with the latest logspout version: starting logspout with a lot of containers logging shows dockerd using high CPU almost immediately, and it increases over time.

Maybe newer versions query the docker daemon differently than v3.1 and flood it with requests, leading to a high number of goroutines and high CPU?