sematext / sematext-agent-docker

Sematext Docker Agent - host + container metrics, logs & event collector
https://sematext.com/docker
Apache License 2.0

On logsene service 503, container stopped with exit code 2 #13

Closed rocketraman closed 8 years ago

rocketraman commented 8 years ago

Saw this from the :dev container:

sematext-agent-docker-1: Container stopped with exit code 2
2016-08-04T14:19:03.835216077Z     at emitNone (events.js:91:20) 

The container automatically restarted. Is there a way to see the logs of the stopped container so I can provide the full stack trace?

Actually we have six of these over six nodes, and all of them are throwing this error within a few minutes of each other.

megastef commented 8 years ago

The "latest" image (1.30.15) should now support your JSON logs; see https://github.com/sematext/sematext-agent-docker/issues/12. Does it work? Feel free to post the stack trace from the "dev" image.

rocketraman commented 8 years ago

I don't know how to get the stack trace. By the time I see the error above as a notification from Docker Cloud, the container has already restarted, and the logs of the exited container are no longer accessible via Docker Cloud. I do see a lot of these in the logs of the running containers though -- perhaps it is related:

sematext-agent-docker-5 | 2016-08-04T14:47:31.361141497Z 2016-08-04T14:47:30.951Z - error: Error in logsene-js:  source=logsene, message=Logsene status code:503, httpStatus=503, httpBody=, url=https://logsene-receiver.sematext.com/0b792730-a746-44f6-8d8b-efcb443a84ae/_bulk

otisg commented 8 years ago

By the time the error above happens, the container has already restarted, and the logs of the exited container are no longer accessible via Docker Cloud.

Shouldn't you be able to see the logs in Logsene?

rocketraman commented 8 years ago

Shouldn't you be able to see the logs in Logsene?

Apparently not. The closest I get is this event from dockercloud/events:latest:

sending event: {"status":"die","id":"7681056fd9938b1cec6fec4a2ffb2b7baf3eb6cb52395428f76f43ee996a930a","from":"sematext/sematext-agent-docker:dev","time":1470321899834976922,"exitcode":"2"}

I guess the logsene container does not handle its own errors by sending them to logsene -- it should probably try to buffer them somewhere so that it can send them on a restart.
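The `time` field in the event above appears to be nanoseconds since the Unix epoch, which is awkward to compare with the ISO timestamps in the agent logs. A minimal sketch of the conversion, assuming bash and GNU `date` (the variable names are illustrative only):

```shell
# Event timestamp copied from the dockercloud/events message (nanoseconds).
event_ns=1470321899834976922

# Integer-divide by 10^9 to get whole seconds since the epoch.
event_s=$((event_ns / 1000000000))

# GNU date syntax; on BSD/macOS, `date -u -r "$event_s"` is the equivalent.
event_utc=$(date -u -d "@${event_s}" +%Y-%m-%dT%H:%M:%SZ)

echo "$event_utc"   # prints 2016-08-04T14:44:59Z
```

With the time in UTC, the "die" event can be lined up against nearby log lines from the agent containers.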

megastef commented 8 years ago

@otisg by default SDA logs are not logged to Logsene anymore.

otisg commented 8 years ago

Ah, sorry, I didn't realize this was about the SDA container. But one can enable SDA log shipping to Logsene, so @rocketraman could use that and capture those logs, too.

rocketraman commented 8 years ago

@otisg @megastef If configured to send logs to logsene, will the SDA container capture the logs of error conditions that prevent logging to logsene, and send them once it comes back up?

Also: the last time the SDA container exited with code 2 corresponds with the last time I see this error in the logs:

sematext-agent-docker-5 | 2016-08-04T14:47:31.361141497Z 2016-08-04T14:47:30.951Z - error: Error in logsene-js: source=logsene, message=Logsene status code:503, httpStatus=503, httpBody=, url=https://logsene-receiver.sematext.com/0b792730-a746-44f6-8d8b-efcb443a84ae/_bulk

megastef commented 8 years ago

The reported error is probably not the reason for the crash. Only "uncaught errors" with the message to contact support are critical and lead to an exit. HTTP 503 is normally not critical: the logs are then buffered. Maybe they are buffered for too long and we run into a memory issue? Do you use SPM, and do you know the SDA memory usage? @rocketraman Did you try the released version 1.30.15 / latest?

megastef commented 8 years ago

@rocketraman you can get stats about logging (shipped, failed, retransmitted) using -e ENABLE_LOGSENE_STATS=true

If you start SDA without "-d", it should run in the foreground in the terminal, and you can copy error messages from the console. You might want to use -e SPM_LOG_LEVEL=debug (so we get more surrounding messages).
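Put together, a foreground debug run might look roughly like the sketch below. Only ENABLE_LOGSENE_STATS, SPM_LOG_LEVEL, and the absence of "-d" come from this thread. The SPM_TOKEN and LOGSENE_TOKEN variables, the container name, and the Docker socket mount are assumptions based on the usual way the agent is started, with placeholder values that need to be replaced by your own.

```shell
# Assumed placeholders: export SPM_TOKEN and LOGSENE_TOKEN with your own app tokens first.
# No -d flag, so the agent stays attached to the terminal and errors can be copied from it.
docker run --name sematext-agent-docker \
  -e SPM_TOKEN="$SPM_TOKEN" \
  -e LOGSENE_TOKEN="$LOGSENE_TOKEN" \
  -e ENABLE_LOGSENE_STATS=true \
  -e SPM_LOG_LEVEL=debug \
  -v /var/run/docker.sock:/var/run/docker.sock \
  sematext/sematext-agent-docker:dev
```

Appending `2>&1 | tee sda-debug.log` (a hypothetical file name) would keep a copy of the console output for sharing.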

megastef commented 8 years ago

Feel free to share the full log via e-mail (stefan.thies@sematext.com) if you'd rather not post it here.

rocketraman commented 8 years ago

Do you use SPM and know SDA memory usage?

SPM reports SDA container memory is relatively consistent, way below limit, and no memory allocation failures.

@rocketraman Did you try released version 1.30.15 / latest?

I haven't yet. I was seeing if I could help debug this issue first.

The containers have not died since the 503s stopped occurring, so I'm not sure adding debug logging now will tell us anything useful, unless you can simulate the 503s for my account on the server side and we can see if the error happens again.

megastef commented 8 years ago

I was able to reproduce this error. @rocketraman thank you for reporting!

rocketraman commented 8 years ago

Awesome, is this in dev already? Can you post here when this is in latest?

megastef commented 8 years ago

Testing v1.30.16 right now.

Re: storing failed logs across restarts. If you want to make sure that logs are shipped after a restart, use:

-v /tmp:/logsene-log-buffer

/tmp on the host then stores the logs in case of a network or service outage. Docker Agent deletes these files after successful transmission. Taken from https://github.com/sematext/sematext-agent-docker#configuration-parameters
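In context, the volume flag would sit alongside the other options in a detached run, roughly as sketched here. The token variables, container name, and socket mount are again assumptions with placeholder values; only the /tmp:/logsene-log-buffer mapping is taken from the comment above.

```shell
# Assumed placeholders: export SPM_TOKEN and LOGSENE_TOKEN with your own app tokens first.
# The second -v maps the host's /tmp onto the agent's buffer directory,
# so unsent logs are kept on the host if the agent container restarts.
docker run -d --name sematext-agent-docker \
  -e SPM_TOKEN="$SPM_TOKEN" \
  -e LOGSENE_TOKEN="$LOGSENE_TOKEN" \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /tmp:/logsene-log-buffer \
  sematext/sematext-agent-docker
```

Some hosts clear /tmp on reboot, so a dedicated host directory may be a safer choice if buffered logs should also survive a host restart.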

megastef commented 8 years ago

@rocketraman the latest build for 1.30.16 is available: https://hub.docker.com/r/sematext/sematext-agent-docker/builds/. Again, thank you for reporting this issue. We use a service returning 503 for tests :)