davidonlaptop opened 9 years ago
For logging, can't we just use another "volume" instruction?
@flangelier: that would work. But I think it is worth looking into Docker logging best practices to understand the whole picture. It seems people have been using more elaborate solutions; why? See: https://www.google.ca/webhp?sourceid=chrome-instant&ion=1&espv=2&ie=UTF-8#q=docker%20logging
for reference, here's how to configure the log directory for Hadoop: http://stackoverflow.com/questions/22856958/hadoop-logs-value-of-environment-variable-hadoop-log-dir
All the container state
I've got news for you, buddy: a container's state is the entire filesystem it resides on. Any part of that may be of interest to someone for whatever reason... if your main command starts crashing for some reason, you may want to go in with /bin/bash and have a real look around to see what happened. Docker makes that impossible for no good reason.
Docker makes that impossible for no good reason.
To get shell :
$ docker ps    # find the ID of the interesting container
$ docker exec -ti <container_id> /bin/bash
Cheers
@Mikefaille,
This command would not work if your container crashes on start. If that happens you're stuck; there is no way to exec into the container. For this to work, Docker would need to allow overriding the ENTRYPOINT parameter on the docker start command.
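One workaround in that situation (not a docker start feature, just a sketch; <container_id> and the "crashed-debug" tag are placeholders) is to snapshot the crashed container's filesystem as an image, then start a fresh container from it with the entrypoint overridden:

```shell
# Snapshot the crashed container's filesystem (its writable layer) as an image.
docker commit <container_id> crashed-debug

# Start a new container from that snapshot, overriding the entrypoint
# so we land in a shell instead of re-running the crashing command.
docker run --rm -ti --entrypoint /bin/bash crashed-debug
# Inside this shell you can inspect the logs and state left by the crash.
```

The snapshot container is throwaway (--rm); the original crashed container is left untouched for further inspection.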
On Wed, Sep 2, 2015 at 8:55 AM, mikefaille notifications@github.com wrote:
Docker makes that impossible for no good reason. To get a shell: $ docker ps # find the interesting container $ docker exec -ti <container_id> /bin/bash
Cheers
— Reply to this email directly or view it on GitHub https://github.com/GELOG/adamcloud/issues/13#issuecomment-137064210.
You're right. I was "in the potatoes" (a French expression meaning I was mistaken).
I'm guessing the problem only occurs when the entrypoint is not interactive, right?
A workaround would be to always have an interactive entrypoint like /bin/bash to keep the container alive. Then, use this to enter a shell for a stopped container:
$ docker start -ai <container_id>
That's fine for debugging, but not in production.
I think the best solution is to just mount (using "-v") everything that can have a debugging value: data, log files, etc.
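For example, a run command along these lines (the image name, host paths, and HADOOP_LOG_DIR value are assumptions for illustration, not taken from this repo):

```shell
# Bind-mount data and log directories from the host, and point Hadoop's
# log directory at the mounted path (see the Stack Overflow link above).
docker run -d \
  -v /srv/hadoop/data:/hadoop/data \
  -v /srv/hadoop/logs:/var/log/hadoop \
  -e HADOOP_LOG_DIR=/var/log/hadoop \
  gelog/hadoop
# Data and logs now survive on the host even if the container is destroyed.
```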
The /var partition could in fact be mounted.
Also, you can look at the "docker logs" command (https://docs.docker.com/reference/commandline/logs/) that shows the output. Can be useful to understand why a container has crashed.
docker logs shows only output from PID 1's stdout. My 2 cents: it could be useful for sure.
Maybe starting the ENTRYPOINT manually, from
$ docker run -ti <image_name> /bin/bash
could reveal some info too.
Also, since all containers started from the same image are identical, in most cases we should expect the same behavior between the stopped/crashed container and a fresh container started this way, as I pointed out.
In production, I personally prefer logstash + Elasticsearch + Kibana to debug from logs.
The docker logs command is indeed very nice. Actually, do you know if Docker stores these logs ad vitam aeternam, or does it truncate them after a while?
With a big-data system the logs can be gigantic; we may want to find a way to force Docker to clean the stdout log after a while...
On Thu, Sep 3, 2015 at 4:47 PM, mikefaille notifications@github.com wrote:
In production, I personally prefer logstash + Elasticsearch + Kibana.
So, remote logging on distributed storage like ElasticSearch is the way to go.
Agreed, but will Docker, with its default behavior, fill my RAM with the log messages? Say I leave Hadoop running for a month... On Sep 3, 2015 6:21 PM, "mikefaille" notifications@github.com wrote:
So, remote logging on distributed storage like logstash is the way to go.
That's why collecting logs remotely solves the local space issue. Additionally, Logstash enables log filtering and formatting for standardization/correlation. However, logrotate (http://manpages.ubuntu.com/manpages/trusty/man8/logrotate.8.html) can be an answer without remote storage or volume usage.
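A logrotate sketch for the JSON log files Docker writes per container (the path below is the default for the json-file driver; adjust it if your Docker root directory differs, and note this requires root):

```shell
# Rotate each container's JSON log daily, keeping a week of history.
# copytruncate matters here: Docker keeps the log file descriptor open,
# so the file must be truncated in place rather than moved.
cat > /etc/logrotate.d/docker-containers <<'EOF'
/var/lib/docker/containers/*/*.log {
    daily
    rotate 7
    compress
    missingok
    copytruncate
}
EOF

# Dry run to verify the configuration without touching any files:
logrotate -d /etc/logrotate.d/docker-containers
```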
Actually, do you know if Docker stores these logs ad vitam aeternam, or does it truncate them after a while?
The answer is ad vitam aeternam by default, but good news! Since Docker 1.8, you can roll the logs over and even use a different logging driver: https://docs.docker.com/reference/logging/overview/
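Concretely, the json-file driver accepts rotation options per the logging-driver docs linked above (the command below is a sketch; image and command are arbitrary):

```shell
# Cap stdout logs at 3 files of 10 MB each (~30 MB max per container),
# using the rotation options the json-file driver gained in Docker 1.8.
docker run -d \
  --log-driver=json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  ubuntu /bin/sh -c 'while true; do date; sleep 1; done'
```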
Also, here's an interesting article about Docker logging with ELK: http://nathanleclaire.com/blog/2015/04/27/automating-docker-logging-elasticsearch-logstash-kibana-and-logspout/
Syslog, journald, GELF, fluentd... sounds like I'll have some reading to do.
Interesting article. I think it makes sense not to store the logs on disk and to stream them directly to Logstash!
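A sketch of the logspout approach described in the article: every container's stdout/stderr is forwarded to a remote Logstash, with nothing kept on local disk (the Logstash host and port are placeholders):

```shell
# logspout watches the Docker daemon through its socket and relays all
# container output to the given remote endpoint.
docker run -d --name logspout \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gliderlabs/logspout \
  syslog://logstash.example.com:5000
```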
I think the default for HDFS / HBase is to log only to STDOUT / STDERR, and the way we do it there is only a single process per container, so that should work. For MapReduce, I think we may have more than one process per container, for instance when an ApplicationMaster is started. However, the latest version of Hadoop allows YARN to launch Docker containers... but we'd have to check whether it can launch containers while being itself in a container.
On Thu, Sep 3, 2015 at 9:14 PM, Sébastien Bonami notifications@github.com wrote:
Actually, do you know if Docker store these logs ad vitam eternam or does it truncate it after a while?
However, the latest version of Hadoop can allow YARN to launch Docker container... but we'd have to check if it can launch containers while being itself in a container.
From what I heard in a Twitter web conference, they have been using a Docker-in-Docker ("Docker inception") stack for a year. I think it's OK :-) (sources are available if you ask).
For launching MapReduce tasks, Kubernetes on bare metal could be the best bet to bootstrap an organic Docker cluster: http://kubernetes.io/ I'm playing with it these days.
@sebastienbonami Thank you for pointing me to the logging drivers! I like this :+1:
Hi @leonty,
You may be able to access your container's data if you have put your data in a folder managed either by:
In the above cases, you may just create a new container and access the data.
If you didn't do the above, you may still be able to find your data under the path on your host where Docker stores its data. For the AUFS backend this is relatively easy, as it stores plain files.
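Two commands that work on *stopped* containers may also help here, so data can be pulled out without restarting anything (<container_id> is a placeholder):

```shell
# Copy a path out of the stopped container to the host:
docker cp <container_id>:/var/log ./container-var-log

# Or list / extract the container's entire filesystem as a tar stream:
docker export <container_id> | tar -tv | less
```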
On Fri, Oct 28, 2016 at 6:32 AM, leonty notifications@github.com wrote:
Accessing a stopped container's filesystem would be very useful. For me this would solve a problem of lock files that PGPool2, living in a container, leaves on the filesystem after crashing, thus making the container impossible to run again.
@davidonlaptop thank you. I believe that was a question more for the Docker project, so I reposted it there.
yes, but I thought I could help you :-)
Problem
Docker does not (currently, as of v1.5) allow a user to run a new command in a stopped container.
If the container is running, we can use:
- docker exec, to spawn a new command in the container from the host
- docker attach or nsenter, to "enter" the container and manually run a command
If the container is stopped, we can use:
- docker start, to restart the container with the same command used to launch it (via docker run, docker create, or as specified by the Dockerfile instruction CMD or ENTRYPOINT)
- docker restart, which works the same way as docker start except for a running container
The use case would be to start a shell to debug the log files and other state on the disk after the process has crashed. (Or to explore a data-only container.)
See https://github.com/jpetazzo/nsenter/issues/27, https://github.com/docker/docker/issues/1437, https://github.com/docker/docker/issues/1228
Explanation
Containers should be ephemeral
The container produced by the image your Dockerfile defines should be as ephemeral as possible. By "ephemeral," we mean that it can be stopped and destroyed and a new one built and put in place with an absolute minimum of set-up and configuration.
https://docs.docker.com/articles/dockerfile_best-practices/
Solution
All the container state (data, logging) that requires persistence MUST be stored outside the container's main layer.
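One way to honor this rule at build time is the Dockerfile VOLUME instruction, which keeps the declared directories out of the container's top (writable) layer. A sketch, with assumed base image and paths:

```dockerfile
# Hypothetical fragment: base image and directory paths are assumptions.
FROM ubuntu:14.04

# Declare the stateful directories as volumes so their contents live
# outside the container's main layer and can outlive the container.
VOLUME ["/hadoop/data", "/var/log/hadoop"]
```

At run time these volumes can then be bound to host paths with -v, as discussed above, so the data survives container removal.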