moby / swarmkit

A toolkit for orchestrating distributed systems at any scale. It includes primitives for node discovery, raft-based consensus, task scheduling and more.

Document Log Drivers #248

Open aluzzardi opened 8 years ago

aluzzardi commented 8 years ago

Docker/swarm currently just proxies logs calls back to the engine.

This is problematic since swarm is definitely not the best place to do log management.

Instead, we should delegate that to log drivers. Perhaps we simply need documentation around that, or perhaps we should provide a default method.

aluzzardi commented 8 years ago

/cc @docker/fiesta-cucaracha-maintainers @ehazlett

ehazlett commented 8 years ago

This works for me. I had an idea for a centralized log aggregation feature using something like this anyway. I think this would be fine and would actually make things better.

stevvooe commented 8 years ago

Expanding on the proposal in https://github.com/docker/swarm-v2/issues/251#issuecomment-206527209, we should use ssh to support logs and any other daemon-homed behavior. From that proposal, we would extend our command set with the following:

```
swarmctl ssh <task id/name> attach
swarmctl ssh <task id/name> logs
swarmctl ssh <task id/name> inspect
```

The ssh command becomes a way to route to, and interact with, a specific daemon instance.

aluzzardi commented 8 years ago

The point being that we should probably stay away from log management in a cluster orchestrator.

Instead, we should facilitate the use of log drivers (which the engine already has) and let the operator use their favorite log management tool that supports indexing, searching, and so on (e.g. fluentd + Elasticsearch).

The goal of this issue is not to define a way to manage logs but rather to come up with a solid story: we should provide recipes and make sure they are well supported by Swarm.

For instance, taking the fluentd/Elasticsearch example: we could come up with a .yml file that defines a ServiceJob that brings up Elasticsearch and a GlobalJob that spawns fluentd containers on every node, configured to send the logs to Elasticsearch (this depends on service discovery). Then we explain how to set the corresponding log driver in further .yml specs (or globally in swarm settings?) so that containers actually use the underlying fluentd we just set up.
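A minimal sketch of that recipe, expressed as docker service commands rather than .yml specs (service and image names are illustrative, and a stock fluentd image would still need a config that forwards to Elasticsearch):

```
# Bring up Elasticsearch as a replicated service.
docker service create --name elasticsearch -p 9200:9200 elasticsearch

# Run fluentd on every node (the GlobalJob part), listening on the
# standard forward port; its config must point at Elasticsearch.
docker service create --name fluentd --mode global -p 24224:24224 fluent/fluentd

# Application services opt into the fluentd log driver so the engine
# ships their output to the local fluentd.
docker service create --name web \
  --log-driver fluentd \
  --log-opt fluentd-address=localhost:24224 \
  nginx
```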

stevvooe commented 8 years ago

@aluzzardi The proposal above would just route all of these commands directly to the engine.

aluzzardi commented 8 years ago

Similar to metrics, this should be a well-supported recipe.

/cc @sfsmithcha @mgoelzer

bfirsh commented 8 years ago

The first thing a user will think when they start a service is "how do I view the logs?" I ran into this immediately after starting my first service. It's ingrained from how effortless it is to get output from Docker containers (docker run ubuntu echo hello world) and similar tools (heroku logs, etc.).

I like the idea of deferring to log drivers. If log drivers let us read logs back, we could then build a nice end-to-end CLI (docker service foo logs or whatever).

My feeling is that there should at least be a default way of reading logs from containers, but perhaps we can leave this up to the editions to configure. It could even be something as simple as a syslog server running in a container on the managers.
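As a hedged sketch, the end-to-end flow being imagined might look like this (hypothetical: no such logs subcommand exists today, and the exact shape of the command is unsettled):

```
# Start a service, then read its output back through the cluster.
docker service create --name hello alpine ping docker.com
docker service logs hello   # hypothetical: would stream output from all tasks
```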

ghost commented 8 years ago

We've had this idea in the past of "recipes" for ops. Maybe we should have a recipe around setting up a swarm-wide syslog server and sending all your docker logs to it? You actually could run syslogd on the swarm, although there's a chicken-and-egg problem there.
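A rough sketch of such a recipe, assuming a placeholder syslogd image and leaving the bootstrap problem aside:

```
# Run a syslog server on the swarm (placeholder image; its own early
# logs are the chicken-and-egg problem mentioned above).
docker service create --name syslog -p 514:514/udp <some-syslogd-image>

# Point other services at it via the engine's built-in syslog driver.
docker service create --name web \
  --log-driver syslog \
  --log-opt syslog-address=udp://<syslog-host>:514 \
  nginx
```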

aluzzardi commented 8 years ago

We do support log drivers in 1.12 (/cc @stevvooe @mgoelzer)

However:

- there is still no built-in way to read those logs back through Swarm, and
- the log driver setup is not yet documented.

For the latter, @stevvooe, could you work with @sfsmithcha to come up with something? A couple of paragraphs with an example would go a long way.
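For instance, the documentation example could be as small as this sketch (service name and json-file options are illustrative):

```
# Since 1.12, docker service create accepts the engine's log driver flags:
docker service create --name web \
  --log-driver json-file \
  --log-opt max-size=10m \
  --log-opt max-file=3 \
  nginx
```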