Open dani opened 5 months ago
> Maybe one way to mitigate this would be to have a 3rd mode for log collection which would be on-demand: as soon as the log streaming API is called, the corresponding logmon and docker_logger processes could be spawned (and killed after some timeout)
This is a clever idea, but the challenge is that the logmon/docker_logger processes are just attaching to stdout/stderr of the container. If nothing is reading from those file handles, then the application will not be able to write those logs (potentially causing the entire application to block, but at the very least buffering up a ton of logs). Likewise, we need to attach the logmon so that we can rotate logs safely without dropping any; otherwise a given task can use more than its allowed disk space.
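The blocking behavior described above is plain pipe semantics: the kernel pipe buffer holds only a bounded amount of data, and once it fills, a writer with no reader on the other end stalls. A minimal Python sketch (using a non-blocking writer purely so the script can observe the full buffer and exit, rather than hang):

```python
import os

# Create a pipe with no one draining the read end, mimicking a container
# whose stdout has no attached log collector.
r, w = os.pipe()
os.set_blocking(w, False)  # non-blocking so we see the full buffer instead of hanging

written = 0
try:
    while True:
        written += os.write(w, b"x" * 4096)
except BlockingIOError:
    # The kernel buffer is full; a normal (blocking) writer would stall here.
    pass

print(f"pipe buffer filled after {written} bytes")
os.close(r)
os.close(w)
```

On Linux the default pipe capacity is 64 KiB, so an unread container producing logs hits this limit almost immediately.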
The long-term approach we want to take to this is logging plugins. A design doc from a hack branch I did of this can be found here. A couple of other thoughts along those lines:
- A journald logger, where we'd write logs directly to the journal and let the journal's own rate limiting take over.

In any event, I'll label this as another logging-related idea and we'll look into this when we return to that logging plugin concept. Thanks!
As a workaround, I'm now using the fluentd logging driver, sent to a local vector instance, which writes the logs back where Nomad expects them (as described here). On a small scale test cluster, this setup reduced global memory consumption by ~25 GB (about 20% of the total), while still allowing access to logs with the Nomad API.
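For reference, "writing the logs back where Nomad expects" can be done with a Vector `file` sink templated on event fields. This is only a sketch: the source name, the data_dir path, and the `alloc_id`/`task` event fields are assumptions about my setup, not something Nomad or Vector provides out of the box:

```yaml
sinks:
  nomad_logs:
    type: file
    inputs:
      - fluentd  # assumed name of the fluentd/forward source
    # Nomad serves task logs from <data_dir>/alloc/<alloc_id>/alloc/logs/;
    # /opt/nomad/data and the alloc_id/task fields are assumptions here.
    path: "/opt/nomad/data/alloc/{{ alloc_id }}/alloc/logs/{{ task }}.stdout.0"
    encoding:
      codec: text
```

With files in that location, the `nomad alloc logs` CLI and the web UI can read them even though Nomad's own collectors are disabled.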
Ha, thank you @dani. I am also capturing docker logs via the splunk exporter into vector (have to see if fluentd would be better), but never thought of writing them back to the nomad locations.
Fwiw, instead of using env variables you can also use the docker labels, so my docker plugin config looks like this:
```hcl
extra_labels = ["*"]

logging {
  type = "splunk"
  config {
    splunk-token             = "localhost-splunk-token"
    splunk-url               = "http://127.0.0.1:8089"
    splunk-verify-connection = "false"
    labels-regex             = "com\\.hashicorp\\..*"
  }
}
```
and the vector configuration looks like this:
```yaml
sinks:
  loki:
    type: loki
    inputs:
      - splunk
    endpoint: http://localhost:3100
    encoding:
      codec: text
    healthcheck:
      enabled: false
    labels:
      nomad_namespace: '{{ attrs."com.hashicorp.nomad.namespace" }}'
      nomad_job: '{{ attrs."com.hashicorp.nomad.job_name" }}'
      nomad_group: '{{ attrs."com.hashicorp.nomad.task_group_name" }}'
      nomad_task: '{{ attrs."com.hashicorp.nomad.task_name" }}'
      nomad_node: '{{ attrs."com.hashicorp.nomad.node_name" }}'
      nomad_alloc: '{{ attrs."com.hashicorp.nomad.alloc_id" }}'
      host: "${HOSTNAME}"
      log: "nomad"
```
At least that is the part that passes the data into Loki, but you can see how to access the labels again via `attrs`.
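The sink above lists an input named `splunk` that isn't shown. A matching source definition might look like this; it's a sketch assuming Vector's `splunk_hec` source listening on the endpoint from the Docker logging config (the exact token/auth option varies by Vector version, so it's omitted here):

```yaml
sources:
  splunk:
    type: splunk_hec
    # Matches splunk-url = "http://127.0.0.1:8089" in the Docker logging config.
    address: 127.0.0.1:8089
```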
Indeed, I could've done this. But in my case, I also have some tasks which send their logs directly to the same fluentd source and only have access to the env vars, not the labels. So using env vars everywhere allows the same vector pipeline to be used.
Proposal
I just took a look at the memory usage on my Nomad agents, and realized that the overhead of Docker log collection is huge. On my small scale cluster (a personal install with 4 Nomad agents and 65 allocs running), it was using about 33% of the total used memory, mainly in the nomad logmon and nomad docker_logger processes (I compared the memory reported by systemd with and without disable_log_collection = true).
Disabling log collection (and using, for example, fluentd for the docker task driver) is a solution to this excessive consumption, but we lose access to the container logs from the web interface and the nomad alloc logs CLI, which is convenient for quick debugging (faster than querying a central log aggregator).
Maybe one way to mitigate this would be to have a 3rd mode for log collection which would be on-demand: as soon as the log streaming API is called, the corresponding logmon and docker_logger processes could be spawned (and killed after some timeout).
Use-cases
On-demand log collection would eliminate most of the memory overhead of log collection for the Docker driver, while still allowing logs to be displayed in the web interface or the nomad CLI for occasional debugging.
Attempted Solutions
Using an out-of-band log collector/aggregator and setting disable_log_collection = true globally in the Nomad agent configuration is a workaround, but losing access to the logs from the web interface is a serious drawback.