grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Loki - Journal Config Hostname Label Clutter with Initial Network Setup #7454


WesselAtWork commented 2 years ago

Describe the bug

I was trying to add a journal config to the existing loki-stack helm chart in daemonset mode.

I achieved partial success using the __journal__hostname meta label which provided output that looked exactly like the Kubernetes __meta_kubernetes_pod_node_name from the default values in the loki-stack helm chart. [ip-x-x-x-x.domain.internal]

The problem happens when a node is starting up or a network change occurs. Some units (Kernel and NET) output logs while the node is still configuring its networking, so there are entries for ip-x-x-x-x and for localhost.

The localhost lines are useless because there is no other identifier that ties a log line labelled localhost back to the node it came from. Your best bet is to use the timestamp as the context for localhost lines. The ip-x-x-x-x lines are annoying because they double the cardinality for any given host: there will be lines for ip-y-y-y-y.domain.internal and then lines for plain ip-y-y-y-y from before that node obtained the fully qualified domain part.

There is sadly no real way to overcome this that I could find, as there will always be log lines written on the node before the network stack is fully up and running.
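Since those early lines can't be re-attributed after the fact, the closest thing to a mitigation I found is to normalize or drop them at scrape time with relabel rules. A rough sketch (the `localhost` drop throws those lines away entirely, and the regex assumes the `ip-x-x-x-x.domain.internal` naming scheme described above):

```yaml
relabel_configs:
  # Drop lines recorded before the hostname was set at all.
  # Note this discards them; they cannot be re-attributed to a node.
  - source_labels: ['__journal__hostname']
    regex: 'localhost'
    action: drop

  # Collapse "ip-x-x-x-x" and "ip-x-x-x-x.domain.internal" into one
  # label value by keeping only the part before the first dot.
  - source_labels: ['__journal__hostname']
    regex: '([^.]+)(?:\..*)?'
    target_label: host_name
    replacement: '$1'
```

This halves the cardinality problem but does nothing for the localhost lines beyond dropping them.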

Something to keep in mind in case anyone wants to attempt something similar.

Work Around

kubernetes_sd_configs

Firstly I attempted to use kubernetes_sd_configs with the node role. Its metadata has the correct value for the hostname, so I wanted to use it. But it turns out you still need to provide a __path__ label, and you can't point __path__ at the journal directory because kubernetes_sd_configs expects a normal log file.

I also misunderstood how kubernetes_sd_configs actually works. I thought it was bottom up, but it is in fact top down: Promtail queries the API server, gets a list of targets, and then I need to "filter" that list using the __path__ label as the source. I can't mix the labels from the journal with kubernetes_sd.

  kubernetes_sd_configs:
    - role: node
  journal:
    json: false
    max_age: 12h
    path: /var/log/journal
    labels:
      job: journal-test
  relabel_configs:
    - action: replace
      source_labels:
        - __meta_kubernetes_node_name
      target_label: node_name

    - source_labels:
        - __journal__systemd_unit
      target_label: unit

    - source_labels:
        - __journal__hostname
      target_label: host_name

    - source_labels:
        - __journal_syslog_identifier
      target_label: syslog_identifier

This would have been great because I wanted to use the node labels for the log output.

Hostname

I ended up using ${HOSTNAME} in the static labels of the journal config, in conjunction with -config.expand-env in the extra arguments (Thank you X)
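The working setup ended up looking roughly like this. The exact helm values key for passing the extra argument varies between chart versions, so treat the structure as a sketch:

```yaml
# Promtail must be started with env expansion enabled, e.g. via the
# chart's extra-arguments mechanism:
#   -config.expand-env=true
scrape_configs:
  - job_name: journal
    journal:
      max_age: 12h
      path: /var/log/journal
      labels:
        job: journal
        # ${HOSTNAME} is expanded from the container environment at
        # startup, so it stays stable even for lines the journal
        # recorded before the node's networking was up.
        node_name: ${HOSTNAME}
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: unit
```

Because the label comes from the daemonset pod's environment rather than from the journal entry itself, it sidesteps the localhost / ip-x-x-x-x split entirely.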


Question

Did I miss something, or is this just the way it works right now?

cstyan commented 1 year ago

@WesselAtWork sorry for the radio silence, is this still an issue for you? unfortunately I don't know much about journal here.

WesselAtWork commented 11 months ago

Not exactly an issue. It is working as intended, but that causes some strange/unexpected behaviour.

More of an FYI and a sanity check.

Don't know if there is a better place to put this; feel free to close it if it's clogging the issue queue.

cstyan commented 11 months ago

Thanks for the reply :+1: I'll leave this open for now but unfortunately at the moment I have no better suggestions than what you're already doing.