grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Promtail stops processing all but one journal file in systemd-journal-remote directory #2479

Closed rjgibson closed 3 years ago

rjgibson commented 4 years ago

Describe the bug

I'm using systemd-journal-upload -> systemd-journal-remote to send journal files from a cluster of servers to /var/log/journal/remote on a single log server, one journal file per server. Loki initially shows log entries for all servers, but after a few minutes I only see log entries for a single server.

To Reproduce

Steps to reproduce the behavior:

  1. Started Loki (1.5.0)
  2. Started Promtail (5714a9578735) to tail /var/log/journal/remote
  3. Queried {job="systemd-journal-remote"} (all log entries) and observed _HOSTNAME field. Loki displays log entries for numerous servers as expected.
  4. Waited a few minutes
  5. Queried {job="systemd-journal-remote"} (all log entries) and observed _HOSTNAME field. Loki displays log entries for only one server. Confirmed that journal files are still being updated.
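
Steps 3 and 5 can also be compared programmatically instead of by eye. The following is a minimal sketch, assuming Loki's `query_range` HTTP endpoint is reachable and that the `host` label from the pipeline in the config below is present on the returned streams. The function names are hypothetical and not part of Loki or Promtail.

```python
import json
import urllib.parse
import urllib.request


def build_query_url(base_url, query, start_ns, end_ns, limit=5000):
    """Build a query_range URL for Loki's HTTP API (timestamps in nanoseconds)."""
    params = urllib.parse.urlencode(
        {"query": query, "start": start_ns, "end": end_ns, "limit": limit}
    )
    return f"{base_url.rstrip('/')}/loki/api/v1/query_range?{params}"


def hosts_in_response(payload, label="host"):
    """Return the sorted distinct values of `label` across all returned streams."""
    streams = payload.get("data", {}).get("result", [])
    return sorted(
        {s["stream"][label] for s in streams if label in s.get("stream", {})}
    )


def hosts_seen(base_url, query, start_ns, end_ns):
    """Query Loki and list the hosts that logged within the given window."""
    url = build_query_url(base_url, query, start_ns, end_ns)
    with urllib.request.urlopen(url) as resp:
        return hosts_in_response(json.load(resp))
```

Calling `hosts_seen` once for the first few minutes after Promtail starts and once for a later window should show whether the set of hosts shrinks to a single entry, as described above.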

Expected behavior

Promtail continues to tail all journal files in the remote journal directory, gracefully handling log rotation for each.

Environment:

Promtail config:

server:
  http_listen_port: 9080
  grpc_listen_port: 0

positions:
  filename: /tmp/positions.yaml

clients:
  - url: http://loki:3100/loki/api/v1/push

scrape_configs:
  - job_name: systemd-journal-remote
    journal:
      # Enable json format to get additional journal fields (e.g. _HOSTNAME)
      json: true
      max_age: 12h
      labels:
        job: systemd-journal-remote
      path: /var/log/journal/remote
    pipeline_stages:
      - match:
          selector: '{job="systemd-journal-remote"}'
          stages:
            - json:
                expressions:
                  host: _HOSTNAME
            - labels:
                # Add host as a label
                host:
    relabel_configs:
      - source_labels: ['__journal__systemd_unit']
        target_label: 'unit'

slim-bean commented 4 years ago

Apologies that this has been sitting. Honestly, it's going to be incredibly difficult for us to troubleshoot this, as we don't have any setup like this.

But I do have a few questions, as I'm confused about some aspects:

"one journal file per server."

Yet your scrape config appears to specify only one journal entry and one file path: /var/log/journal/remote

"Confirmed that journal files are still being updated."

I'm curious how you do this.

rjgibson commented 4 years ago

The server in question is acting as a centralized log server. In addition to its own systemd journal files, written to /var/log/journal//, it also receives and stores journal files from remote servers. These remote servers run systemd-journal-upload to send their journal files to the log server. The log server runs systemd-journal-remote to receive them and store them under /var/log/journal/remote/. The journal for each remote server is named remote-.journal, e.g. remote-172.29.236.21.journal. When rotated it's renamed to include a uuid, e.g. remote-172.29.236.21@035ab1ea439b4f41ab6cf7095b924eaa-00000000002ea942-0005abab7a2dd565.journal. There are currently 78 active (i.e. non-rotated) journal files in this directory.
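
The naming scheme described above can be used to separate active journals from rotated ones. This is a minimal sketch: the regular expressions are inferred only from the two example file names in this comment, so the exact format written by systemd-journal-remote is an assumption, and `classify` and `active_sources` are hypothetical helper names.

```python
import re

# Patterns inferred from the example names in this comment (an assumption,
# not taken from the systemd documentation).
_ROTATED = re.compile(
    r"^remote-(?P<source>[^@]+)@[0-9a-f]+-[0-9a-f]+-[0-9a-f]+\.journal$"
)
_ACTIVE = re.compile(r"^remote-(?P<source>[^@]+)\.journal$")


def classify(name):
    """Return (source, 'active' | 'rotated'), or None for unrelated files."""
    m = _ROTATED.match(name)
    if m:
        return m.group("source"), "rotated"
    m = _ACTIVE.match(name)
    if m:
        return m.group("source"), "active"
    return None


def active_sources(names):
    """Return the sorted sources that currently have an active journal file."""
    found = (classify(n) for n in names)
    return sorted(r[0] for r in found if r and r[1] == "active")
```

Passing `os.listdir("/var/log/journal/remote")` to `active_sources` would give the list of servers Promtail is expected to keep tailing, which can then be compared against the hosts visible in Loki.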

I can see that the remote journal files under /var/log/journal/remote/ are being updated and rotated, and when I first start promtail I can see entries for all the remote servers in loki, but eventually I only see entries for a single remote server. I was hoping that promtail would monitor all files under /var/log/journal/remote/ for updates. A "normal" server contains multiple journal files under /var/log/journal/ so I'm not sure why /var/log/journal/remote/ would be handled any differently. Perhaps this use-case is different than the normal use-case, though. If so, this bug becomes a feature request.
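
One rough way to confirm that the active journal files are still being written is to look at their modification times. The sketch below assumes that mtime is a usable signal for journal files (it may lag behind the actual writes) and that rotated files can be recognized by the `@` in their names, as in the examples above. `stale_journals` is a hypothetical helper.

```python
import os
import time


def stale_journals(directory, max_idle_seconds, now=None):
    """Return names of active journal files idle for more than max_idle_seconds."""
    now = time.time() if now is None else now
    stale = []
    with os.scandir(directory) as entries:
        for entry in entries:
            # Skip non-journal files and rotated journals (names containing '@').
            if not entry.name.endswith(".journal") or "@" in entry.name:
                continue
            if now - entry.stat().st_mtime > max_idle_seconds:
                stale.append(entry.name)
    return sorted(stale)
```

An empty result from `stale_journals("/var/log/journal/remote", 300)` while Loki shows only one host would suggest the files are being updated but not read.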

stale[bot] commented 4 years ago

This issue has been automatically marked as stale because it has not had any activity in the past 30 days. It will be closed in 7 days if no further activity occurs. Thank you for your contributions.

rjgibson commented 4 years ago

Bumping this issue to keep it alive. If it's not a bug, let's call it a feature request: It would be nice if promtail could track more than one log file in a directory. This would allow it to support the use case of centralized logs as created by systemd-journal-remote.

rjgibson commented 4 years ago

Bump

slim-bean commented 3 years ago

I wanted to look at this, as I'm curious why it is happening. I'm surprised we haven't had other reports if this was widespread, but regardless, I'm still curious what's happening here.

slim-bean commented 3 years ago

Ha, sorry, this isn't the issue I thought it was. I'm afraid this sounds like more of a feature request. I'll look at renaming the title to reflect the request and we can see if it gathers more interest. It may be hard to prioritize if there isn't a lot of interest, but I'm also not sure how hard it would be to implement.

rjgibson commented 3 years ago

Thanks for taking another look. If there's not enough interest, or this looks like it will be more work than it's worth, I'll investigate running promtail on each node to send to loki directly rather than using systemd-journal-upload -> systemd-journal-remote.

ju6ge commented 8 months ago

This would be incredibly useful for me as well.

I am currently researching how to integrate systemd remote logs into Grafana. Having remote logging set up via systemd means I do not need to add another executable to all my servers to push or pull the logs; they are already on the machine that hosts Grafana. It would be wasteful to replicate them a second time, since that would take up a lot of disk space.

Hedius commented 7 months ago

same... would be nice to be able to process remote logs

Snektron commented 2 months ago

bump