nbisson opened this issue 1 year ago
We are experiencing a similar issue: promtail logs the file watcher event for large log files, the pointer in `positions.yaml` stays at 0, and Grafana shows nothing for them.
**Environment:**

`/etc/docker/daemon.json`: not configured; the default `json-file` driver with unlimited logging is used.
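So no log rotation cap is in place at the Docker level. For comparison, a minimal sketch of a capped `json-file` configuration (`log-driver` and `log-opts` are standard Docker daemon options; the sizes are illustrative, not from this thread):

```json
{
  "log-driver": "json-file",
  "log-opts": {
    "max-size": "100m",
    "max-file": "3"
  }
}
```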
Loki's `values.yaml`:
```yaml
loki:
  auth_enabled: false
  analytics:
    reporting_enabled: false
  commonConfig:
    replication_factor: 1
  storage:
    type: 'filesystem'
  server:
    log_level: warn
    grpc_server_max_concurrent_streams: 0
    grpc_server_max_recv_msg_size: 8388608
  rulerConfig:
    storage:
      type: 'local'
      local:
        directory: /rules
    rule_path: /tmp/
    alertmanager_url: http://prometheus-alertmanager.telemetry.svc.cluster.local:9093
    ring:
      kvstore:
        store: inmemory
    enable_api: true
  structuredConfig:
    compactor: # https://grafana.com/docs/loki/latest/upgrading/#240
      retention_enabled: true
    limits_config:
      retention_period: 14d
      ingestion_rate_mb: 10
      ingestion_burst_size_mb: 25
      max_streams_per_user: 0
      max_global_streams_per_user: 0
      per_stream_rate_limit: 10MB
      per_stream_rate_limit_burst: 50MB
    chunk_store_config:
      max_look_back_period: 336h
monitoring:
  dashboards:
    enabled: false
    namespace: grafana
  rules:
    enabled: false
    alerting: false
  alerts:
    enabled: false
  serviceMonitor:
    enabled: false
  selfMonitoring:
    enabled: true
    grafanaAgent:
      installOperator: true
  lokiCanary:
    enabled: false
test:
  enabled: false
```
Promtail's `values.yaml`:
```yaml
serviceMonitor:
  enabled: false
extraVolumes:
  - name: journal
    hostPath:
      path: /var/log/journal
extraVolumeMounts:
  - name: journal
    mountPath: /var/log/journal
    readOnly: true
tolerations:
  - key: node-role.kubernetes.io/control-plane
    operator: Exists
    effect: NoSchedule
  - key: node-role.kubernetes.io/etcd
    operator: Exists
    effect: NoExecute
config:
  clients:
    - url: http://loki.loki.svc.cluster.local:3100/loki/api/v1/push
      tenant_id: 1
      external_labels:
        cluster: cl1
  snippets:
    extraScrapeConfigs: |
      - job_name: journal
        journal:
          max_age: 12h
          labels:
            job: systemd-journal
          path: /var/log/journal
        relabel_configs:
          - source_labels:
              - __journal__systemd_unit
            target_label: systemd_unit
          - source_labels:
              - __journal__hostname
            target_label: node
          - source_labels:
              - __journal_syslog_identifier
            target_label: syslog_identifier
    pipelineStages:
      - docker: { }
# ...
```
Additional clues: the `/loki/api/v1/series` endpoint no longer shows the series belonging to the large log file.
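A quick way to reproduce that check, assuming in-cluster access to the Loki service named in the `values.yaml` above (the `{job="systemd-journal"}` selector is only an example matcher; `match[]` is Loki's standard series-API parameter):

```console
$ curl -G -s "http://loki.loki.svc.cluster.local:3100/loki/api/v1/series" \
    --data-urlencode 'match[]={job="systemd-journal"}'
```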
Suspects:
/cc @thomashoell
I also encountered the same issue.

- Kubernetes 1.15, installed using kubeadm on Ubuntu 22.04
- Loki and Promtail installed using the grafana/loki-stack Helm chart
- Loki 2.8.2
- Promtail 2.8.2

Loki and Promtail `values.yaml`:
```yaml
loki:
  config:
    auth_enabled: false
    compactor:
      shared_store: filesystem
      working_directory: /data/loki/boltdb-shipper-compactor
    ingester:
      lifecycler:
        ring:
          kvstore:
            store: inmemory
          replication_factor: 1
      wal:
        dir: /data/loki/wal
    limits_config:
      enforce_metric_name: false
      ingestion_burst_size_mb: 100
      ingestion_rate_mb: 50
      per_stream_rate_limit: 5MB
      per_stream_rate_limit_burst: 20MB
      reject_old_samples: true
      reject_old_samples_max_age: 168h
    schema_config:
      configs:
        - from: "2020-10-24"
          index:
            period: 24h
            prefix: index_
          object_store: filesystem
          schema: v11
          store: boltdb-shipper
    server:
      grpc_listen_port: 9095
      http_listen_port: 3100
    storage_config:
      boltdb_shipper:
        active_index_directory: /data/loki/boltdb-shipper-active
        cache_location: /data/loki/boltdb-shipper-cache
        cache_ttl: 24h
        shared_store: filesystem
      filesystem:
        directory: /data/loki/chunks
promtail:
  config:
    file: |
      server:
        http_listen_port: 3101
        log_level: info
      clients:
        - url: http://loki:3100/loki/api/v1/push
      positions:
        filename: /run/promtail/positions.yaml
      scrape_configs:
        - job_name: kubernetes-pod-label-app
          pipeline_stages:
            - docker: {}
          kubernetes_sd_configs:
            - role: pod
          relabel_configs:
            - source_labels:
                - __meta_kubernetes_pod_label_app
              regex: ''
              action: drop
            - source_labels:
                - __meta_kubernetes_pod_node_name
              target_label: __host__
            - source_labels:
                - __meta_kubernetes_namespace
              target_label: namespace
            - source_labels:
                - __meta_kubernetes_pod_name
              target_label: pod
            - source_labels:
                - __meta_kubernetes_pod_container_name
              target_label: container
            - source_labels:
                - __meta_kubernetes_namespace
                - __meta_kubernetes_pod_label_app
              separator: /
              target_label: job
            - regex: __meta_kubernetes_pod_label_(.+)
              action: labelmap
            - source_labels:
                - __meta_kubernetes_pod_uid
                - __meta_kubernetes_pod_container_name
              separator: /
              replacement: /var/log/pods/*$1/*.log
              target_label: __path__
```
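As a reading aid for the last relabel rule: with the default relabel regex `(.*)`, `$1` is the joined `<pod_uid>/<container_name>` value, so `__path__` resolves to a glob like the following (the pod UID is the renderelf-nginx one from later in this comment, shown purely as an illustration):

```yaml
# With the default regex (.*), $1 is "<pod_uid>/<container_name>", so:
__path__: /var/log/pods/*627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/*.log
# This glob matches the live 0.log, but NOT rotated copies such as
# 0.log.20230703-141541, because those no longer end in ".log".
```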
Docker `daemon.json`:

```json
{
  "registry-mirrors": ["https://docker.mirrors.ustc.edu.cn"]
}
```
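Note that this `daemon.json` only configures a registry mirror, so the logging driver is whatever Docker defaults to. One way to confirm the effective driver on the node (a standard `docker info` invocation):

```console
$ docker info --format '{{.LoggingDriver}}'
json-file
```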
`positions.yaml`:

```console
$ cat /run/promtail/positions.yaml | grep renderelf | grep nginx
/var/log/pods/renderelf_renderelf-nginx-65cc66c854-dz7xn_627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/0.log: "0"
```
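To gauge how far behind promtail actually is, the recorded position can be compared with the file's real size on the node (the path is taken from the output above; `stat -c '%s %n'` prints size in bytes and filename):

```console
$ stat -c '%s %n' /var/log/pods/renderelf_renderelf-nginx-65cc66c854-dz7xn_627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/0.log
$ grep renderelf-nginx /run/promtail/positions.yaml
```

A position of `"0"` on a large, growing file means the tailer is stuck at the beginning, or was reset by a rotation and never advanced.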
Promtail pod logs:

```console
$ kubectl -n monitoring logs -f loki-promtail-689xv | grep -B 20 -A 20 "Re-open"
level=info ts=2023-07-03T06:15:41.007774517Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/var/log/pods/renderelf_renderelf-frontend-bf89b465c-qwg79_6379d476-a439-499f-99d8-4a1c442349e9/renderelf-frontend/0.log op=CREATE
level=info ts=2023-07-03T06:15:41.028558282Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/var/log/pods/renderelf_renderelf-nginx-65cc66c854-dz7xn_627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/0.log.20230703-141541 op=CREATE
ts=2023-07-03T06:15:41.028686051Z caller=log.go:168 level=info msg="Re-opening moved/deleted file /var/log/pods/renderelf_renderelf-nginx-65cc66c854-dz7xn_627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/0.log ..."
level=info ts=2023-07-03T06:15:41.028695308Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/var/log/pods/renderelf_renderelf-nginx-65cc66c854-dz7xn_627486a5-f548-4349-8f3b-887e3bb6d1b2/renderelf-nginx/0.log op=CREATE
level=info ts=2023-07-03T06:15:51.038185803Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/var/log/pods/renderelf_renderelf-server-worker1-854bb99fdd-x9g9w_d4b461ca-cf6a-4bfa-b442-697a2ada2db9/renderelf-server-worker1/0.log.20230703-141551 op=CREATE
level=info ts=2023-07-03T06:15:51.038362113Z caller=filetargetmanager.go:179 msg="received file watcher event" name=/var/log/pods/renderelf_renderelf-server-worker1-854bb99fdd-x9g9w_d4b461ca-cf6a-4bfa-b442-697a2ada2db9/renderelf-server-worker1/0.log op=CREATE
```
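The `Re-opening moved/deleted file ... 0.log` line indicates rename-based rotation: the old file was moved aside (to `0.log.20230703-141541`) and promtail re-opened the fresh `0.log` at offset 0. Whether the live file is actually being tailed again can be checked on promtail's built-in `/targets` status page; in the sketch below the pod name and port 3101 are taken from this thread:

```console
$ kubectl -n monitoring port-forward loki-promtail-689xv 3101:3101 &
$ curl -s http://localhost:3101/targets | grep renderelf-nginx
```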
I solved my problem; see https://github.com/grafana/loki/issues/9864#issuecomment-1631894467.
Hello,
**Describe the bug** Promtail receives file watcher events, but the positions stop increasing after a few hours of running. In Grafana we see periods without any logs.
Promtail logs:

Promtail position file:
**To Reproduce**

1. Install the Loki Helm chart 5.0.0 on an RKE cluster (Docker CRI): values.yaml
2. Install the Promtail Helm chart 6.10.0: values.yaml

```yaml
resources:
  requests:
    cpu: 100m
    memory: 128Mi
defaultVolumes:
  # ...
defaultVolumeMounts:
  # ...
extraVolumes:
  # ...
extraVolumeMounts:
  # ...
```
**Expected behavior** Logs should be sent in real time and the position file should advance accordingly.
**Environment:**

**Screenshots, Promtail config, or terminal output**
Docker is installed on a separate partition (`/data/docker`). I cannot tell whether this is a problem of limits, log rotation, throughput, or something else. The stack works for some hours after the pods restart, but not afterwards.
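To narrow down "limits, log rotation, throughput or other": both sides export counters that name the reason entries are dropped. A sketch, assuming default metrics endpoints (`promtail_sent_entries_total`, `promtail_dropped_entries_total`, and `loki_discarded_samples_total` are standard metrics; the pod and service names in angle brackets are placeholders for your deployment):

```console
# Promtail side: entries sent vs. dropped before reaching Loki.
$ kubectl -n <promtail-namespace> port-forward <promtail-pod> 3101:3101 &
$ curl -s http://localhost:3101/metrics | grep -E 'promtail_(sent|dropped)_entries_total'

# Loki side: samples rejected at ingestion, labelled by reason
# (e.g. rate_limited, per_stream_rate_limit, greater_than_max_sample_age).
$ kubectl -n <loki-namespace> port-forward svc/loki 3100:3100 &
$ curl -s http://localhost:3100/metrics | grep loki_discarded_samples_total
```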
Thanks for your help.