grafana / loki

Like Prometheus, but for logs.
https://grafana.com/loki
GNU Affero General Public License v3.0

Loki crashing on docker #12948

Open imclint21 opened 5 months ago

imclint21 commented 5 months ago

Describe the bug: The instance crashes after a few hours.

Configuration:

auth_enabled: false

server:
  http_listen_port: 3100

common:
  instance_addr: 127.0.0.1
  path_prefix: /loki
  storage:
    filesystem:
      chunks_directory: /loki/chunks
      rules_directory: /loki/rules
  replication_factor: 1
  ring:
    kvstore:
      store: inmemory

storage_config:
  filesystem:
    directory: /loki/

schema_config:
  configs:
    - from: 2020-10-24
      store: tsdb
      object_store: filesystem
      schema: v13
      index:
        prefix: index_
        period: 24h

limits_config:
  max_query_lookback: 2100h
  retention_period: 2100h

compactor:
  working_directory: /loki/retention
  delete_request_store: filesystem
  retention_enabled: true

ruler:
  alertmanager_url: http://localhost:9093

analytics:
  reporting_enabled: false

Logs I get:

[image: screenshot of the logs]

chaudum commented 5 months ago

Hi @imclint21, thanks for reporting. Can you also post the full docker run command and the startup logs?

imclint21 commented 5 months ago

Here you go @chaudum, I exported the full logs: loki.csv

I restarted the instance; for now it seems to be up. I'll wait a bit.

imclint21 commented 5 months ago

Okay, no, it crashed again, and this time I don't have logs.

The only thing I can say is that it crashes when I restart promtail on all my servers:

May 15 12:26:34 fra3 promtail[30227]: level=warn ts=2024-05-15T10:26:34.891870836Z caller=client.go:419 component=client host=loki.xxx msg="error sending batch, will retry" status=429 tenant= error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded for user fake (limit: 4194304 bytes/sec) while attempting to ingest '4383' lines totaling '1048442' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
May 15 12:26:37 fra3 promtail[30227]: level=warn ts=2024-05-15T10:26:37.652537111Z caller=client.go:419 component=client host=loki.xxx msg="error sending batch, will retry" status=429 tenant= error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded for user fake (limit: 4194304 bytes/sec) while attempting to ingest '4380' lines totaling '1048435' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
May 15 12:26:39 fra3 promtail[30227]: level=warn ts=2024-05-15T10:26:39.696488947Z caller=client.go:419 component=client host=loki.xxx msg="error sending batch, will retry" status=429 tenant= error="server returned HTTP status 429 Too Many Requests (429): Ingestion rate limit exceeded for user fake (limit: 4194304 bytes/sec) while attempting to ingest '4386' lines totaling '1048532' bytes, reduce log volume or contact your Loki administrator to see if the limit can be increased"
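
A note on the 429 responses above: the limit in these messages, 4194304 bytes/sec, is 4 MiB/s, which matches Loki's default `ingestion_rate_mb` of 4. Each rejected batch is about 1 MiB, so several promtail instances flushing at the same moment after a restart can exceed it. A minimal sketch of raising the per-tenant limits in the existing `limits_config` block follows. The values 8 and 16 are illustrative assumptions, not tuned recommendations, and this only addresses the rate-limit warnings, which may be unrelated to the crash itself.

```yaml
limits_config:
  max_query_lookback: 2100h
  retention_period: 2100h
  # Assumed example values; Loki's defaults are 4 and 6.
  ingestion_rate_mb: 8          # sustained per-tenant ingestion rate, in MB/s
  ingestion_burst_size_mb: 16   # largest burst accepted at once, in MB
```
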
chaudum commented 5 months ago

After some testing, I have not been able to reproduce this issue yet.

imclint21 commented 5 months ago

I can offer you access to our NAS if you want!

I guess it's related to the compactor, but I'm not sure. We currently have about 1 GB in /loki.

imclint21 commented 4 months ago

Any news, guys? I don't think I'm using a particularly special configuration, so that's strange.

chaudum commented 2 months ago

> The only thing I can say is that it crashes when I restart promtail on all my servers:

~~Could it be that you are overloading Loki when you restart your Promtails? The 429 HTTP error indicates that you are hitting the ingestion rate limit. Have you checked the memory usage of your Docker container? Maybe it is OOMing?~~

Edit: Never mind, I looked at the screenshot of the panic again. This is definitely not from an OOM event. Is it always panicking?

imclint21 commented 2 months ago

Actually, I'm not a Loki specialist at all; I'm just trying to make it work.

chaudum commented 1 month ago

> I can offer you access to our NAS if you want!

You are running Loki as a Docker container on a NAS? I don't think you can run Loki in a stable way in an environment that is usually very restricted in resources (CPU, memory).

imclint21 commented 1 month ago

It's not restricted at all.