Open timwsuqld opened 7 months ago
I have the same issue. On a VM with 16G of RAM, I have a 188GB /var/log/lastlog. During the test phase we also included /var/log/*log. Promtail was killed by the OOM killer.
Should there be some sort of failsafe to prevent Promtail from taking all the memory?
```
build user: root@21ab03f17324
build date: 2023-09-14T16:24:53Z
go version: go1.20.6
platform: linux/amd64
tags: promtail_journal_enabled
```
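For what it's worth, one possible failsafe would be a sparse-file pre-check before a file is tailed. The sketch below is only an illustration of the idea, not Promtail's actual code: on Linux it compares the blocks actually allocated on disk against the apparent file size, and the `isSparse` helper and its "mostly holes" threshold are hypothetical.

```go
package main

import (
	"fmt"
	"os"
	"syscall"
)

// isSparse reports whether far less data is allocated on disk than the
// apparent file size suggests, as with /var/log/lastlog. Linux-only sketch.
func isSparse(path string) (bool, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return false, err
	}
	st, ok := fi.Sys().(*syscall.Stat_t)
	if !ok {
		return false, fmt.Errorf("no raw stat data for %s", path)
	}
	allocated := st.Blocks * 512 // st.Blocks is always in 512-byte units
	// Hypothetical heuristic: treat a file as sparse when less than half
	// of its apparent size is actually allocated.
	return allocated < fi.Size()/2, nil
}

func main() {
	sparse, err := isSparse("/var/log/lastlog")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Println("sparse:", sparse) // a tailer could skip or warn here
}
```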
We just ran into the same issue. Our config worked fine on ~30 VMs, except for 3 of them: they have a huge /var/log/lastlog of 330G.
```
$ ls -lsah /var/log/lastlog
44K -rw-rw-r-- 1 root utmp 330G Apr 16 14:52 /var/log/lastlog
```
Interestingly, only our Ubuntu 20.04 VMs are affected; on Ubuntu 22.04 the file has a normal size of ~288K.
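For anyone else hitting this, the discrepancy is easy to see by comparing `du` with and without `--apparent-size`; the output below just mirrors the numbers from the listing above:

```
$ du -h /var/log/lastlog
44K	/var/log/lastlog
$ du -h --apparent-size /var/log/lastlog
330G	/var/log/lastlog
```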
**Describe the bug**
Promtail consumes all RAM (it doesn't start swapping) and causes the VM to freeze; the OOM killer doesn't appear to kick in. This happens when `/var/log/lastlog` ends up in the pattern match: it is a massively sparse file with almost no real data in it, and Promtail shouldn't consume all memory to read large files.

**To Reproduce**
Steps to reproduce the behavior:
1. Have `/var/log/lastlog` end up as part of the included path list

**Expected behavior**
Promtail should limit its memory usage (even at startup) so it can't consume everything on the machine and crash it. Yes, this can be avoided by excluding the `lastlog` file from being processed, but we should have limits on memory usage. I'm guessing we try to mmap the whole file?

**Environment:**
**Screenshots, Promtail config, or terminal output**
If applicable, add any output to help explain your problem.
config.yaml
docker-compose.yml
`du` shows `lastlog` as small, while with `--apparent-size` we can see that `lastlog` is massive.
Yes, this can be mitigated with `__path_exclude__`, but we really shouldn't crash systems over config mistakes like this (and there are plenty of users for whom this exact config hasn't crashed their system yet, and then one day it will).
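For reference, an exclusion along those lines might look like the sketch below. The job name, labels, and path globs are illustrative, not taken from anyone's actual config.yaml in this thread:

```yaml
scrape_configs:
  - job_name: system
    static_configs:
      - targets: [localhost]
        labels:
          job: varlogs
          __path__: /var/log/*log
          # keep the sparse lastlog file out of the glob match
          __path_exclude__: /var/log/lastlog
```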