Cisco-Talos / clamav

ClamAV - Documentation is here: https://docs.clamav.net
https://www.clamav.net/
GNU General Public License v2.0

Clamav Kubernetes Pod stuck on automatic update of database #1229

Open · ed-devops-d2 opened this issue 5 months ago

ed-devops-d2 commented 5 months ago

Describe the bug

We are running the clamav Docker image in a Kubernetes pod. Every night, when it attempts to update the database automatically, it gets stuck, to the point where we have created a CronJob that deletes the pod every night (a sketch of that workaround follows). When the hang occurs, the pod's RAM usage also drops from 1.4 GB to 14 MB.
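For illustration, a minimal sketch of such a nightly pod-delete CronJob; the schedule, namespace, service account, and the app=clamav label selector are assumptions, not values from this report:

# Hypothetical sketch: nightly CronJob that deletes the stuck clamav pod.
apiVersion: batch/v1
kind: CronJob
metadata:
  name: clamav-pod-restart
spec:
  schedule: "0 3 * * *"                    # every night at 03:00
  jobTemplate:
    spec:
      template:
        spec:
          serviceAccountName: pod-restarter  # needs RBAC permission to delete pods
          restartPolicy: Never
          containers:
            - name: kubectl
              image: bitnami/kubectl
              command: ["kubectl", "delete", "pod", "-l", "app=clamav"]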

How to reproduce the problem

The issue seems to appear only when the image is used in Kubernetes. It is also important to mention that, because privileged users are not allowed in our environment, we have modified the image as shown in the Dockerfile below:

# The FROM line was omitted from the report; presumably the official
# image, e.g. clamav/clamav:1.2.1 (the version reported below)
USER root
# curl is optional; only needed if you want it inside the container
RUN apk add --no-cache curl
# Runtime and lock directories, owned by the unprivileged clamav user
RUN mkdir -p /run/clamav /run/lock && \
    chown clamav:clamav /run/clamav /run/lock && \
    chmod 755 /run/clamav /run/lock
RUN mkdir -p /var/lock && \
    chown -R clamav:clamav /var/lock && \
    chmod -R 755 /var/lock
# Writable temp dir, referenced by TemporaryDirectory in clamd.conf
RUN mkdir /clamav_tmp && \
    chown -R clamav:clamav /clamav_tmp && \
    chmod -R 755 /clamav_tmp
COPY clamd.conf /etc/clamav/clamd.conf
# Turn "#Checks 24" into "Checks 0" (clamconf below reports "Checks disabled")
RUN sed -i 's/#Checks 24/Checks 0/g' /etc/clamav/freshclam.conf
# Drop back to the unprivileged user
USER clamav
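For comparison, the writable paths can also be supplied from the pod spec without modifying the image at all; a minimal, hypothetical Pod manifest (the image tag, UID, and all names are assumptions, not values from this report):

# Hypothetical sketch: writable directories via emptyDir volumes instead
# of baking them into the image.
apiVersion: v1
kind: Pod
metadata:
  name: clamav
spec:
  securityContext:
    runAsNonRoot: true
    runAsUser: 100              # assumed UID of the clamav user in the image
  containers:
    - name: clamav
      image: clamav/clamav:1.2.1
      ports:
        - containerPort: 3310   # TCPSocket from clamd.conf below
      volumeMounts:
        - name: run-clamav
          mountPath: /run/clamav
        - name: clamav-tmp
          mountPath: /clamav_tmp
  volumes:
    - name: run-clamav
      emptyDir: {}
    - name: clamav-tmp
      emptyDir: {}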

Config file: clamd.conf

LogFile = "/var/log/clamav/clamd.log"
LogTime = "yes"
PidFile = "/tmp/clamd.pid"
TemporaryDirectory = "/clamav_tmp"
LocalSocket = "/tmp/clamd.sock"
TCPSocket = "3310"
User = "clamav"

Config file: freshclam.conf

PidFile = "/tmp/freshclam.pid"
UpdateLogFile = "/var/log/clamav/freshclam.log"
Checks disabled
DatabaseMirror = "database.clamav.net"

Config file: clamav-milter.conf

LogFile = "/var/log/clamav/milter.log"
LogTime = "yes"
PidFile = "/tmp/clamav-milter.pid"
User = "clamav"
ClamdSocket = "unix:/tmp/clamd.sock", "unix:/tmp/clamd.sock", "unix:/tmp/clamd.sock", "unix:/tmp/clamd.sock", "unix:/tmp/clamd.sock", "unix:/tmp/clamd.sock"
MilterSocket = "inet:7357"

Software settings

Version: 1.2.1
Optional features supported: MEMPOOL AUTOIT_EA06 BZIP2 LIBXML2 PCRE2 ICONV JSON RAR

Database information

Database directory: /var/lib/clamav
main.cvd: version 62, sigs: 6647427, built on Thu Sep 16 12:32:42 2021
daily.cld: version 27240, sigs: 2058527, built on Tue Apr 9 08:26:56 2024
bytecode.cld: version 335, sigs: 86, built on Tue Feb 27 15:37:24 2024
Total number of signatures: 8706040

Platform information

uname: Linux 4.18.0-425.13.1.el8_7.x86_64 #1 SMP Thu Feb 2 13:01:45 EST 2023 x86_64
OS: Linux, ARCH: x86_64, CPU: x86_64
zlib version: 1.3.1 (1.3.1), compile flags: a9
platform id: 0x0a21bfbf08000000000d0201

Build information

GNU C: 13.2.1 20231014 (13.2.1)
sizeof(void*) = 8
Engine flevel: 191, dconf: 191

micahsnyder commented 5 months ago

@ed-devops-d2 The freshclam process typically test-loads the newly downloaded database and then causes clamd to reload it. Both steps temporarily require roughly twice as much RAM as normal. It is likely that your pod does not have enough RAM for this, so the process is killed or gets stuck.
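As a side note not raised in this thread: clamd has a ConcurrentDatabaseReload option (available since 0.103) that controls this behavior. Setting it to no avoids holding two copies of the database in memory during a reload, at the cost of clamd not answering scan requests while the reload runs. A sketch of the relevant clamd.conf line:

# clamd.conf (sketch): reload in-place rather than loading a second copy
# of the database concurrently; roughly halves peak RAM during reloads,
# but scans are blocked for the duration of the reload.
ConcurrentDatabaseReload no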

ed-devops-d2 commented 5 months ago

The pod has a memory limit of 4 GB, which is the recommended amount, but this did not fix the issue.
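For context, this is the usual way such a limit appears in the container spec; the fragment below is illustrative, not the reporter's actual manifest:

# Hypothetical fragment of the clamav container spec with the 4 GB limit
resources:
  limits:
    memory: "4Gi"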

vienleidl commented 3 months ago

> When the hang occurs, the pod's RAM usage also drops from 1.4 GB to 14 MB.

From what I have observed, the same behavior happens on Azure Container Apps with the Workload Profiles plan (hybrid Kubernetes), but I have never hit this issue on AKS or on Azure Container Apps with the legacy Consumption-only plan.