netdata / netdata

Architected for speed. Automated for easy. Monitoring and troubleshooting, transformed!
https://www.netdata.cloud
GNU General Public License v3.0

[Bug]: Netdata plugin systemd-journal eat disk space #17737

Closed by CZAirwolf 5 months ago

CZAirwolf commented 5 months ago

Bug description

I have limited journal logs via a drop-in, /etc/systemd/journald.conf.d/override.conf:

[Journal]
SystemMaxUse = 250M
SystemFileSize = 100M

Netdata doesn't honor this and eats the system's limited disk space.

/var/log/journal# du -h --max-depth 1
446M    ./54773980f42149b0885d4d65d3f1a923.netdata
262M    ./54773980f42149b0885d4d65d3f1a923
708M    .

/var/log/journal/54773980f42149b0885d4d65d3f1a923.netdata# ls -al
total 456708
drwxr-sr-x+ 2 root systemd-journal     4096 May 14 13:05 .
drwxr-sr-x+ 4 root systemd-journal     4096 Jan 25 10:20 ..
-rw-r-----+ 1 root systemd-journal 38709432 Feb  2 19:36 system@d7281dd8efd74c3ca36d095ec8da894e-0000000000000001-00060fc1b158769a.journal
-rw-r-----+ 1 root systemd-journal 38649792 Feb 13 21:56 system@d7281dd8efd74c3ca36d095ec8da894e-0000000000009a2e-0006106a64a86245.journal
-rw-r-----+ 1 root systemd-journal 38929208 Feb 23 15:54 system@d7281dd8efd74c3ca36d095ec8da894e-000000000001363d-000611499fd0c775.journal
-rw-r-----+ 1 root systemd-journal 38632296 Mar  7 13:37 system@d7281dd8efd74c3ca36d095ec8da894e-000000000001d3f0-0006120dbbd4b034.journal
-rw-r-----+ 1 root systemd-journal 38710712 Mar 20 10:10 system@d7281dd8efd74c3ca36d095ec8da894e-0000000000026f4d-000613115564aa4f.journal
-rw-r-----+ 1 root systemd-journal 38704632 Mar 28 11:09 system@d7281dd8efd74c3ca36d095ec8da894e-0000000000030afe-00061413f63cd3f2.journal
-rw-r-----+ 1 root systemd-journal 38673608 Apr  8 13:54 system@d7281dd8efd74c3ca36d095ec8da894e-000000000003a4a4-000614b5b76e5127.journal
-rw-r-----+ 1 root systemd-journal 38682520 Apr 16 10:09 system@d7281dd8efd74c3ca36d095ec8da894e-000000000004408e-0006159477db0658.journal
-rw-r-----+ 1 root systemd-journal 38696336 Apr 25 11:53 system@d7281dd8efd74c3ca36d095ec8da894e-000000000004dac5-00061632417284d5.journal
-rw-r-----+ 1 root systemd-journal 38712192 May  3 07:19 system@d7281dd8efd74c3ca36d095ec8da894e-0000000000057695-000616e8c3be0a87.journal
-rw-r-----+ 1 root systemd-journal 38550984 May 14 13:05 system@d7281dd8efd74c3ca36d095ec8da894e-000000000006116f-00061785dd5e5c97.journal
-rw-r-----+ 1 root systemd-journal 41943040 May 22 12:06 system.journal

I updated from 1.44.3 to 1.45.5 but don't expect any change, because the plugin documentation says nothing about limits.

Expected behavior

Ability to disable the creation of extra journal logs, or to limit their maximum size/number with rotation.

Steps to reproduce

  1. install netdata with systemd-journal plugin
  2. wait
  3. check /var/log/journal ...

Installation method

manual setup of official DEB/RPM packages

System info

Linux gitlab-runner-02 6.1.0-18-amd64 #1 SMP PREEMPT_DYNAMIC Debian 6.1.76-1 (2024-02-01) x86_64 GNU/Linux
/etc/os-release:PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
/etc/os-release:NAME="Debian GNU/Linux"
/etc/os-release:VERSION_ID="12"
/etc/os-release:VERSION="12 (bookworm)"
/etc/os-release:VERSION_CODENAME=bookworm
/etc/os-release:ID=debian

Netdata build info

Packaging:
    Netdata Version ____________________________________________ : v1.45.5
    Installation Type __________________________________________ : binpkg-deb
    Package Architecture _______________________________________ : x86_64
    Package Distro _____________________________________________ :  
    Configure Options __________________________________________ : dummy-configure-command
Default Directories:
    User Configurations ________________________________________ : /etc/netdata
    Stock Configurations _______________________________________ : /usr/lib/netdata/conf.d
    Ephemeral Databases (metrics data, metadata) _______________ : /var/cache/netdata
    Permanent Databases ________________________________________ : /var/lib/netdata
    Plugins ____________________________________________________ : /usr/libexec/netdata/plugins.d
    Static Web Files ___________________________________________ : /var/lib/netdata/www
    Log Files __________________________________________________ : /var/log/netdata
    Lock Files _________________________________________________ : /var/lib/netdata/lock
    Home _______________________________________________________ : /var/lib/netdata
Operating System:
    Kernel _____________________________________________________ : Linux
    Kernel Version _____________________________________________ : 6.1.0-18-amd64
    Operating System ___________________________________________ : Debian GNU/Linux
    Operating System ID ________________________________________ : debian
    Operating System ID Like ___________________________________ : unknown
    Operating System Version ___________________________________ : 12 (bookworm)
    Operating System Version ID ________________________________ : none
    Detection __________________________________________________ : /etc/os-release
Hardware:
    CPU Cores __________________________________________________ : 4
    CPU Frequency ______________________________________________ : 3000000000
    RAM Bytes __________________________________________________ : 12541747200
    Disk Capacity ______________________________________________ : 171798691840
    CPU Architecture ___________________________________________ : x86_64
    Virtualization Technology __________________________________ : kvm
    Virtualization Detection ___________________________________ : systemd-detect-virt
Container:
    Container __________________________________________________ : none
    Container Detection ________________________________________ : systemd-detect-virt
    Container Orchestrator _____________________________________ : none
    Container Operating System _________________________________ : none
    Container Operating System ID ______________________________ : none
    Container Operating System ID Like _________________________ : none
    Container Operating System Version _________________________ : none
    Container Operating System Version ID ______________________ : none
    Container Operating System Detection _______________________ : none
Features:
    Built For __________________________________________________ : Linux
    Netdata Cloud ______________________________________________ : YES
    Health (trigger alerts and send notifications) _____________ : YES
    Streaming (stream metrics to parent Netdata servers) _______ : YES
    Back-filling (of higher database tiers) ____________________ : YES
    Replication (fill the gaps of parent Netdata servers) ______ : YES
    Streaming and Replication Compression ______________________ : YES (zstd lz4 gzip)
    Contexts (index all active and archived metrics) ___________ : YES
    Tiering (multiple dbs with different metrics resolution) ___ : YES (5)
    Machine Learning ___________________________________________ : YES
Database Engines:
    dbengine ___________________________________________________ : YES
    alloc ______________________________________________________ : YES
    ram ________________________________________________________ : YES
    none _______________________________________________________ : YES
Connectivity Capabilities:
    ACLK (Agent-Cloud Link: MQTT over WebSockets over TLS) _____ : YES
    static (Netdata internal web server) _______________________ : YES
    h2o (web server) ___________________________________________ : YES
    WebRTC (experimental) ______________________________________ : NO
    Native HTTPS (TLS Support) _________________________________ : YES
    TLS Host Verification ______________________________________ : YES
Libraries:
    LZ4 (extremely fast lossless compression algorithm) ________ : YES
    ZSTD (fast, lossless compression algorithm) ________________ : YES
    zlib (lossless data-compression library) ___________________ : YES
    Brotli (generic-purpose lossless compression algorithm) ____ : NO
    protobuf (platform-neutral data serialization protocol) ____ : YES (system)
    OpenSSL (cryptography) _____________________________________ : YES
    libdatachannel (stand-alone WebRTC data channels) __________ : NO
    JSON-C (lightweight JSON manipulation) _____________________ : YES
    libcap (Linux capabilities system operations) ______________ : NO
    libcrypto (cryptographic functions) ________________________ : YES
    libyaml (library for parsing and emitting YAML) ____________ : YES
Plugins:
    apps (monitor processes) ___________________________________ : YES
    cgroups (monitor containers and VMs) _______________________ : YES
    cgroup-network (associate interfaces to CGROUPS) ___________ : YES
    proc (monitor Linux systems) _______________________________ : YES
    tc (monitor Linux network QoS) _____________________________ : YES
    diskspace (monitor Linux mount points) _____________________ : YES
    freebsd (monitor FreeBSD systems) __________________________ : NO
    macos (monitor MacOS systems) ______________________________ : NO
    statsd (collect custom application metrics) ________________ : YES
    timex (check system clock synchronization) _________________ : YES
    idlejitter (check system latency and jitter) _______________ : YES
    bash (support shell data collection jobs - charts.d) _______ : YES
    debugfs (kernel debugging metrics) _________________________ : YES
    cups (monitor printers and print jobs) _____________________ : YES
    ebpf (monitor system calls) ________________________________ : YES
    freeipmi (monitor enterprise server H/W) ___________________ : YES
    nfacct (gather netfilter accounting) _______________________ : YES
    perf (collect kernel performance events) ___________________ : YES
    slabinfo (monitor kernel object caching) ___________________ : YES
    Xen ________________________________________________________ : YES
    Xen VBD Error Tracking _____________________________________ : NO
    Logs Management ____________________________________________ : YES
Exporters:
    AWS Kinesis ________________________________________________ : NO
    GCP PubSub _________________________________________________ : NO
    MongoDB ____________________________________________________ : YES
    Prometheus (OpenMetrics) Exporter __________________________ : YES
    Prometheus Remote Write ____________________________________ : YES
    Graphite ___________________________________________________ : YES
    Graphite HTTP / HTTPS ______________________________________ : YES
    JSON _______________________________________________________ : YES
    JSON HTTP / HTTPS __________________________________________ : YES
    OpenTSDB ___________________________________________________ : YES
    OpenTSDB HTTP / HTTPS ______________________________________ : YES
    All Metrics API ____________________________________________ : YES
    Shell (use metrics in shell scripts) _______________________ : YES
Debug/Developer Features:
    Trace All Netdata Allocations (with charts) ________________ : NO
    Developer Mode (more runtime checks, slower) _______________ : NO

Additional info

No response

ilyam8 commented 5 months ago

Hi, @CZAirwolf. This is the systemd journal, not Netdata. Looking at your configuration, I think it was applied to the default namespace and not the netdata namespace. Check out the systemd journal documentation - how to apply configuration to namespaces.

CZAirwolf commented 5 months ago

Installing the netdata package from netdata.cloud creates /etc/logrotate.d/netdata for log handling. When the same package configures the systemd journal for logging in the netdata namespace, why do you want to ignore those netdata journal logs???


ilyam8 commented 5 months ago

The logrotate file has nothing to do with systemd journal logs. systemd-journald (not Netdata) is responsible for rotating. As I mentioned above, your configuration is only for the default systemd journal namespace.
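
One way to confirm which journald instance owns the space is to ask journalctl about the namespace directly. This is a minimal sketch, assuming the namespace is called netdata (as the .netdata directory suffix in the report suggests) and a journalctl recent enough to support --namespace. namespace_usage and namespace_vacuum are hypothetical helper names, and the JOURNALCTL variable exists only so the binary can be swapped out.

```shell
#!/bin/sh
# Binary to call; overridable so the helpers can be exercised without systemd.
JOURNALCTL="${JOURNALCTL:-journalctl}"

# Hypothetical helper: report the disk space used by one journal namespace.
# $1 = namespace name, e.g. netdata
namespace_usage() {
    "$JOURNALCTL" --namespace="$1" --disk-usage
}

# Hypothetical helper: delete the oldest archived journal files of a
# namespace until they fit in the given size. The active file is not touched.
# $1 = namespace name, $2 = size such as 250M
namespace_vacuum() {
    "$JOURNALCTL" --namespace="$1" --vacuum-size="$2"
}

# Usage (typically as root):
#   namespace_usage netdata
#   namespace_vacuum netdata 250M
```

Vacuuming is a one-off cleanup. It does not set a persistent limit, so the files would be expected to grow again until a limit is configured for that namespace.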

CZAirwolf commented 5 months ago

Please remove /etc/logrotate.d/netdata, it's not a Netdata problem, rsyslog (etc.) is responsible for rotating.

Really stupid answer.

If you create dedicated logs (especially not covered by default settings), you are responsible for ensuring the rotation.

ilyam8 commented 5 months ago

You are confusing text log files (/var/log/) and systemd journal binary logs (/var/log/journal). logrotate rotates and compresses log files (/var/log/netdata/*), but Netdata does not write to files when running as a systemd service.

systemd-journald (systemd's logging daemon) is writing/rotating /var/log/journal/*.


I have limited journal logs via /etc/systemd/journald.conf.d# cat override.con

From man journald.conf:

The systemd-journald instance managing the default namespace is configured by /etc/systemd/journald.conf and associated drop-ins.
Instances managing other namespaces read /etc/systemd/journald@NAMESPACE.conf and associated drop-ins with the namespace identifier filled in.

^^ That is why I said that

Looking at your configuration, I think it was applied to the default namespace and not the netdata namespace
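
The man page excerpt above implies that limits for the Netdata journal belong in a file named after the namespace. This is a minimal sketch, assuming the namespace is netdata and reusing the 250M and 100M values from the original report. journald_conf_path and journal_limits are hypothetical helper names. SystemMaxFileSize= is used here because it is the documented journald.conf option for the per-file size.

```shell
#!/bin/sh
# Hypothetical helper: print the main config file journald reads for a
# namespace. An empty argument means the default namespace.
journald_conf_path() {
    if [ -z "$1" ]; then
        printf '%s\n' "/etc/systemd/journald.conf"
    else
        printf '%s\n' "/etc/systemd/journald@$1.conf"
    fi
}

# Hypothetical helper: print a [Journal] section with size limits.
# $1 = total cap (SystemMaxUse), $2 = per-file cap (SystemMaxFileSize)
journal_limits() {
    printf '[Journal]\nSystemMaxUse=%s\nSystemMaxFileSize=%s\n' "$1" "$2"
}

# Usage (as root), with the values from the report:
#   journal_limits 250M 100M > "$(journald_conf_path netdata)"
#   systemctl restart systemd-journald@netdata.service
```

The restart line assumes the namespaced instance runs as systemd-journald@netdata.service, following the systemd-journald@NAMESPACE.service template. Check the unit name on the target system before relying on it.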

ilyam8 commented 5 months ago

@Ferroin, hey. The default limit for journal files (SystemMaxUse/RuntimeMaxUse) is 4GB. I think it makes sense to install /etc/systemd/journald@netdata.conf with smaller values. What do you think?
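
For context on the 4GB figure: journald.conf documents the SystemMaxUse default as 10% of the file system size, capped at 4G. The arithmetic can be sketched as below, assuming the disk capacity from the build info in the report (171798691840 bytes) is a single file system that holds /var/log/journal. default_system_max_use is a hypothetical helper name.

```shell
#!/bin/sh
# Sketch of the documented default: 10% of the file system, capped at 4G.
# $1 = file system size in bytes
default_system_max_use() {
    fs_bytes="$1"
    cap=$((4 * 1024 * 1024 * 1024))   # 4G; journald size suffixes are base 1024
    tenth=$((fs_bytes / 10))
    if [ "$tenth" -gt "$cap" ]; then
        echo "$cap"
    else
        echo "$tenth"
    fi
}

# Disk capacity taken from the report (assumed to be one file system).
default_system_max_use 171798691840   # prints 4294967296
```

Under that assumption, 10% of the disk is 16 GiB, so the 4G cap applies. An unconfigured netdata namespace could therefore grow well beyond the 446M shown in the report.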