longhorn / longhorn

Cloud-Native distributed storage built on and for Kubernetes
https://longhorn.io
Apache License 2.0

[BUG] 40Gb of longhorn metadata in a pv? #8468

Closed simonebenati closed 5 months ago

simonebenati commented 5 months ago

Describe the bug

The size of my PV is unclear: it should be around 50 GB, but it is 90 GB.

To Reproduce

Import a database dump into a Longhorn-backed database workload, then perform some setup operations.

Expected behavior

I expect the PV to be close to the size of the database, not 30 to 40 GB larger.

Support bundle for troubleshooting

supportbundle_0e2fe36f-873b-4027-bc13-635cc8f1d41c_2024-04-29T07-40-29Z.zip

Environment

Additional context

Steps I have already tried to "shrink" the actual used size:

I tried creating a snapshot and then manually deleting it, also without luck.

I expect the Longhorn volume's actual size to be close to these values:

df -h
Filesystem                                              Size  Used Avail Use% Mounted on
overlay                                                  98G   13G   81G  14% /
tmpfs                                                    64M     0   64M   0% /dev
tmpfs                                                    16G     0   16G   0% /sys/fs/cgroup
/dev/mapper/ubuntu--vg-ubuntu--lv                        98G   13G   81G  14% /run
tmpfs                                                    28G   20K   28G   1% /dev/shm
tmpfs                                                    28G   36K   28G   1% /etc/app-secret
/dev/longhorn/pvc-6dc6bc75-bb64-4997-895c-eb259d7af8b8  688G   52G  637G   8% /var/lib/postgresql/data
du -sh /var/lib/postgresql/data/pgdata/*
4.0K    /var/lib/postgresql/data/pgdata/backup_label.old
11M     /var/lib/postgresql/data/pgdata/backup_manifest
51G     /var/lib/postgresql/data/pgdata/base
0       /var/lib/postgresql/data/pgdata/cnpg_initialized-postgresql-cluster-2
0       /var/lib/postgresql/data/pgdata/cnpg_initialized-postgresql-cluster-5
0       /var/lib/postgresql/data/pgdata/cnpg_initialized-postgresql-cluster-7
4.0K    /var/lib/postgresql/data/pgdata/current_logfiles
4.0K    /var/lib/postgresql/data/pgdata/custom.conf
4.2M    /var/lib/postgresql/data/pgdata/global
4.0K    /var/lib/postgresql/data/pgdata/override.conf
4.0K    /var/lib/postgresql/data/pgdata/pg_commit_ts
4.0K    /var/lib/postgresql/data/pgdata/pg_dynshmem
4.0K    /var/lib/postgresql/data/pgdata/pg_hba.conf
4.0K    /var/lib/postgresql/data/pgdata/pg_ident.conf
16K     /var/lib/postgresql/data/pgdata/pg_logical
28K     /var/lib/postgresql/data/pgdata/pg_multixact
4.0K    /var/lib/postgresql/data/pgdata/pg_notify
12K     /var/lib/postgresql/data/pgdata/pg_replslot
4.0K    /var/lib/postgresql/data/pgdata/pg_serial
4.0K    /var/lib/postgresql/data/pgdata/pg_snapshots
4.0K    /var/lib/postgresql/data/pgdata/pg_stat
24K     /var/lib/postgresql/data/pgdata/pg_stat_tmp
144K    /var/lib/postgresql/data/pgdata/pg_subtrans
4.0K    /var/lib/postgresql/data/pgdata/pg_tblspc
4.0K    /var/lib/postgresql/data/pgdata/pg_twophase
4.0K    /var/lib/postgresql/data/pgdata/PG_VERSION
737M    /var/lib/postgresql/data/pgdata/pg_wal
116K    /var/lib/postgresql/data/pgdata/pg_xact
4.0K    /var/lib/postgresql/data/pgdata/postgresql.auto.conf
28K     /var/lib/postgresql/data/pgdata/postgresql.conf
4.0K    /var/lib/postgresql/data/pgdata/postmaster.opts
4.0K    /var/lib/postgresql/data/pgdata/postmaster.pid
0       /var/lib/postgresql/data/pgdata/standby.signal

derekbit commented 5 months ago

Can you elaborate more on your question?

simonebenati commented 5 months ago

Hello @derekbit, the issue is:

I do not understand the discrepancy between the real usage of the PV and the actual size displayed by Longhorn. I believe a 40 GB difference isn't normal. It isn't trimmable, and it isn't caused by accumulated snapshots. I don't understand what these 40 GB are or how to get rid of them. As you can see from the previous message, the real usage of the PV according to df -h inside the pod is 52 GB. Thank you.

derekbit commented 5 months ago

For an explanation of the actual size, please see the official documentation: https://longhorn.io/docs/1.6.1/nodes-and-volumes/volumes/volume-size/
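
To make the idea behind that page concrete, here is a toy Python model of a block device with a snapshot chain. It is only a sketch, not Longhorn code: the `ToyVolume` class, its method names, and the block counts are all hypothetical, and it assumes (as a simplification) that a trim only releases blocks from the writable head and never from an existing snapshot. The scenario of 90 blocks written and 38 later freed is invented to mirror the numbers in this thread, not taken from the support bundle.

```python
class ToyVolume:
    """Simplified, hypothetical model of a block volume with snapshots."""

    def __init__(self):
        self.snapshots = []  # frozen block sets, oldest first
        self.head = set()    # blocks written since the last snapshot
        self.live = set()    # blocks the filesystem still considers in use

    def write(self, blocks):
        # A write allocates blocks in the head and in the filesystem.
        self.head |= set(blocks)
        self.live |= set(blocks)

    def delete(self, blocks):
        # The filesystem frees the blocks, but the block device is not told.
        self.live -= set(blocks)

    def snapshot(self):
        # Freeze the current head; new writes go to a fresh, empty head.
        self.snapshots.append(frozenset(self.head))
        self.head = set()

    def trim(self):
        # Assumption: trim can only release unused blocks from the head.
        self.head &= self.live

    def fs_used(self):
        # Roughly what df reports inside the pod.
        return len(self.live)

    def actual_size(self):
        # Roughly what the storage layer reports: every layer counts.
        return sum(len(s) for s in self.snapshots) + len(self.head)


vol = ToyVolume()
vol.write(range(90))       # import the dump plus temporary data
vol.delete(range(52, 90))  # setup operations free 38 blocks
vol.snapshot()             # the 90 blocks are now frozen in a snapshot
vol.trim()                 # nothing left in the head to release
print(vol.fs_used(), vol.actual_size())  # prints "52 90"
```

In this model the freed blocks stay counted for as long as the snapshot that holds them exists, which is why repeated trims change nothing once the data sits in a snapshot rather than in the head.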

simonebenati commented 5 months ago

Yes, I'm aware of that, which is why I tried running Trim Filesystem multiple times, but it won't trim. I do not understand why there is this 40 GB overhead that won't go away.

derekbit commented 5 months ago

You have a 90 GiB snapshot. That's why your actual size is larger than 90 GiB.

    snapshots:
      c9e8de6c-d65f-43c0-8956-80f351567267:
        children:
          volume-head: true
        created: "2024-04-29T07:05:12Z"
        labels: {}
        name: c9e8de6c-d65f-43c0-8956-80f351567267
        parent: "null"
        removed: true
        size: "97226928128"
        usercreated: true
      volume-head:
        children: {}
        created: "2024-04-29T07:05:12Z"
        labels: {}
        name: volume-head
        parent: c9e8de6c-d65f-43c0-8956-80f351567267
        removed: false
        size: "319488"
        usercreated: false
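
The arithmetic behind that answer can be checked with a few lines of Python. This is only a sketch: the two byte counts are copied from the `size` fields above, the 52 GiB figure is the `Used` column of the earlier `df -h` output (which reports in powers of 1024), and the variable and function names are made up for illustration. It also assumes the actual size is simply the sum of all layers in the chain.

```python
GIB = 1024 ** 3

# Sizes in bytes, taken from the snapshot listing above.
snapshot_sizes = {
    "c9e8de6c-d65f-43c0-8956-80f351567267": 97226928128,
    "volume-head": 319488,
}


def actual_size_bytes(sizes):
    """Sum every layer in the chain (assumed definition of actual size)."""
    return sum(sizes.values())


total = actual_size_bytes(snapshot_sizes)
fs_used_gib = 52  # "Used" column of df -h for the Longhorn device

print(f"actual size: {total / GIB:.2f} GiB")           # actual size: 90.55 GiB
print(f"gap vs. df: {total / GIB - fs_used_gib:.2f} GiB")  # gap vs. df: 38.55 GiB
```

Almost the whole 90 GiB sits in the single snapshot, while the volume head holds only about 312 KiB, so the roughly 40 GB gap is snapshot data rather than metadata.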