kdave / btrfs-progs

Development of userspace BTRFS tools
GNU General Public License v2.0

Does the current version of btrfs support host-managed garbage collection on Zoned Namespace SSDs? #768

Open sg20180546 opened 3 months ago

sg20180546 commented 3 months ago

Hello, I am a Zoned Namespace (ZNS) SSD researcher from South Korea.

I have a question about btrfs, but I could not find a way to ask the btrfs maintainers.

So my question is:

Does the latest version of btrfs implement host-managed garbage collection?

If I repeatedly run a write-intensive filebench or RocksDB fillrandom workload and then delete every file in the mount directory, btrfs emits the error message 'BTRFS: error (device nvme0n2: state A) in btrfs_del_csums:1014: errno=-28 No space left'.

But the files have all been deleted, so the invalidated space should be zone-reset and reclaimed.

However, as far as I know, there is no code that performs the copy operations for host-managed GC. Is it not supported, or am I missing something important?

Thanks.

kdave commented 3 months ago

I have a question about btrfs, but I could not find a way to ask the btrfs maintainers.

It's OK to ask here.

Yes, garbage collection is supported, though it's not called that. In the code you can find references to zone_unusable and block group reclaim. The main function that does all the work is btrfs_reclaim_bgs_work().

But the files have all been deleted, so the invalidated space should be zone-reset and reclaimed.

The workload you describe could fill the drive faster than the reclaim can keep up, so it ends in ENOSPC at some point. You'd need to watch how much unreclaimable space there is (e.g. in btrfs fi df) and either wait until it's gone or run btrfs filesystem balance to reclaim the space faster (although for now there are no specific filters for the unusable space).
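One way to watch this from a script is to read the per-space-info counters that btrfs exports in sysfs, instead of parsing btrfs fi df output. The following is only a minimal sketch, assuming the kernel exposes bytes_zone_unusable and total_bytes under /sys/fs/btrfs/<UUID>/allocation/data/; check that those files exist on your kernel before relying on them. The function names are made up for illustration.

```python
from pathlib import Path
import time


def read_counter(path: Path) -> int:
    """Read a single integer from a sysfs-style file."""
    return int(path.read_text().strip())


def zone_unusable_ratio(fs_dir: Path, space: str = "data") -> float:
    """Return bytes_zone_unusable / total_bytes for one space type.

    fs_dir is assumed to be /sys/fs/btrfs/<UUID>; the file names below
    are an assumption about the sysfs layout, so verify them locally.
    """
    base = fs_dir / "allocation" / space
    unusable = read_counter(base / "bytes_zone_unusable")
    total = read_counter(base / "total_bytes")
    return unusable / total if total else 0.0


def wait_for_reclaim(fs_dir: Path, max_ratio: float = 0.1,
                     interval_s: float = 5.0, timeout_s: float = 600.0) -> bool:
    """Poll until the unusable ratio drops to max_ratio or the timeout expires."""
    deadline = time.monotonic() + timeout_s
    while True:
        if zone_unusable_ratio(fs_dir) <= max_ratio:
            return True   # reclaim has caught up
        if time.monotonic() >= deadline:
            return False  # still too much unusable space
        time.sleep(interval_s)
```

A benchmark script could call wait_for_reclaim() between the delete phase and the next write phase, and fall back to a manual balance if it returns False.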

However, as far as I know, there is no code that performs the copy operations for host-managed GC. Is it not supported, or am I missing something important?

I'm not sure I understand what you mean by copy operations and GC; it still sounds like what the reclaim process does. Block groups that have more unusable space than a threshold (configurable in sysfs) are put on a list and then reclaimed. This is done using the relocation mechanism, which copies the old data to a new location and then updates the pointers.
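As a rough illustration of that description, here is a toy model of the threshold check and the copy-then-repoint step. It is not the kernel code: the BlockGroup class, needs_reclaim(), relocate() and the pointers map are hypothetical names, and treating a threshold of 0 as "disabled" is an assumption of the sketch.

```python
from dataclasses import dataclass, field


@dataclass
class BlockGroup:
    length: int                                  # block group size in bytes
    extents: dict = field(default_factory=dict)  # extent id -> live data size
    zone_unusable: int = 0                       # freed but not yet reusable bytes


def needs_reclaim(bg: BlockGroup, threshold_pct: int) -> bool:
    """True if unusable space reaches threshold_pct of the block group."""
    if threshold_pct == 0:  # assumption: 0 means automatic reclaim is off
        return False
    return bg.zone_unusable * 100 >= bg.length * threshold_pct


def relocate(old: BlockGroup, new: BlockGroup, pointers: dict) -> None:
    """Copy live extents from old to new, update pointers, then reset old."""
    for extent_id, size in list(old.extents.items()):
        new.extents[extent_id] = size  # copy the data to the new location
        pointers[extent_id] = new      # repoint references to the copy
        del old.extents[extent_id]
    old.zone_unusable = 0              # models the zone reset of the old group
```

In this model a block group with 70 of 100 bytes unusable is picked up at a 70% threshold but not at 75%, and after relocate() every pointer refers to the new block group while the old one is empty.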

* I am using kernel 6.4 and a WD ZN540

Zoned support is still improving with each release, so you could also verify the behaviour on newer kernels, 6.8 or even the latest 6.9-rcX. There was a recent fix, https://git.kernel.org/linus/a8b70c7f8600bc77d03c0b032c0662259b9e615e, that could be related to what you see.

kdave commented 3 months ago

Note for documentation: add a section about the reclaim/GC behaviour; it's missing.

naota commented 2 months ago

Hello @sg20180546, I'm sorry I missed your email several days ago.

As David said, btrfs_reclaim_bgs_work() reclaims existing BGs that still have some data in them.

Also, when a block group is fully unused, that block group is reset in btrfs_finish_extent_commit().

Regarding the early ENOSPC issue, there is an ongoing patch and discussion about that. In short, btrfs does the reclaiming, but it might not be fast enough.

https://lore.kernel.org/linux-btrfs/20240328-hans-v1-0-4cd558959407@kernel.org/