
Reverse of `bcachefs data rereplicate`, and other attribute propagation #631

Open arduano opened 6 months ago

arduano commented 6 months ago

Hi! I've just been experimenting with bcachefs replication and the behavior of the filesystem in response to changing attributes.

In this case, I noticed that attributes are only applied to a file when the file is created, *except* for increasing replicas, which can be applied afterwards with `bcachefs data rereplicate`. This applies to attributes such as compression, reducing replicas, etc.

I couldn't find any bcachefs command that helps propagate attributes to existing files.
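
For concreteness, here's the kind of sequence I mean. A sketch, assuming the `bcachefs.` extended-attribute interface for per-file/directory options described in the bcachefs documentation; paths and option values are illustrative:

```sh
# Apply compression to an existing directory via the xattr interface
setfattr -n bcachefs.compression -v lz4 /bcachefs/mydir

# Increasing replicas *can* be propagated to existing data afterwards:
setfattr -n bcachefs.data_replicas -v 2 /bcachefs/mydir
bcachefs data rereplicate /bcachefs

# ...but compression (and reducing replicas) only takes effect for newly
# written files; existing files keep whatever attributes they had.
```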

bcachefs version 1.3.5

arduano commented 6 months ago

It appears that manually re-creating the files doesn't apply the attributes either; I assume the filesystem de-duplicates the data automatically.

Some weird behavior I observed though:

This does nothing; it appears to de-duplicate the file, preserving the old attributes even when new ones have been applied to the folder:

$ cat /bcachefs/myfile > /bcachefs/myfile.copied

This doesn't de-duplicate; it copies the file in full (applying the new attributes) each time, even when a file with the same data is already present, both with old and new attributes:

$ cat /other_fs/the_same_file > /bcachefs/myfile.copied
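
One way to check which options a copy actually picked up is the read-only `bcachefs_effective.` xattr namespace (assuming the namespace from the bcachefs documentation; note it shows the option that currently applies to the inode, which may differ from how old extents are actually stored):

```sh
# Compare the effective compression option on the original and the copy
getfattr -n bcachefs_effective.compression /bcachefs/myfile
getfattr -n bcachefs_effective.compression /bcachefs/myfile.copied
```
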
arduano commented 6 months ago

Context:

I set up a dummy 4x4GB bcachefs filesystem and tested applying `compression`, `background_compression`, and `replicas` on various folders inside it; existing data was never affected, while any new data I copied in was.

I was monitoring the drive with `bcachefs fs usage -h`.
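
For reference, a rough sketch of how such a dummy setup can be reproduced with loopback devices (device paths and sizes are illustrative; `--replicas=` and the colon-separated multi-device mount syntax are per the bcachefs docs):

```sh
# Create four sparse 4GB backing files and attach them as loop devices
truncate -s 4G /tmp/bch{0..3}.img
DEVS=$(for f in /tmp/bch{0..3}.img; do losetup -f --show "$f"; done)

# Format a multi-device bcachefs filesystem with 2 replicas and mount it
bcachefs format --replicas=2 $DEVS
mkdir -p /mnt/bcachefs
mount -t bcachefs "$(echo $DEVS | tr ' ' ':')" /mnt/bcachefs

# Watch per-device usage while copying data in
bcachefs fs usage -h /mnt/bcachefs
```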

arduano commented 6 months ago

Possible explanation from the bcachefs IRC chat:

The rebalance thread isn't being triggered, and my drive's size is too small for bcachefs to notice the files (i.e. exceed the remaining io wait) and spawn the thread. At the moment there is no manual way to trigger the rebalance thread.
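
In case it helps others debugging this, the rebalance state can be poked at from userspace; a sketch, assuming `bcachefs show-super` output contains the external UUID, and that your kernel exposes rebalance nodes under the per-filesystem sysfs directory (the exact node names vary by version and are assumptions here; `ls` the directory first):

```sh
# Is a rebalance kthread running? (thread naming varies by kernel version)
ps -e -o comm= | grep -i rebalance

# Per-filesystem internals live in sysfs, keyed by the external UUID
UUID=$(bcachefs show-super /dev/loop0 | grep -i 'external uuid' | awk '{print $NF}')
ls /sys/fs/bcachefs/"$UUID"/internal/
```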

koverstreet commented 6 months ago

There's another wrinkle, which is that in the upstream version, rebalance only looks at the `background_target` and `background_compression` options. I just changed it to also consider the `compression` option if `background_compression` isn't set; this is in my master branch.
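
To make the current behavior concrete: a sketch of the options the upstream rebalance thread acts on, set per-directory via xattrs (assuming the `bcachefs.` namespace from the docs; the `zstd` value and the `hdd` disk-group label are hypothetical examples):

```sh
# background_compression is picked up by the rebalance thread, which
# recompresses matching extents in the background
setfattr -n bcachefs.background_compression -v zstd /mnt/bcachefs/dir

# background_target likewise has rebalance move extents to the given
# target device or disk group
setfattr -n bcachefs.background_target -v hdd /mnt/bcachefs/dir
```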

We do want rebalance to consider other options like the ones you're describing, but we'll want to make sure we've thought through all the implications of generating potentially a ton of work like that - e.g. if the user flips the checksum option, do we really want to go through and re-checksum everything, potentially making everything else wait?

arduano commented 6 months ago

@koverstreet would it be feasible to have a multi-operation priority-queue kind of approach? That is, have multiple filesystem tree walkers/iterators (not sure what your preferred terminology is), with the rebalance dispatcher/daemon pulling from the queues evenly, or with a certain ratio (e.g. giving priority to writeback cache transfer while also running foreground compression rebalance at a lower priority).