Zygo / bees

Best-Effort Extent-Same, a btrfs dedupe agent
GNU General Public License v3.0

Oversize extents #147

Closed mithrandi closed 4 years ago

mithrandi commented 4 years ago

A filesystem created by an old btrfs may have extents larger than 128 MiB, either due to bugs or because they predate the extent size limit. If bees tries to deduplicate such an extent, it throws: `failed constraint check (src.size() < BLOCK_SIZE_MAX_TEMP_FILE)`

Do you think that's worth a FAQ entry or some such?

mithrandi commented 4 years ago

Given how obscure this is, I'll just leave this issue for any future Googlers.

kakra commented 4 years ago

Maybe you could just manually rewrite such files? If this was a bug in earlier btrfs versions, you probably don't want those files sticking around with oversized extents anyway.
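A minimal sketch of what "manually rewrite" could look like, assuming GNU coreutils; the path is a placeholder and the temp-file dance is just one way to force btrfs to allocate fresh (size-limited) extents. `--reflink=never` makes cp perform a real data copy rather than a clone, so the kernel writes new extents for the copy:

```shell
# Sketch: rewrite a file in place so btrfs allocates new extents.
# The file here is a stand-in created for illustration; on a real
# filesystem you would point "$f" at the affected file instead.
f=$(mktemp)
printf 'example data\n' > "$f"
cp -p --reflink=never "$f" "$f.tmp"   # force a full data copy (new extents)
mv "$f.tmp" "$f"                      # atomically replace the original
cat "$f"
```

On btrfs specifically, `btrfs filesystem defragment` with a target extent size (`-t`) may achieve the same rewrite without the temp-file step, though whether it splits an existing oversized extent depends on the kernel version.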

Zygo commented 4 years ago

You can increase the size limits in bees and recompile. The limits are just there to prevent creation of multi-petabyte temporary files if there's junk in the metadata or a bug in the kernel ioctls. This used to be a problem with FIEMAP, but bees hasn't used FIEMAP in years. There's no technical reason why bees can't handle a 10GB extent if you have one.

I'll take this into consideration when doing the extent-based rewrite.