I'm missing one option that would (to my mind) be pretty useful:
skipping all sparse files during the search for duplicates.
Use case:
In many cases, sparse files are pre-allocated at their final size, and only once they are completely written (downloaded, generated, ...) do they become immutable and safe to deduplicate. While they are still incomplete (sparse), two such files can't safely be treated as the same file, even if they are currently identical at the bit level, because later writes will make them diverge. Reflinks are safe because they are copy-on-write, but many filesystems don't support them. Skipping sparse files and using hardlinks/symlinks should be a relatively safe workaround.
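To illustrate what such an option could check, here is a rough sketch of one common heuristic: a file is treated as sparse when fewer bytes are allocated on disk than its apparent size. This is not the tool's actual code; the names `looks_sparse`, `is_sparse` and `skip_sparse` are made up for the example, and it assumes a Unix-like system where `os.stat` reports `st_blocks` in 512-byte units.

```python
import os

# Assumption: st_blocks is counted in 512-byte units (Linux and most Unix
# systems). The field is not available on Windows.
BLOCK_UNIT = 512


def looks_sparse(apparent_size, allocated_blocks):
    """Return True if less space is allocated than the apparent file size."""
    return allocated_blocks * BLOCK_UNIT < apparent_size


def is_sparse(path):
    """Apply the heuristic to a real file via os.stat()."""
    st = os.stat(path)
    return looks_sparse(st.st_size, st.st_blocks)


def skip_sparse(paths, sparse_check=is_sparse):
    """Hypothetical filter: keep only paths that do not look sparse."""
    return [p for p in paths if not sparse_check(p)]
```

The check is only a heuristic: transparent compression or inline storage of small files can also make a file look sparse. Such false positives would only cause a file to be skipped, which errs on the safe side for this use case.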