Hi - I'm new to using fclones and have found it to be a huge improvement over the other duplicate file finders out there. However, I'm having a hard time making sense of the --unique flag. For example, given this setup:
mkdir foo
echo a > foo/a
echo b > foo/b
echo c > foo/c
echo c > foo/copy_of_c
Output of fclones group --unique foo:
[2022-08-25 16:03:47.382] fclones: info: Started grouping
[2022-08-25 16:03:47.384] fclones: info: Scanned 5 file entries
[2022-08-25 16:03:47.384] fclones: info: Found 4 (8 B) files matching selection criteria
[2022-08-25 16:03:47.384] fclones: info: Found 0 (0 B) candidates after grouping by size
[2022-08-25 16:03:47.384] fclones: info: Found 0 (0 B) candidates after grouping by paths
[2022-08-25 16:03:47.386] fclones: info: Found 2 (4 B) candidates after grouping by prefix
[2022-08-25 16:03:47.386] fclones: info: Found 2 (4 B) candidates after grouping by suffix
[2022-08-25 16:03:47.386] fclones: info: Found 2 (4 B) unique files
# Report by fclones 0.27.0
# Timestamp: 2022-08-25 16:03:47.386 -0400
# Command: fclones group --unique foo
# Base dir: /home/dan
# Total: 8 B (8 B) in 4 files in 3 groups
# Redundant: 0 B (0 B) in 0 files
# Missing: 4 B (4 B) in 2 files
6f973377854c3f70db84707e1de8d1a0, 2 B (2 B) * 1:
/home/dan/foo/a
57f77e37a6de146f34541732cef23436, 2 B (2 B) * 2:
/home/dan/foo/c
/home/dan/foo/copy_of_c
13385bf32d48b5c03331333a6a16c7bd, 2 B (2 B) * 1:
/home/dan/foo/b
I'm surprised to see c and copy_of_c at all. The CSV format makes the difference easiest to spot, thanks to the file count column:
[2022-08-25 16:04:51.621] fclones: info: Started grouping
[2022-08-25 16:04:51.622] fclones: info: Scanned 5 file entries
[2022-08-25 16:04:51.622] fclones: info: Found 4 (8 B) files matching selection criteria
[2022-08-25 16:04:51.623] fclones: info: Found 0 (0 B) candidates after grouping by size
[2022-08-25 16:04:51.623] fclones: info: Found 0 (0 B) candidates after grouping by paths
[2022-08-25 16:04:51.628] fclones: info: Found 2 (4 B) candidates after grouping by prefix
[2022-08-25 16:04:51.628] fclones: info: Found 2 (4 B) candidates after grouping by suffix
[2022-08-25 16:04:51.628] fclones: info: Found 2 (4 B) unique files
size,hash,count,files
2,6f973377854c3f70db84707e1de8d1a0,1,/home/dan/foo/a
2,57f77e37a6de146f34541732cef23436,2,/home/dan/foo/c,/home/dan/foo/copy_of_c
2,13385bf32d48b5c03331333a6a16c7bd,1,/home/dan/foo/b
Even then, I still have to filter the output myself. That is complicated by the CSV output not escaping the commas, so typical CLI tools treat it as invalid CSV (would you accept a PR that quotes the files column?).
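For reference, here is a rough sketch of the kind of filter I mean. It assumes the CSV rows arrive on stdin and that the size, hash, and count columns never contain commas, so everything after the third comma can be treated as the file list. The helper name unique_only is made up for illustration, and the sample rows are copied from the output above.

```shell
# Hypothetical helper: read fclones CSV rows on stdin and print the path of
# every group whose count column is exactly 1.
unique_only() {
  awk -F, '$3 == "1" {
    # Assumption: size, hash, and count contain no commas, so the rest of the
    # line after the third comma is the single path, even if it has commas.
    prefix = $1 FS $2 FS $3 FS
    print substr($0, length(prefix) + 1)
  }'
}

# Sample rows copied from the CSV report above.
printf '%s\n' \
  'size,hash,count,files' \
  '2,6f973377854c3f70db84707e1de8d1a0,1,/home/dan/foo/a' \
  '2,57f77e37a6de146f34541732cef23436,2,/home/dan/foo/c,/home/dan/foo/copy_of_c' \
  '2,13385bf32d48b5c03331333a6a16c7bd,1,/home/dan/foo/b' |
  unique_only
```

This only works because a group of size 1 has exactly one path. For larger groups, a path containing a comma cannot be told apart from two separate paths, which is why quoting the files column would help.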
Is group --unique expected to display non-unique files in its output? I had expected it to produce only groups containing a single file, the inverse of the normal behavior.
Thanks!