pkolaczk / fclones

Efficient Duplicate File Finder
MIT License

isolate reports duplicates under same root if they also exist elsewhere #173

Closed: felciano closed this issue 1 year ago

felciano commented 1 year ago

According to the README, --isolate finds files that match across two directory trees, without matching identical files within each tree. However, this doesn't seem to be the case. Consider this file structure, where all the files are identical:

dir-1/A.jpg
dir-1/A copy.jpg
dir-2/A.jpg

Then run the following:

fclones group dir-1 dir-2 --isolate

I would expect this to report only duplicates that span the two trees: files in dir-1 that also exist in dir-2, and vice versa. Instead I get:

815e2d46660c7176848ad3900fb7a456, 1019282 B (1019.3 KB) * 3:
    /Volumes/Main/fclones/dir-1/A copy.JPG
    /Volumes/Main/fclones/dir-1/A.JPG
    /Volumes/Main/fclones/dir-2/A.JPG

The first two entries in this report indicate that A.JPG and A copy.JPG are duplicates, which is true, but both are in dir-1, so I expected the --isolate flag to exclude that match.

pkolaczk commented 1 year ago

This is not a bug; it works as designed. Fclones always reports both sides of a duplicate match, because it has no idea which of the duplicates you want to remove. If multiple files are present under one isolated root, they are counted as one, but all of them are still reported.
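
To illustrate the rule described above, here is a minimal Python sketch. It is not fclones' actual implementation, and the function name `isolated_groups` and its input format are hypothetical. It assumes that a group qualifies only when its files span at least two distinct roots, and that every file in a qualifying group is then listed.

```python
from collections import defaultdict
from pathlib import PurePosixPath


def isolated_groups(files, roots):
    """Hypothetical sketch of the behavior described above.

    files: iterable of (path, content_hash) pairs
    roots: list of root directories passed on the command line
    """
    # Group all paths by their content hash.
    by_hash = defaultdict(list)
    for path, digest in files:
        by_hash[digest].append(path)

    def root_of(path):
        # Return the root that contains this path, or None if there is none.
        parents = PurePosixPath(path).parents
        for root in roots:
            if PurePosixPath(root) in parents:
                return root
        return None

    result = {}
    for digest, paths in by_hash.items():
        # Files under the same root count as one when deciding
        # whether the group is a cross-root duplicate...
        distinct_roots = {root_of(p) for p in paths}
        if len(distinct_roots) >= 2:
            # ...but every file in the group is still reported.
            result[digest] = sorted(paths)
    return result
```

With the three files from the original report, the group spans both dir-1 and dir-2, so all three paths appear in the result. With only the two files in dir-1, the group would span a single root and nothing would be reported.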

felciano commented 1 year ago

@pkolaczk this makes sense -- thanks for the clarification.

In the isolate scenario, is there a way to tell which of the duplicates in the first directory will be used if you elect the link option?

That is, under the above scenario, if I tell fclones that I want to replace duplicates with links, will file .../dir-2/A.JPG end up being linked to .../dir-1/A.JPG or .../dir-1/A copy.JPG?