adrianlopezroche / fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

Documentation Confusion with -H #140

Open badlandz opened 4 years ago

badlandz commented 4 years ago

I'm confused by the documentation for -H, which in my version says:

-H --hardlinks
              normally, when two or more files point to the same disk area they are treated as non-duplicates; this option will change this behavior

What I would like to do is find two identical pieces of data on two different disk areas, and link the file names to the same single area of data and eliminate the duplicate data that is wasting disk space.

I previously thought this option accomplished this goal. I thought I read somewhere it would. But the man page explanation doesn't seem to say it will. It looks like it will find two file names that point to the same data on the disk (a hard link) and create a copy of the data, so each file name points to a separate copy of the data.

Is it an issue of wording in the man page? What happens with -H? And how do I do what I'm trying to do?

jbruchon commented 4 years ago

That option will cause files that are ALREADY hard-linked to be considered duplicates for printing/deleting/etc. As far as I can tell, the program does not yet have an option that creates hard links.
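To make "already hard-linked" concrete: two names are hard links to the same data when they refer to the same inode on the same device. The sketch below shows that check in Python. It is only an illustration of the concept, not fdupes' actual code, and the function name `already_hardlinked` is made up for this example.

```python
import os

def already_hardlinked(path_a, path_b):
    """Return True if both paths name the same inode on the same device.

    Hypothetical helper for illustration; os.stat() follows symlinks,
    so a symlink is compared by the file it points to.
    """
    a = os.stat(path_a)
    b = os.stat(path_b)
    return (a.st_dev, a.st_ino) == (b.st_dev, b.st_ino)
```

Two files with identical contents but separate inodes return False here, which matches the distinction the man page is drawing between "same disk area" and ordinary duplicates.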

@adrianlopezroche I was just thinking: what if you tacked something onto the deletion options that would delete the target and replace it with a hard link rather than just deleting it? That would be relatively easy to add and would reuse all of the existing deletion code. Perhaps if a number at the old-style prompt starts with "L", it means keep as a hard link?
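A rough sketch of the "delete the target and replace it with a hard link" idea, written in Python rather than fdupes' C. Everything here is an assumption for illustration: the function name `replace_with_hardlink`, the temporary-name suffix, and the choice to link to a temporary name first and then rename over the duplicate so the duplicate is never missing if the link step fails.

```python
import filecmp
import os

def replace_with_hardlink(keep, duplicate):
    """Replace `duplicate` with a hard link to `keep` if contents match.

    Hypothetical helper, not part of fdupes. Returns False when the two
    paths are already the same file, True after a successful replacement.
    """
    if os.path.samefile(keep, duplicate):
        return False  # already hard-linked; nothing to do
    # Byte-for-byte comparison (shallow=False) before touching anything.
    if not filecmp.cmp(keep, duplicate, shallow=False):
        raise ValueError("files differ; refusing to link")
    tmp = duplicate + ".hardlink-tmp"  # hypothetical temporary name
    os.link(keep, tmp)          # raises OSError if linking is not possible
    os.replace(tmp, duplicate)  # rename over the duplicate in one step
    return True
```

Note that after the replacement the duplicate's own permissions, ownership, and timestamps are gone, since both names now share one inode; a real implementation would need to decide whether that is acceptable.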

badlandz commented 4 years ago

@jbruchon Thanks, glad I read the man page before running it! I think -L is what I was looking for; its description seems right, according to both my man page and an article I found at https://easyengine.io/tutorials/linux/fdupes-duplicate-hardlinks/

That sounds like what you are proposing to @adrianlopezroche, so maybe it's already there? The only outstanding question I have is about two file systems. I don't know if hard links will work reliably when the directory tree spans two file systems (I have some stuff in XFS and some in a ZFS pool), and I'm unsure what -L would do in that situation.
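On the two-file-systems question in general (independent of what any particular fdupes build does): a hard link cannot span file systems, and the underlying link() call fails with the error EXDEV when asked to. The Python sketch below shows how a tool could detect that case and skip the pair instead of aborting. The function name `try_hardlink` is hypothetical, and how the patched -L behaves in this situation is not confirmed here.

```python
import errno
import os

def try_hardlink(source, new_name):
    """Attempt to create `new_name` as a hard link to `source`.

    Hypothetical helper for illustration. Returns False when the link
    would cross file systems (EXDEV); any other error is re-raised.
    """
    try:
        os.link(source, new_name)
    except OSError as exc:
        if exc.errno == errno.EXDEV:
            return False  # e.g. source on XFS, new_name on ZFS
        raise
    return True
```

In a mixed XFS/ZFS tree, duplicates that live on the same file system could still be linked to each other; only pairs that straddle the two would have to stay as separate copies.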

jbruchon commented 4 years ago

-L specifies a size limit. There was a patch in Debian against fdupes-1.51 that added a -L option which performs hard linking, but it never made it into mainline fdupes.

jvacek commented 4 years ago

@badlandz that article references a fork of this project https://github.com/tobiasschulz/fdupes