adrianlopezroche / fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

RFE: tell fdupes to always prefer a certain directory #132

Open macau23 opened 4 years ago

macau23 commented 4 years ago

I'd like to be able to tell fdupes to always prefer preserving files within a given directory when duplicates are found. At the moment the order in which duplicates are presented is not deterministic, which means that deleting duplicate files takes a lot longer:

$ fdupes -r --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./bbb/2.txt
[2] ./aaa/2.txt
Set 2 of 2, preserve files [1 - 2, all]: x

After the suggestion:

$ fdupes -r --prefer ./aaa/ --delete .
[1] ./aaa/1.txt
[2] ./bbb/1.txt
Set 1 of 2, preserve files [1 - 2, all]: x

[1] ./aaa/2.txt
[2] ./bbb/2.txt
Set 2 of 2, preserve files [1 - 2, all]: x

This enables me to use --noprompt without losing files from the wrong directory.

sandrotosi commented 4 years ago

I'd like to see this feature too!

da-sti commented 3 years ago

A much-needed feature for automatically deleting all the duplicate pictures that were imported again and again from and to different devices.

bjhartin commented 2 years ago

Can PR #144 be reviewed and merged?

This is a very needed feature for situations where directory A has some of the same files as B, but not in the same structure.

Happened to me when trying to merge a relative's photo library with mine. They had a lot of my pics but had rearranged them.

I want to run fdupes -dNr -keep=A A B

Friday13th87 commented 1 year ago

jdupes, a fdupes fork, does that with "-O". Since this feature has been requested for years, I assume it will never come to fdupes.

jbruchon commented 1 year ago

jdupes, a fdupes fork, is doing that with "-O"

I'm the author of jdupes. No, it does not. That's the parameter order priority flag and only controls sorting.

Friday13th87 commented 1 year ago

The order priority flag controls the sorting, meaning you can say that duplicates should be deleted in dir2 rather than in dir1, for example. OK, so far so good.

The question of the topic was:

"tell fdupes to always prefer a certain directory"

The ability to set priorities does exactly that: "prefer a directory". Meaning: jdupes -rNdO dir1/ dir2/ sets the preserve priority to dir1, so dir1 comes first and duplicates will be deleted in dir2 rather than in dir1, and that was the question.

My answer was correct in every way.

jbruchon commented 1 year ago

No, it's not. Your ego is not in question here; your correctness is. The parameter order controls the sorting, not the "preserve priority." Deletions will gladly nuke items in the first directory specified. It'll delete files in dir1 all day long. The request was to "always prefer to preserve files within a given directory." -O will (probably) preserve the first file in dir1 but all the rest of the files in dir1 in the set will be deleted. Your answer is only correct for the simplistic example in the original post. Most data sets are not nearly so simple. "The parameter order flag will 'always prefer to preserve files within a given directory'" is a false statement.

I will not entertain further discussion on this. You can't tell me I don't know how the program I wrote works.
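
The difference between "sorted first" and "always preserved" can be seen with a toy model. This is not jdupes code; it only assumes what the comment above states, namely that each set is sorted and that unattended deletion keeps the first entry of the set. The function name would_delete_keep_first and the sample paths are made up for illustration.

```shell
#!/bin/sh
# Toy model of "keep the first file of each set, delete the rest".
# Input:  duplicate sets, one path per line, blank line between sets.
# Output: the paths that a keep-first policy would delete.
would_delete_keep_first() {
    awk '
        BEGIN { first = 1 }
        /^$/  { first = 1; next }   # blank line: the next set starts
        first { first = 0; next }   # first path of a set is kept
              { print }             # every other path would be deleted
    '
}

# One set with two copies inside dir1 and one copy in dir2.
printf '%s\n' 'dir1/a.jpg' 'dir1/copy/a.jpg' 'dir2/a.jpg' '' |
    would_delete_keep_first
```

With this input the model lists dir1/copy/a.jpg for deletion even though it lives in the directory given first, which is the case the request wants to rule out.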

Friday13th87 commented 1 year ago

No, sorry, you are not right, and this is not about my ego; sadly, it's about yours. I just wanted to help and you are doing exactly the opposite to prove I-don't-know-what.

The initial poster was searching for a way to prefer one directory over another, which means "if possible, delete from directory x and not from y; if that is not possible, do what you have to do". Preferring one directory over another doesn't mean that the initial poster wanted to prohibit deletions from one directory, just to prefer deleting from the other, so that if there is a duplicate in both dirs it will be kept in one specific dir. jdupes dir1/ dir2/ is doing this.

And for @bjhartin: with jdupes you can easily do what you want with:

chmod -R 555 dir1/   # jdupes can't delete files here, but can still read them to calculate hashes etc.
jdupes -rNdO dir1/ dir2/
chmod -R 755 dir1/   # or whatever privileges you'd like to give the folder

i hope that helped.
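
A narrower variant of the chmod workaround above: removing a file needs write permission on the directory that contains it, not on the file itself, so it is enough to change the directories and leave the file modes alone. The sketch below is only that, a sketch. The function names lock_dirs and unlock_dirs are made up, unlock_dirs only gives write permission back to the owner (it does not restore whatever the original modes were), and none of this stops a process running as root.

```shell
#!/bin/sh
# Remove write permission from every directory under "$1" so that entries
# inside them cannot be unlinked by a non-root user. File modes are untouched.
lock_dirs() {
    find "$1" -type d -exec chmod a-w {} +
}

# Give the owner write permission on those directories again.
# Assumption: owner-writable is what the directories had before.
unlock_dirs() {
    find "$1" -type d -exec chmod u+w {} +
}

# Intended use (not executed here):
#   lock_dirs dir1/
#   jdupes -rNdO dir1/ dir2/
#   unlock_dirs dir1/
```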

JohnCrafton commented 1 year ago

No, it's not. Your ego is not in question here; your correctness is.

@Friday13th87 You're being really unhelpful. The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

You're giving advice you claim as authoritative when the author of the program refuted you.

To anyone visiting this thread (and likely any others with this Friday person): caveat emptor.

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

jbruchon commented 1 year ago

The fellow said he's the author of jdupes; shut it down. Whether you think you're right no longer matters.

To be fair: I didn't write every piece of code in jdupes and it's entirely possible to trip over my own human errors. The code behind -O, however, I personally wrote and tested. I know exactly what it does and there's a good chance I don't have dementia (yet). Fortunately, I can be completely mentally broken and anyone can still see exactly how it works.

jbruchon commented 1 year ago

@JohnCrafton you might find the example scripts in the jdupes code base to be useful. I recognized that many people want to perform custom actions that the core program doesn't handle, so I wrote some template/example shell scripts that can be modified to suit your needs. They should also be able to use fdupes instead of jdupes as long as you check the options passed to the program. The output format is the same (duplicate items one per line with an empty line between duplicate sets). You can use grep to match a substring and decide to not act upon a specific directory or file, for example.
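
As a rough illustration of that kind of post-processing (this is not one of the example scripts shipped with jdupes), the filter below reads the set-per-paragraph output described above and prints only the files that look safe to delete when one directory is the "master". The function name deletable_outside and the PREFER_PREFIX variable are made up for this sketch. It assumes the prefix is written exactly the way the tool prints its paths, and that no file name contains a newline.

```shell
#!/bin/sh
# Usage: fdupes -r A B | deletable_outside A/
# Reads duplicate sets on stdin: one path per line, empty line between sets.
# Prints a path only if
#   (a) it does not start with the preferred prefix "$1", and
#   (b) its set contains at least one path that does,
# so a copy inside the preferred directory always survives.
deletable_outside() {
    PREFER_PREFIX="$1" awk '
        function emit_set(   i) {
            if (has_master)
                for (i = 1; i <= n; i++) print others[i]
            n = 0; has_master = 0
        }
        /^$/ { emit_set(); next }                       # end of a set
        index($0, ENVIRON["PREFER_PREFIX"]) == 1 {      # path starts with prefix
            has_master = 1; next
        }
        { others[++n] = $0 }                            # candidate for deletion
        END { emit_set() }                              # last set, if no trailing blank
    '
}
```

Sets that have no member under the preferred prefix produce no output at all, so duplicates that exist only outside the master directory are left for manual review. Pass the prefix with its trailing slash so that A/ does not also match a sibling such as AB/.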

adrianlopezroche commented 1 year ago

I came looking for a way to do this thing, too, incidentally. I'd love a way to --prefer /some/arbitrary/master/path in one of these tools.

I suppose it's back to setting the "master" as read-only and running fdupes to see if it blows up.

If you don't need it to run unsupervised (via -N) then you can use the new fdupes interactive mode to do this:

selb /some/arbitrary/master/path
isel
ds
prune

The first command will select every file in your "master" path, the second will deselect those and instead select their duplicates, the third will mark the now selected ones for deletion, and the last one will delete them.

101Dude commented 8 months ago

Been awhile. Somewhere I saw this recommended for choosing which to delete:

fdupes -r dir1 dir2|grep dir1/|xargs rm

I can't get that to work on macOS, and I am sure someone here can suggest why. This is an alternative method of getting what you want.

macau23 commented 7 months ago

@101Dude you should not use rm with xargs like that; it will do the wrong thing with file names that contain spaces or characters that need quoting.

101Dude commented 7 months ago

@macau23 this is what I ended up using and it works well.

xargs runs into issues when path names contain special characters, and fdupes doesn't have a -print0 option like find does, so the pipeline trips up.

The following command results in an error because of a single quote in a filename:

fdupes -r dir1 dir2|grep dir1/|xargs rm

xargs: unterminated quote

The UNIX way around this is to add another command between the grep and xargs commands:

... | tr '\n' '\0' | xargs -0 -n1 ...

This addition comes from an excellent explanation in "Make xargs execute the command once for each line of input".

The full command would then be:

fdupes -r dir1 dir2 |grep "dir2/" |tr '\n' '\0' |xargs -0 -n1 rm -v

Check this command first using echo or another non-destructive command before using rm. Adding the -v option allows you to see what has been removed.
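
One way to make the "check first" step the default is a small wrapper that only prints what it would remove unless told otherwise. This is a sketch, not part of fdupes: the function name remove_listed and the DRY_RUN variable are made up for illustration. It reads one path per line with a plain read loop, so spaces and quotes in file names need no special handling, though names containing newlines are still not supported.

```shell
#!/bin/sh
# Usage: fdupes -r dir1 dir2 | grep "dir2/" | remove_listed            (dry run)
#        fdupes -r dir1 dir2 | grep "dir2/" | DRY_RUN=0 remove_listed  (delete)
# Reads newline-separated paths on stdin.
remove_listed() {
    while IFS= read -r path; do
        [ -n "$path" ] || continue                 # skip blank separator lines
        if [ "${DRY_RUN:-1}" = 1 ]; then
            printf 'would remove: %s\n' "$path"    # default: print only
        else
            rm -- "$path" && printf 'removed: %s\n' "$path"
        fi
    done
}
```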

An example of a non-destructive option is to use the tag command (install with homebrew). Add a red Finder tag to files that are duplicates so you can manually select and drag to trash :)

fdupes -r dir1 dir2 | grep "dir2" | tr '\n' '\0' | xargs -0 -n1 -I % tag -a red %

sylvainsab commented 4 months ago

same request

VD171 commented 3 months ago

Any solution?

skitchin commented 1 month ago

Here's how you can delete duplicates without removing files from a specific directory. Use the -o name option and write the full paths with extra leading slashes to set the priority order. For example:

fdupes -rdN -o name //pictures/photo1 /pictures/photo2

This command will delete duplicates found in the photo2 directory, keeping the files in photo1.

If you have three or more directories, add slashes in the order of priority. For instance, with four directories:

fdupes -rdN -o name ////pictures/photo1 ///pictures/photo2 //pictures/photo3 /pictures/photo4

This setup ensures that any duplicates found in photo1 and photo2 will be deleted from photo2. Similarly, duplicates found in photo2 and photo4 will be deleted from photo4.

I hope this helps.
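
Why the extra slashes would have this effect can be checked without touching any files. The sketch below assumes that -o name compares the whole path strings as they were given on the command line, byte by byte; that is an assumption about fdupes, not something confirmed in this thread. The function name sorted_paths and the file name a.jpg are made up for illustration.

```shell
#!/bin/sh
# Sort sample paths byte by byte (C locale) to show which one comes first.
# "/" is byte 0x2F and "p" is byte 0x70, so a path with more leading
# slashes sorts before one with fewer.
sorted_paths() {
    printf '%s\n' \
        '/pictures/photo4/a.jpg' \
        '//pictures/photo3/a.jpg' \
        '///pictures/photo2/a.jpg' \
        '////pictures/photo1/a.jpg' | LC_ALL=C sort
}

sorted_paths
```

The photo1 path comes out first and the photo4 path last. On Linux and macOS repeated leading slashes resolve to the same directory as a single one, which is why the paths still point at the right place.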