adrianlopezroche / fdupes

FDUPES is a program for identifying or deleting duplicate files residing within specified directories.

Search dupes for a specific file type/extension #6

Open | angus73 opened this issue 9 years ago

angus73 commented 9 years ago

Is there any way to search for duplicates only in a subset of files, for example those matching a given extension? Thanks

rubenvarela commented 9 years ago

I would probably create a temporary folder and use find to filter the original files and generate hard links in the temp folder.

Something like this (quick and untested):

cd ~/tmp_folder
find ~/source_folder -type f -name '*.jpg' -exec ln {} . \;

Test it first in a small set of files.
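For completeness, the whole workflow would look roughly like this (an untested sketch; note that hard links only work when ~/tmp_folder is on the same filesystem as the sources, and ln will refuse to link two source files that share a basename):

mkdir -p ~/tmp_folder && cd ~/tmp_folder
# hard-link every matching file from the source tree into the temp folder
find ~/source_folder -type f -name '*.jpg' -exec ln {} . \;
# fdupes then only ever sees the filtered set
fdupes -r ~/tmp_folder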

thavelick commented 8 years ago

A quicker, one-line solution to this is:

fdupes -r . | grep -e '\.js$' -e '^$' | uniq

Essentially, this keeps only the lines from the output of fdupes that either end in the extension in question or are blank, then uses uniq to collapse the runs of blank lines left behind.
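One caveat not mentioned above: a group can be reduced to a single file when its other duplicates have a different extension. If that matters, a possible refinement is to treat each blank-line-separated group as one awk record and keep only groups that still span at least two lines, e.g.:

fdupes -r . | grep -e '\.js$' -e '^$' | uniq \
  | awk -v RS= -v ORS='\n\n' 'index($0, "\n")'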

angus73 commented 8 years ago

Thank you rubenvarela and thavelick... I will surely give your hints a try (I haven't dealt with this since, actually).

EdwinKM commented 4 years ago

Also voting for this option: the ability to include (or exclude) extensions to compare.

orizzle commented 4 months ago

> A quicker, one-line solution to this is:
>
> fdupes -r . | grep -e '\.js$' -e '^$' | uniq
>
> Essentially, this keeps only the lines from the output of fdupes that either end in the extension in question or are blank, then uses uniq to collapse the runs of blank lines left behind.

I like this solution. I cooked up a quick bash script to hard-link duplicates by extension:

grep -i -e '\.wav$' -e '\.ogg$' -e '\.spr$' -e '\.spr\.gz$' -e '\.dlt$' -e '\.dlt\.gz$' -e '\.tga$' -e '\.pcx$' -e '\.tga\.gz$' -e '^$' dupes.txt | while IFS= read -r line; do
    if [ "$line" = "" ]; then
        orig_file=""                   # blank line: end of a duplicate group
    elif [ "$orig_file" = "" ]; then
        orig_file="$line"              # first file of the group: keep it as the original
    else
        ln -f "$orig_file" "$line"     # replace every later copy with a hard link
    fi
done
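(The comment doesn't show how dupes.txt was generated; presumably something along the lines of the following, run from the tree being deduplicated:)

fdupes -r . > dupes.txt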

m-chaturvedi commented 1 week ago

This supports that feature: https://github.com/m-chaturvedi/undupes