Closed slrslr closed 2 months ago
Investigate the --program, --script and -i options.
The --program option takes the full path to a program as an argument. So, to call gwenview on each group of files, you can do something like:
findimagedupes ... -p "$(type -p gwenview)" ...
Alternatively, you can use the --script option to write the output to a file. Then edit the VIEW shell function to do more complicated actions. The default definition is:
VIEW(){
echo "$@"
}
You could edit that to become:
VIEW(){
# remove first file
echo rm "$1"
shift
# open the rest
gwenview "$@"
}
You can also supply overrides using -i to have the customised script run automatically without having to save and edit manually:
findimagedupes ... -i 'VIEW(){ gwenview "$@"; }' ...
The VIEW function can perform arbitrary actions on the group of files, including deletion. But be careful! It is easy to accidentally delete something you did not intend to. I suggest moving or renaming files as a first step, rather than immediately deleting anything.
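As a rough illustration of the "move first, delete later" advice, here is a minimal sketch of a VIEW function that sets the first file of each group aside instead of removing it. The QUARANTINE_DIR variable and its default path are hypothetical names chosen for this example; they are not something findimagedupes defines.

```shell
# Hypothetical holding directory (an assumption for this sketch,
# not a findimagedupes setting). Override it in the environment if needed.
QUARANTINE_DIR="${QUARANTINE_DIR:-./_duplicates_to_review}"

VIEW(){
    # An empty group needs no action.
    [ "$#" -gt 0 ] || return 0

    mkdir -p "$QUARANTINE_DIR" || return 1

    # Refuse to overwrite a file that was set aside earlier under the same name.
    if [ -e "$QUARANTINE_DIR/${1##*/}" ]; then
        printf 'skipped (name already taken): %s\n' "$1"
    else
        # Move rather than delete, so the action can still be undone.
        mv -- "$1" "$QUARANTINE_DIR/" || return 1
    fi

    # Drop the first file from the argument list and report the others.
    shift
    for f in "$@"; do
        printf 'kept: %s\n' "$f"
    done
}
```

Because the paths are always passed as quoted positional parameters, file names containing spaces are handled without any extra parsing. Once you have checked the contents of the holding directory, you can delete it by hand.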
Thank you for the detailed explanation. It helped me achieve the outcome I wanted. I placed the -p or -i option BEFORE the final switches (-a):
run all duplicates in a program:
-p "$(type -p gwenview)"
run duplicates in a program echo (so it just prints the paths):
-p "$(type -p echo)"
echo 1st of the duplicates and open the rest in gwenview:
-i 'VIEW(){ echo "rm $1"; gwenview "$@"; }'
remove first of the duplicate files and view the rest:
-i 'VIEW(){ rm "$1"; gwenview "$@"; }'
remove first of the duplicate files (risky, read below):
-i 'VIEW(){ echo "Removing first of the duplicates."; rm "$1"; }'
report that there is a duplicate file and place 1st of the duplicates into a certain folder:
-i 'VIEW(){ echo "Similar file found. Check _duplicates_to_delete folder."; mv "$1" "/_duplicates_to_delete/" 2>/dev/null; }'
I am using $1 because the first of the duplicate paths always seems to be the one from the folder that I supply to the findimagedupes command using the switch -a -- "$folderwithduplicates", and this folder is the only one in which I want to remove duplicates. So I assume that it is safe to always remove $1 without a prompt in my case (I am using the -t 100% switch to match only identical or VERY similar images!).
My full command:
findimagedupes -R -q -f $HOME/findimagedupes.index -t 100% -i 'VIEW(){ echo "Removing first of the duplicates."; rm "$1"; }' -a -- "/folderwithduplicatesthatmaybedeleted"
-R recursively search that folder
-q quiet (do not be verbose about incompatible files)
-f use signature index, that I am building right before executing previous command. The indexing command is: findimagedupes -R -q -f $HOME/findimagedupes.index --prune -n -- "/main-images-folder/"
-t 100% same or too similar images
-a ensures that only duplicates that are also in /folderwithduplicatesthatmaybedeleted are printed, not duplicates that exist only within the index database (I have already dealt with those using findimagedupes -q -f $HOME/findimagedupes.index -t 100%).
I'm glad it was helpful.
Note that it is not guaranteed that files will be listed in any particular order. Personally, I would always rename the files, as you do in your 6th example, rather than deleting them immediately as you do in your 4th and 5th examples.
Note also that "$@" refers to all the arguments, including "$1". If you wish to exclude "$1", you need the shift from my example. I suppose you already know that rm deletes an image from the filesystem entirely, not just from the output list.
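To make the difference concrete, here is a small sketch with two throwaway helper functions. The names with_first and without_first are made up for this example and have nothing to do with findimagedupes.

```shell
# Prints every argument, one per line: "$@" still includes "$1".
with_first(){
    printf '%s\n' "$@"
}

# Prints every argument except the first: shift discards "$1" beforehand.
without_first(){
    # Guard the shift, since shifting an empty list is an error in some shells.
    [ "$#" -gt 0 ] && shift
    [ "$#" -gt 0 ] && printf '%s\n' "$@"
    return 0
}

with_first 'a b.jpg' 'c.jpg'
without_first 'a b.jpg' 'c.jpg'
```

The first call prints both names; the second prints only c.jpg. So a function that runs rm "$1" and then gwenview "$@" without a shift in between hands the viewer a file that has already been deleted.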
If you just want to echo the filenames, you don't need to use your second command (-p "$(type -p echo)"). Echoing is already the default behaviour.
-t 100% means that there are no bit differences in the fingerprints. It does not guarantee that the images are similar (although obviously that is what is hoped for). It is possible for very dissimilar images to have the same fingerprint. As a simple example, an entirely blue image will exactly match an entirely red image. And an image of text could easily match another image of text even when the words are completely different. Also, because of the way that matches are grouped, dissimilar images can end up together (see https://github.com/jhnc/findimagedupes/issues/12#issuecomment-1610905081).
"not guaranteed that files will be listed in any particular order"
That is unfortunate, and it does not seem that the VIEW function accepts a condition with semicolons in it: -i 'VIEW(){ ...condition.. then rm "$1"; }'
So it does not seem possible to do something like: if the path contains xy, then we have the right folder to delete from.
UPDATE: this condition may work:
-i 'VIEW(){ if echo "$1" | grep -q "$fwd"; then echo "Removing duplicate from a folder fwd."; rm "$1"; fi }'
Needs to define that variable $fwd before the command: fwd=/folderwithduplicatesthatmaybedeleted
In case it is not clear, VIEW is an ordinary shell function. It is nothing specific to findimagedupes. It can contain any commands that you can write in a POSIX shell script. if and test syntax are both available.
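For instance, since the order of files within a group is not guaranteed, a VIEW function can test every path instead of assuming that "$1" is the one to act on. The following is only a sketch under some assumptions: fwd is the variable name used earlier in this thread, its default value here is just the placeholder path from the example command, and the function prints what it would do rather than removing anything.

```shell
# Folder whose duplicates may be removed. The default is the placeholder
# path from the earlier example; export fwd or edit this line to change it.
fwd="${fwd:-/folderwithduplicatesthatmaybedeleted}"

VIEW(){
    kept=0
    # First pass: count the files that live outside $fwd.
    for f in "$@"; do
        case "$f" in
            "$fwd"/*) ;;
            *) kept=$((kept + 1)) ;;
        esac
    done

    # Second pass: report what would happen to each file (dry run only).
    for f in "$@"; do
        case "$f" in
            "$fwd"/*)
                if [ "$kept" -gt 0 ]; then
                    printf 'would remove: %s\n' "$f"
                else
                    # Every file is inside $fwd, so keep one copy.
                    kept=1
                    printf 'keeping: %s\n' "$f"
                fi
                ;;
            *)
                printf 'keeping: %s\n' "$f"
                ;;
        esac
    done
}
```

The case pattern compares against the start of the path, which is stricter than grep -q "$fwd" (that would also match the folder name appearing anywhere in the path, and would treat it as a regular expression). Only once the dry-run output looks right would I replace the "would remove" line with a mv or rm.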
Did you look at the output file produced with the --script option? The file is a very simple "skeleton" shell script that is designed to be customised if you need to do more complicated tasks. (If you edit this file, you don't need the extra quoting that the argument to the -i option requires and you don't need to try to fit things onto a single line.) I wrote findimagedupes to use shell scripts for customisation so that the user did not need to learn a new programming language. (If you want to use more powerful bash or zsh functionality such as arrays, or regular expressions, you can edit the first "shebang" line to use that shell instead of the default /bin/sh.)
Of course, this does mean you already have to know how to write shell scripts.
For general help on writing shell scripts, the findimagedupes issue queue isn't the best place to ask. I'm only one person but there is a large community of helpful people on forums such as the various Stack Exchange Q&A sites ( https://superuser.com, https://unix.stackexchange.com, https://stackoverflow.com, etc), and those sites already contain answers to many questions (and lots of example code). There are also many books and tutorials and courses if you want to learn in a more structured way.
Hello,
1)
findimagedupes outputs:
/path/ file name뷰á _ .jpg /path/ file뷰š -name .jpeg
so when i want to: A) remove first file B) open all paths inside viewer "gwenview"
I cannot, since I am not an awk/sed power user. Can you please suggest how to do it? The problem is the lack of a unique separator, combined with spaces in paths. I have been asking ChatGPT, but no luck.
2)
It would be handy if there is a switch where user can define prefix, suffix and the path separator.
For example --outprefix 'gwenview "' --outpathseparator '" "' --outsuffix '"'
Btw. I would expect output paths to be by default separated with quotation marks. newline - one per line
I am sorry if this is stupid or too demanding. You have mentioned that showing usage case examples incl. more complex commands may be handy inside manual / under -h switch.