Install the app from the Nextcloud App Store inside your Nextcloud instance. See https://apps.nextcloud.com/apps/duplicatefinder
Click the app icon to see whether you have duplicate files.
You can use either the command or the cron job to run a full scan. In addition, each time a file is uploaded or changed, the app automatically checks whether a duplicate of it exists.
There are three ways duplicates can be detected: event-based, background-job-based, and command-based.
Normally the detection methods should be used in the order listed, but if you are installing the app on an existing installation, it can be useful to start with a full scan using command-based detection.
To prevent a folder from being scanned for duplicates, place a `.nodupefinder` file inside it. Any files in this folder are excluded from duplicate detection.
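A minimal sketch of creating the marker file from a shell, assuming you have direct access to the folder. The helper name `exclude_from_scan` and the throwaway demo directory are hypothetical; whether a file created directly on disk is noticed right away depends on your Nextcloud setup.

```shell
# Mark a folder as excluded by creating an empty .nodupefinder file in it.
exclude_from_scan() {
    touch "$1/.nodupefinder"
}

# Demo on a throwaway directory (hypothetical; use your real folder instead).
demo_dir="$(mktemp -d)"
exclude_from_scan "$demo_dir"

# Show the folder contents, including the hidden marker file.
ls -a "$demo_dir"
```

Uploading an empty file named `.nodupefinder` through the Nextcloud interface is an alternative if you have no shell access.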
You can search and filter duplicates by file path or name using the search input at the top of the "Unacknowledged" and "Acknowledged" sections.
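To filter by file type, enter the file extension (e.g. `.png` to show all PNG files). To exclude a file type, prefix the extension with `!` (e.g. `!.ptx` to exclude all PTX files).

Usage: `occ [-v] duplicates:ACTION [options]`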
Depending on your Nextcloud setup, the `occ` command may need to be called differently, such as `sudo php occ duplicates:find-all [options]` or `nextcloud.occ duplicates:find-all [options]`. Please refer to the occ documentation for more details.
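The choice between the two invocations can be sketched as a small helper. This is only an illustration: the function name `pick_occ` is hypothetical, and it assumes that `nextcloud.occ` is on the `PATH` when that form applies and that `sudo php occ` is run from the Nextcloud installation directory. The sketch prints the command line rather than running it.

```shell
# Pick an occ invocation: prefer the nextcloud.occ wrapper when it is on PATH,
# otherwise fall back to calling occ through PHP with sudo.
pick_occ() {
    if command -v nextcloud.occ >/dev/null 2>&1; then
        echo "nextcloud.occ"
    else
        echo "sudo php occ"
    fi
}

# Print the resulting command line instead of executing it.
echo "$(pick_occ) duplicates:find-all"
```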
If you increase the verbosity of the occ command (`-v`), the output shows more detail (e.g. which file is currently being scanned).
ACTION

- `find-all`: Scans the files for duplicates. The options can be used to limit the scan.
  Options:
  - `-u, --user`: scan only the files of the specified user (`-u admin`)
  - `-p, --path`: limit the scan to this path (`--path="./Photos"`). The path is relative to the root of each user, or of the specified user.
- `list`: Lists all duplicates that have been found so far. If no option is given, duplicates across all users are shown.
  Options:
  - `-u, --user`: list only the duplicates of the specified user (`-u admin`)
- `clear`: Clears all information that the app has stored in the database.
  Options:
  - `-f, --force`: forces the cleanup. Attention: you will not be asked for confirmation.
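The actions above can be tried out with a small dry-run wrapper. This is a sketch under stated assumptions: the wrapper name `occ_run` and the variables `DRY_RUN` and `OCC` are hypothetical, and `php occ` is only a placeholder for however occ is invoked on your system. By default the commands are printed, not executed.

```shell
# Dry run by default: print each command instead of executing it.
# Set DRY_RUN=0 to run the commands for real.
DRY_RUN="${DRY_RUN:-1}"
# How occ is invoked on this system (assumption; adjust to your setup).
OCC="${OCC:-php occ}"

occ_run() {
    if [ "$DRY_RUN" = "1" ]; then
        echo "$OCC $*"
    else
        # $OCC is left unquoted on purpose so "php occ" splits into two words.
        $OCC "$@"
    fi
}

# Full scan limited to the Photos folder of the user "admin".
occ_run duplicates:find-all -u admin --path="./Photos"
# List the duplicates found for that user, with verbose output.
occ_run -v duplicates:list -u admin
# Clear the stored duplicate information without a confirmation prompt.
occ_run duplicates:clear --force
```

Keeping the dry run as the default makes it harder to trigger `duplicates:clear --force` by accident while experimenting.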
The app depends on the following settings. All settings should be modified only through the UI. If this doesn't work for you, you can apply them via the occ command.

Setting | Type | Default | Effect |
---|---|---|---|
ignore_mounted_files | boolean | false | When true, files mounted on external storage are ignored. Computing the hash of an external file may require transferring the whole file to the Nextcloud server, so this setting is useful when you need to reduce traffic, e.g. if you pay for it. |
disable_filesystem_events | boolean | false | When true, event-based detection is disabled. This gives you more control over when the hashes are generated. |
backgroundjob_interval_cleanup | integer | 432000 | Interval in seconds at which the clean-up background job runs (the default is 5 days) |
backgroundjob_interval_find | integer | 172800 | Interval in seconds at which the background job that finds duplicates runs (the default is 2 days) |
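
Since the intervals are given in seconds, a short conversion helper can avoid arithmetic mistakes. The sketch below only prints the occ calls; it assumes that the app id is `duplicatefinder` (taken from the App Store URL) and that the settings are stored as regular app config values, which the generic `config:app:set` occ command can write. The helper names are hypothetical.

```shell
# Convert a number of days to seconds for the interval settings.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# Print (not run) an occ call that would set an interval setting.
# The app id "duplicatefinder" is an assumption based on the App Store URL.
print_set_interval() {
    echo "occ config:app:set duplicatefinder $1 --value=$(days_to_seconds "$2")"
}

# These reproduce the documented defaults of 2 and 5 days.
print_set_interval backgroundjob_interval_find 2
print_set_interval backgroundjob_interval_cleanup 5
```

Replace `occ` in the printed command with the invocation that matches your setup before running it.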
Big thanks to @PaulLereverend @chrros95 @Routhinator