arsenetar / dupeguru

Find duplicate files
https://dupeguru.voltaicideas.net
GNU General Public License v3.0
5.26k stars 412 forks

Reference system highly annoying #312

Open a-raccoon opened 9 years ago

a-raccoon commented 9 years ago

I really enjoy using dupeGuru, however, I find the "reference" system rather counter-intuitive. When de-cluttering heaps of media across multiple drives and folder structures, there's no singular reference system for which copy of a duplicate I wish to keep. I'm spontaneous and full of whimsy! So I constantly find myself having to Ctrl+Space the file I wish to keep, on top of check-marking the files I wish to delete.

Can't dupeGuru just be my guardian angel and warn me if I attempt to delete all copies of a file? Could you add a feature that disables the safeguards and allows me to mark any file arbitrarily?
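A minimal sketch of that "guardian angel" check, assuming each duplicate group is a plain list of path strings and the marks are a set of paths. The function name and data shapes are hypothetical and are not taken from dupeGuru's code.

```python
def groups_losing_every_copy(groups, marked):
    """Return the duplicate groups in which every file is marked for deletion.

    groups: list of lists of path strings (hypothetical shape).
    marked: set of path strings the user has check-marked.
    """
    return [
        group
        for group in groups
        if group and all(path in marked for path in group)
    ]


# Usage: warn before deleting instead of forbidding the marks up front.
groups = [
    [r"E:\MUSIC\song.mp3", r"F:\BACKUPS\MEDIA\MUSIC\song.mp3"],
    [r"E:\MUSIC\other.mp3", r"F:\BACKUPS\MEDIA\MUSIC\other.mp3"],
]
marked = {r"E:\MUSIC\song.mp3", r"F:\BACKUPS\MEDIA\MUSIC\song.mp3"}
for group in groups_losing_every_copy(groups, marked):
    print("Warning: no copy would be left of", group[0])
```

With a check like this, any file could be marked freely and the warning would only appear at delete time.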

It's worth noting that dupeGuru chooses its "references" quite arbitrarily, leaving a bunch of files from a collection in one folder, and a bunch of others in the sister folder.

a-raccoon commented 9 years ago

I suppose I should elaborate with how I would use dupeGuru differently, and why.

One of the most common reasons I wind up with dupes is from years of cyclic backups of various flash drives, mp3 players and mobile devices. Worse is when I decide to build a new computer or torch my OS to start fresh. Files get dumped from one drive to another, and often stack and nest several layers deep. So here are the duplicates I actually encounter.

E:\MUSIC\song.mp3
F:\BACKUPS\ipod-shuffle\june-mix\song.mp3
F:\BACKUPS\MEDIA\MUSIC\song.mp3
G:\to-be-sorted\c\users\me\downloads\song[320].mp3

When I see a bunch of these, here is my chain of thought.

I want to keep the song on my media drive in the \MUSIC folder, and I want to keep it on my backup drive in \BACKUPS\MEDIA\MUSIC. So I want to exclude those two paths right away, and I would like to exclude them from all other dupes as well. This is something I didn't think about before I performed the dupe search, so it would have to be possible after I've seen the results. In fact, I would like to exclude my entire \BACKUPS folder, because those files shouldn't count as dupes -- they exist intentionally with a counterpart somewhere else.

I want to delete the song from the \june-mix, since that was just a backup of my ipod. In fact, I want to nuke all dupes from that mix directory RIGHT NOW and then review the contents (in Windows Explorer) to see if there's anything left -- because I'm suddenly fixated on whether there's anything unique in there now.

And I want to delete the copy from the old dump of my C drive when I last reinstalled Windows. In fact, I want to delete ALL dupes in G:\to-be-sorted\c\, regardless of what it is, unless it would delete all remaining copies.
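The "delete everything under this folder unless it would delete all remaining copies" rule could look roughly like this. It is only a sketch: groups are assumed to be plain lists of Windows path strings, and `mark_under_folder` is a hypothetical helper, not an existing dupeGuru function.

```python
from pathlib import PureWindowsPath


def mark_under_folder(groups, folder):
    """Mark every dupe located under `folder`, but only in groups where
    at least one copy lives somewhere else."""
    folder = PureWindowsPath(folder)
    marked = set()
    for group in groups:
        # PureWindowsPath compares case-insensitively, like Windows does.
        inside = [p for p in group if folder in PureWindowsPath(p).parents]
        if len(inside) < len(group):  # a copy survives outside the folder
            marked.update(inside)
    return marked


groups = [
    [r"E:\MUSIC\song.mp3", r"G:\to-be-sorted\c\users\me\downloads\song[320].mp3"],
    # Both copies are inside the dump, so neither gets marked.
    [r"G:\to-be-sorted\c\tmp\a.dat", r"G:\to-be-sorted\c\old\a.dat"],
]
print(mark_under_folder(groups, r"G:\to-be-sorted\c"))
```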

Not sure how I can do these things with the current arrangement. There's no post-scan filtering system like that.

ghost commented 9 years ago

Did you try the Re-Prioritize dialog? Folder-based re-prioritization looks like what you're trying to achieve.

a-raccoon commented 9 years ago

It's kind of got the idea of what I'm talking about, but not nearly as versatile.

So say in my examples above, how would I strike my BACKUPS* folder from the search results without performing a new search? Since I have lots of things backed up, it makes almost every important photo or document seem like a duplicate. I could type BACKUP into the top-right filter box and then manually select each individual result with Ctrl+Click, so I could "Remove Selected from Results"... but it's a lot of manual mouse work. Here it would be useful to have an option to "Select" (not mark) every result that matches a filter string.

Inversely, I often want to delete every instance of a file because they are garbage. Eg, I accidentally backed up my Cache* directory, with lots of huge files in it, and so I want to select everything that's \Photoshop\Cache\ and delete both the duplicate and its "reference".

If nothing else, the most annoying thing about references is you can only "Make Selected into Reference" but not "Make Selected Not-Reference." (giving 'reference' to one of the other sister dupes.)
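As a rough illustration of "Make Selected Not-Reference", assuming a group is a list whose first entry is the reference (a hypothetical convention for this sketch, not necessarily how dupeGuru stores groups):

```python
def demote_reference(group):
    """Move the current reference (first entry) to the end of the group,
    so the next sister dupe becomes the reference."""
    if len(group) < 2:
        return list(group)
    return group[1:] + group[:1]


group = [r"E:\MUSIC\song.mp3", r"F:\BACKUPS\MEDIA\MUSIC\song.mp3"]
print(demote_reference(group)[0])
```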

Performing a new Scan is often necessary right now, because of results that slip in, but this is pretty grueling when you have 16 terabytes to scan again.

ghost commented 9 years ago

If you have filtered results, select all + remove is supposed to do the trick. You don't have to individually select single rows: reference files are ignored and not removed with that action.

... And even if it did, you could use the "Dupes only" option to display only dupes.

Did you try that?

a-raccoon commented 9 years ago

Yes. I'm not sure how this process helps me "remove" (not delete) all of the F:\BACKUP\ matches from the list, leaving a different qualifying "reference" in their place if there are still 2 qualifying dupes. Otherwise, the counterpart of the match should be "removed" (not deleted) from the list as well (not a dupe after all, just a backup).

E:\MUSIC\song1.mp3
F:\BACKUP\MEDIA\MUSIC\song1.mp3
E:\MUSIC\song2.mp3
F:\BACKUP\MEDIA\MUSIC\song2.mp3
G:\to-be-sorted\c\users\me\downloads\song2[320].mp3
E:\MUSIC\song3.mp3
F:\BACKUP\MEDIA\MUSIC\song3.mp3

I want to first select and "remove" (not delete) all matches under F:\BACKUP. Doing so would totally remove song1.mp3 and song3.mp3 from the search results, and the only matches remaining would be:

E:\MUSIC\song2.mp3
G:\to-be-sorted\c\users\me\downloads\song2[320].mp3
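To make the expected behaviour concrete, here is a small sketch of that post-scan "remove" step. It assumes results are plain lists of path strings grouped per duplicate set, and `remove_folder_from_results` is a hypothetical name, not part of dupeGuru.

```python
from pathlib import PureWindowsPath


def remove_folder_from_results(groups, folder):
    """Drop every match under `folder` from the results (nothing is deleted
    on disk). Groups left with fewer than two files are no longer dupes and
    disappear from the results entirely."""
    folder = PureWindowsPath(folder)
    remaining = []
    for group in groups:
        kept = [p for p in group if folder not in PureWindowsPath(p).parents]
        if len(kept) >= 2:
            remaining.append(kept)
    return remaining


results = [
    [r"E:\MUSIC\song1.mp3", r"F:\BACKUP\MEDIA\MUSIC\song1.mp3"],
    [
        r"E:\MUSIC\song2.mp3",
        r"F:\BACKUP\MEDIA\MUSIC\song2.mp3",
        r"G:\to-be-sorted\c\users\me\downloads\song2[320].mp3",
    ],
    [r"E:\MUSIC\song3.mp3", r"F:\BACKUP\MEDIA\MUSIC\song3.mp3"],
]
for group in remove_folder_from_results(results, r"F:\BACKUP"):
    print(group)
# Only the song2 group is left, holding the E:\MUSIC and G:\to-be-sorted copies.
```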

ghost commented 9 years ago

Sorry, I have a hard time understanding what you mean. In the example case you've given doing:

  1. filter the list with "F:\BACKUP*"
  2. Go in "Dupes Only" mode
  3. Select all
  4. Remove selected

should result in the remaining dupes you've given as an example. Isn't that what you get when you try this? If not, what do you get?

patrickatamaniuk commented 8 years ago

+1 on disabling all safeguards (one by one, as settings). Let me learn the code first, then I might propose advanced settings like this.

jimhester2 commented 8 years ago

I have a slightly different question. I'd like to be able to change the folder that is highlighted in blue after a scan. I have a main folder containing all my photos that I made for a backup, and they are not broken down into subfolders. When I do a scan, the folders I have (where I want the photos to remain) are the only ones that can be checkmarked for deletion. I would like to be able to set one of those folders to be the one highlighted, so I can delete the copy out of my main, unsorted folder. Any ideas?

ghost commented 8 years ago

@jimhester2 you mean a reference folder

jimhester2 commented 8 years ago

Virgil... After going back through the process again, I realized what you were talking about. I hadn't realized that I needed to set the reference folder before the scan. A little short on sleep today. It's working great now. Thank you very much!

On Mon, Sep 5, 2016 at 5:43 PM, Virgil Dupras notifications@github.com wrote:

@jimhester2 https://github.com/jimhester2 you mean a reference folder https://www.hardcoded.net/dupeguru/help/en/faq.html#i-have-a-folder-from-which-i-really-don-t-want-to-delete-files


vertigo220 commented 4 years ago

I just tried reprioritizing the results to see if that would do what I needed, and it does, with one important caveat: if there are multiple folders the user wants to use as references, it becomes a major pain, as discussed in issue #297. I've gone into more detail there.

I also just saw from the last couple posts here about setting a reference folder before the scan. There are, however, multiple problems with that. First, it requires the forethought to do this, as well as knowing ahead of time which folder to use for it, which won't always be the case, as sometimes I decide which folder to use as a reference after seeing the differences between them, e.g. one may have images tagged or rotated and the other may not, something I may not be aware of until after doing the scan and checking some of the results. Another problem is they can only be set one at a time, meaning that if there are many folders the user wants to use as references compared to others (see my post in issue #297 for an example), it would be a pain to do so (especially since clicking on that area doesn't open the drop-down, it only selects the row and a second click is required to open it). As mentioned there, although about a different part of the program, if it at least had the ability to designate a folder as duplicate instead of reference, that could, at least in some circumstances, help a lot, as it would only require designating one or two folders instead of possibly dozens. Of course, making it so multiple folders can be selected and changed simultaneously would be better. I suppose that if the user a) thought to do so before the scan, b) knew which folder(s) they wanted to use for which, and c) temporarily moved all folders they want to use for reference under a single folder that could then be chosen in dupeguru, it would work, but that's far from ideal.
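For illustration, choosing several reference folders after the scan could amount to a stable re-sort of each group. This is a sketch under assumptions: groups are lists of path strings with the reference first, the example paths are made up, and `reprioritize` is a hypothetical helper rather than dupeGuru's Re-Prioritize implementation.

```python
from pathlib import PureWindowsPath


def reprioritize(groups, reference_folders):
    """Reorder each group so files under any of `reference_folders` come
    first; the first entry of a group is treated as its reference."""
    refs = [PureWindowsPath(f) for f in reference_folders]

    def rank(path):
        parents = PureWindowsPath(path).parents
        return 0 if any(ref in parents for ref in refs) else 1

    # sorted() is stable, so ties keep their original relative order.
    return [sorted(group, key=rank) for group in groups]


groups = [
    [r"D:\unsorted\a.jpg", r"D:\photos\2015\a.jpg"],
    [r"D:\unsorted\b.jpg", r"D:\photos\2016\b.jpg"],
]
for group in reprioritize(groups, [r"D:\photos\2015", r"D:\photos\2016"]):
    print("reference:", group[0])
```

Passing the whole list of folders at once avoids setting them one at a time, and nothing has to be decided before the scan.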

vertigo220 commented 4 years ago

Just found another problem with setting the reference folder before the scan: it can't be overridden. After setting folder A as reference and performing a scan, the user may want to keep the version in folder A for most duplicates but prefer the one in folder B for some. Selecting the duplicate in folder B and trying to make it the reference does nothing.