arsenetar / dupeguru

Find duplicate files
https://dupeguru.voltaicideas.net
GNU General Public License v3.0

Can't mark entries whose reference's directory name matches the search query #347

Open michalfita opened 8 years ago

michalfita commented 8 years ago

I can't mark entries whose reference's directory name matches the search query. For example (the first entry is the reference):

Z:\Pictures\2014-09-13\_DSC7666.JPG
C:\SDCards\101TEST\_DSC7666.JPG

If I then type 2014-09-13 in the search field and hit Enter, only entries from the 2014-09-13 directory are displayed along with their dupes, but there is no way to mark any of those dupes.

If you ask why I need this, the answer is simple: to recreate the directory structure on drive C: without unnecessarily copying files back and forth, mainly to spare the SSD excessive writes. I don't expect full automation, but being able to mark these manually, for example via an added "Mark all selected" option, would be nice.

I've taken a look at the code, and in about 30 seconds I think I found the first offender against my idea:

def _is_markable(self, dupe):
    # [...]
    if self.__filtered_dupes and dupe not in self.__filtered_dupes:
        return False
    return True

That's just a wild guess, because I still need to study what __filtered_dupes really is.
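For illustration, here is a minimal stand-alone model of that check, using plain path strings instead of dupeGuru's actual dupe objects and ignoring whatever the elided # [...] part does. With the filter 2014-09-13, only the reference path matches, so the copy on C: fails the membership test and can't be marked:

# Hypothetical stand-in for the excerpt above; not dupeGuru's real classes.
filter_text = "2014-09-13"
reference = r"Z:\Pictures\2014-09-13\_DSC7666.JPG"
dupe = r"C:\SDCards\101TEST\_DSC7666.JPG"

# The filtered set contains only entries whose path contains the filter text.
filtered_dupes = {p for p in (reference, dupe) if filter_text in p}

def is_markable(path):
    # Mirrors the excerpt: anything outside the filtered set is not markable.
    if filtered_dupes and path not in filtered_dupes:
        return False
    return True

print(is_markable(dupe))  # False - the C: copy itself doesn't match the filter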

ghost commented 8 years ago

I see what you mean. There's a reason for this seemingly bizarre behavior: we don't want a user running a filter to accidentally delete files they didn't mean to delete through the pattern "filter -> mark all -> delete". They could think that "mark all" would mark only the dupes matching the filter, only to realize that "oh no, it marked all dupes of all groups having at least one member matching the filter".

That being said, I don't think that this behavior makes much sense and I'm open to the idea of changing it. After all, if a user wants to match only dupes that match the specified filter, they can always use the "Dupes Only" option.

My only fear is that changing this behavior could hurt a user who has already become accustomed to it and relies on it, leading them to accidental deletions. But I don't think the chances of that happening are very high.

So, if someone wants to submit a PR, I'll welcome it.

Also, to answer your question, __filtered_dupes is the set of all dupes that directly match the specified filter.
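For anyone who wants to take a crack at a PR, here is a rough sketch of the requested behavior, not a patch against dupeGuru's actual Results class: a dupe is also treated as markable when its group's reference matches the filter, not only when the dupe itself does. The Group class and markable_dupes() function below are hypothetical stand-ins:

from dataclasses import dataclass
from typing import List

@dataclass
class Group:
    ref: str          # the group's reference file
    dupes: List[str]  # the duplicate files in the group

def markable_dupes(groups, filtered_paths):
    # A dupe is markable if it matches the filter itself,
    # or if its group's reference does.
    result = []
    for group in groups:
        ref_matches = group.ref in filtered_paths
        for d in group.dupes:
            if d in filtered_paths or ref_matches:
                result.append(d)
    return result

group = Group(ref=r"Z:\Pictures\2014-09-13\_DSC7666.JPG",
              dupes=[r"C:\SDCards\101TEST\_DSC7666.JPG"])
filtered = {group.ref}  # only the reference matches "2014-09-13"
print(markable_dupes([group], filtered))
# ['C:\\SDCards\\101TEST\\_DSC7666.JPG']

Here the C: copy becomes markable purely because its reference matched the filter, which is the behavior the original report asks for.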