elisemercury / Duplicate-Image-Finder

difPy - Python package for finding duplicate or similar images within folders
https://difpy.readthedocs.io
MIT License
[CHANGE REQUEST] replacing 'output directory' with 'move_path' #40

Closed bojanmilevski closed 1 year ago

bojanmilevski commented 2 years ago

Hello. First of all, I would like to thank you for creating and maintaining this project. It has certainly helped me find a bunch of duplicate images in my enormous gallery.

I discovered this project three or four months ago. I needed a way for difPy.py to move my duplicate images to certain directories, but that was not possible. I edited the source code, which was really easy even though I had little to no Python experience beforehand.

As I recently wanted to make a pull request, I noticed that this repository had been updated, which meant that I had to update my version as well. Along with the updates, I noticed a new output_directory flag, which was only useful if using this program through the command line. I made my changes and would like to introduce my implementation.

Instead of the (now present) output_directory flag, I added move, silent_move and move_path as parameters to the __init__ function. Here are the details:

The currently implemented output_directory flag only works for the CLI, but not for Python scripts, as it is not passed over to the __init__ function. As a result, I have removed the output_directory flag and replaced it with my move implementation. This version takes both the command line and scripts into account.

If this idea sounds good to you, I would be happy to submit a pull request with my changes so you can take a closer look at how they would be implemented.

Looking forward to collaborating and contributing to this project as much as I can.

bojanmilevski commented 1 year ago

I would appreciate a suitable reply - I've been waiting for three months :)

elisemercury commented 1 year ago

Dear @bojanmilevski,

Thank you very much for your feedback and your suggestions! And please excuse my late reply - the past few weeks have been quite busy.

The output_directory parameter in the CLI is not used to move the duplicate files. Instead, it defines where the resulting output of difPy will be located, i.e. the files containing the result dictionary, the lower_quality list and the stats dictionary. The original files scanned by difPy will always remain in their original folder.

I am not planning to implement such a feature directly in difPy, since it is out of the package's scope. Instead, I would recommend running difPy, using the difPy output (the result dictionary) to locate the duplicate files, and then moving them to a new location as you desire. This can easily be done in one line of code by following, e.g., what is shown in this tutorial. But, as mentioned, this is not a feature I would suggest adding to the base difPy package by default.
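For illustration, here is a rough sketch of that "run difPy first, move the files afterwards" approach. It is not taken from difPy or from the tutorial mentioned above. The helper names collect_duplicates and move_files are made up for this example, and it assumes the result dictionary maps each entry to a dictionary with a "duplicates" list of file paths. That layout may differ between difPy versions, so please check the result of your own run before relying on it.

```python
import shutil
from pathlib import Path


def collect_duplicates(result):
    """Gather all paths listed under each entry's "duplicates" key.

    ASSUMPTION: `result` looks like
    {"name": {"location": "<path>", "duplicates": ["<path>", ...]}, ...}.
    Adjust the key names if your difPy version uses a different layout.
    """
    paths = []
    for entry in result.values():
        paths.extend(entry.get("duplicates", []))
    return paths


def move_files(paths, destination):
    """Move every existing file in `paths` into `destination`.

    Files whose name already exists in the destination get a numeric
    suffix (copy.jpg, copy_1.jpg, ...) so that nothing is overwritten.
    Returns the list of new paths.
    """
    dest = Path(destination)
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for raw in paths:
        src = Path(raw)
        if not src.is_file():
            continue  # missing or already moved, skip it
        target = dest / src.name
        counter = 1
        while target.exists():
            target = dest / f"{src.stem}_{counter}{src.suffix}"
            counter += 1
        shutil.move(str(src), str(target))
        moved.append(target)
    return moved
```

With a result dictionary in hand (here called result, however you obtained it), the call itself is a single line: move_files(collect_duplicates(result), "duplicates"). The originals listed under "location" are left where they are, and only the files listed as duplicates are moved.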

I hope this helps and answers your request. Again, thank you very much for taking the time to share your ideas and feedback - I really appreciate it!

All the best and happy new year! Elise