Closed Kaschi14 closed 1 year ago
Hi @Kaschi14,
Thanks for your question and for opening the issue! There are two cases to consider when comparing two (or more) folders:
Case 1 - The duplicate images differ in quality: the image with the lowest quality (measured by file size) is deleted, whichever folder it happens to be in. What you are observing might therefore be due to differing image qualities.
Case 2 - The duplicate images have the same quality, i.e. they are exact duplicates: as of the new update to difPy v3.0.0, only the duplicates in the first directory argument should be deleted (according to my testing).
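To make the two cases concrete, here is a minimal sketch of the selection rule as described above. It is not difPy's actual code: the function name `choose_file_to_delete`, the `size_of` parameter, and the tie-break are assumptions made for illustration only.

```python
import os


def choose_file_to_delete(first_path, second_path, size_of=os.path.getsize):
    """Return which file of a duplicate pair would be deleted.

    Hypothetical illustration, not difPy's implementation.
    first_path comes from the first directory argument,
    second_path from the second one.
    """
    first_size = size_of(first_path)
    second_size = size_of(second_path)

    # Case 1: qualities differ, so the smaller file (lower quality) goes,
    # regardless of which folder it is in.
    if first_size < second_size:
        return first_path
    if second_size < first_size:
        return second_path

    # Case 2: exact duplicates. Assumed tie-break, based on the behaviour
    # described for v3.0.0: the file from the first directory is deleted.
    return first_path
```

Under this rule, a deletion in an unexpected folder would simply mean that the copy in that folder had the smaller file size.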
I hope this clarifies! Don't hesitate to let me know if you have further questions.
All the best, Elise
I thought it was always the images in the folder passed as the 2nd argument that get deleted, but it is actually not deterministic. In rare cases a file in the first folder argument gets deleted instead. This happened only once in 69 deletions.
See example log for dif("F:\outtakes\outtakes_pos\bboxes\", "F:\1.0.0\experiment_pos\bbox\", recursive=False, delete=True):
...
Deleted file: F:\experiment_pos\bbox\89738260-3d58-4045-ba7d-b534ddff2b82_2.png
Deleted file: F:\experiment_pos\bbox\7edc9a3e-a9f6-4229-bbfc-62cf97495f4e_2.png
Deleted file: F:\outtakes\outtakes_pos\bboxes\77c4d676-050d-43fe-ab9f-13740c77763f_2.png
Deleted file: F:\experiment_pos\bbox\f8e88fc0-f5c3-4a1e-a2e8-18a252bad860_2.png
Deleted file: F:\experiment_pos\bbox\7f9b605c-159a-4a8d-8999-b0223d7ab7d1_2.png
...
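For anyone who wants to check how the deletions are distributed in their own log, a small sketch along these lines could tally them per folder. It only assumes that each relevant line starts with `Deleted file: ` as in the excerpt above; the helper name `count_deletions_by_folder` is made up for this example.

```python
from collections import Counter
from pathlib import PureWindowsPath


def count_deletions_by_folder(log_lines):
    """Count 'Deleted file: <path>' log lines per parent folder."""
    prefix = "Deleted file: "
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if line.startswith(prefix):
            # PureWindowsPath parses Windows paths on any operating system.
            path = PureWindowsPath(line[len(prefix):])
            counts[str(path.parent)] += 1
    return counts


# The five lines from the log excerpt above.
example_log = [
    r"Deleted file: F:\experiment_pos\bbox\89738260-3d58-4045-ba7d-b534ddff2b82_2.png",
    r"Deleted file: F:\experiment_pos\bbox\7edc9a3e-a9f6-4229-bbfc-62cf97495f4e_2.png",
    r"Deleted file: F:\outtakes\outtakes_pos\bboxes\77c4d676-050d-43fe-ab9f-13740c77763f_2.png",
    r"Deleted file: F:\experiment_pos\bbox\f8e88fc0-f5c3-4a1e-a2e8-18a252bad860_2.png",
    r"Deleted file: F:\experiment_pos\bbox\7f9b605c-159a-4a8d-8999-b0223d7ab7d1_2.png",
]

for folder, count in count_deletions_by_folder(example_log).items():
    print(folder, count)
```

Comparing the file sizes of the one outlier pair would then show whether it falls under the differing-quality case.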