simulot / immich-go

An alternative to the immich-CLI command that doesn't depend on a Node.js installation. It tries its best to import Google Photos takeout archives.
GNU Affero General Public License v3.0

feat: exclude EFFECTS and COLLAGE from duplicated images #111

Closed gvillo closed 5 months ago

gvillo commented 10 months ago

After running the duplicate command, I found there is no dry-run parameter. It would be nice to see a list of the files that are going to be deleted before going through them one by one.

On the other hand, I found that Google Photos generates COLLAGE or EFFECTS files that share the same filename apart from a +# suffix, e.g.

EFFECTS-20222506-102538.jpg
EFFECTS-20222506-102538+1.jpg
IMG_20171114_104841988-COLLAGE-20174814-104842.jpg
IMG_20171114_104841988-COLLAGE-20174814-104842+1.jpg
IMG_20180526_102517195-EFFECTS-20182526-102517.jpg
IMG_20180526_102517195-EFFECTS-20182526-102517+1.jpg

These files are not duplicates. They are all generated at the same time with the same filename, but they are different images, and I would like to keep them. I might be wrong and these may not be generated by Google Photos, but it would be awesome if we could also provide some sort of exclude file list.

simulot commented 10 months ago

Thank you for this report. I think -dry-run is enabled by default.

Maybe I should add a way to get the list of duplicates.

The intent of the duplicate command was to eliminate duplicates of the same photo caused by Google Photos compression. You have the original photo at full resolution coming from the immich app, and the compressed version coming from the Google takeout. The immich server accepts both because their SHA1 hashes are different. Files are stored in immich as IMAGE.jpg and IMAGE+1.jpg, and both have the same name in the UI. The duplicate command treats two files with the same date and the same visible name, but different sizes, as duplicates. The bigger file is kept.

So your files are indeed detected as duplicates.

Your suggestion is good. I can already exclude COLLAGE and EFFECTS from the duplicates. Because other cases could exist, I must also add the possibility of giving custom patterns.
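
One possible shape for such an exclusion, as a sketch only: the isExcluded function, the defaultExcludes list, and the choice of glob patterns are all assumptions, not an existing immich-go option. It relies on filepath.Match from the Go standard library.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// defaultExcludes are hypothetical built-in patterns covering the
// COLLAGE and EFFECTS files reported in this issue.
var defaultExcludes = []string{"*COLLAGE*", "*EFFECTS*"}

// isExcluded reports whether name matches a built-in or a custom glob
// pattern and should therefore be skipped by duplicate detection.
func isExcluded(name string, custom []string) (bool, error) {
	patterns := append(append([]string{}, defaultExcludes...), custom...)
	for _, p := range patterns {
		ok, err := filepath.Match(p, name)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

func main() {
	names := []string{
		"EFFECTS-20222506-102538+1.jpg",
		"IMG_20171114_104841988-COLLAGE-20174814-104842.jpg",
		"IMAGE+1.jpg",
	}
	for _, n := range names {
		skip, err := isExcluded(n, nil)
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s excluded=%v\n", n, skip)
	}
}
```

Custom patterns would be passed as the second argument, for example a made-up pattern such as "PANO_*" for some other family of generated files.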

simulot commented 9 months ago

@gvillo I have tested with collages made on the Google Photos web page, but they have slightly different names.

So the duplicate command doesn't detect them as duplicates.

How did you end up in this situation?

gvillo commented 5 months ago

@simulot sorry for the delay! The issue is that when files are named something.jpg and something+1.jpg (not ~1 or ~2), the duplicate feature was treating them as duplicates. I haven't tested it with newer versions of immich-go. I have to restore the deleted files first 😅.

simulot commented 5 months ago

A lot of changes have been made since then. I am closing the issue. Feel free to reopen it.