TheLastGimbus / GooglePhotosTakeoutHelper

Script that organizes the Google Takeout archive into one big chronological folder
https://aur.archlinux.org/packages/gpth-bin
Apache License 2.0

version: 3.4.1 maintains duplicates in output folder #231

Closed jonno85 closed 10 months ago

jonno85 commented 10 months ago

On macOS, launching gpth and selecting the default options produces an output folder in which ALL_PHOTOS contains both the original file and the augmented file, the latter with the metadata packaged in and the suffix -edited. Shouldn't the original file be removed? Or could there at least be a menu option to choose whether to keep it? When it is time to upload to Synology, the file with the different name won't be recognized as a duplicate.

TheLastGimbus commented 10 months ago

is the -edited file exactly the same, checksum-wise, as the original file? if not, then gpth is doing a good job :relieved:
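One way to check this is to compare checksums directly. Below is a minimal sketch in Python that hashes two files with SHA-256. The helper names `sha256_of` and `same_content` are made up for illustration and are not part of gpth.

```python
import hashlib


def sha256_of(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with open(path, 'rb') as fh:
        # read in fixed-size chunks so large photos/videos don't fill memory
        for chunk in iter(lambda: fh.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def same_content(original, edited):
    """True if both files are byte-for-byte identical (same checksum)."""
    return sha256_of(original) == sha256_of(edited)
```

For example, `same_content('photo.jpg', 'photo-edited.jpg')` (hypothetical file names) returning `False` would mean the two files really are different photos.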

> or at least have a menu option to maintain it?

you mean you would want the --skip-extras option (https://github.com/TheLastGimbus/GooglePhotosTakeoutHelper/blob/ac0a26db7e14772fda39732dbf4f8645a6bd19b6/bin/gpth.dart#L34C3-L34C3) to be added back? it allowed removing the -edited stuff

tho i don't think it's really needed, since you can just ctrl+f for "-edited", select all, and delete in your file manager :+1:

jonno85 commented 10 months ago

Thanks @TheLastGimbus for the quick answer. Yes, the checksum is different, and I guess the metadata are merged into the file. My need is this: since I want to drag the output folder into Synology, I need to leave out the original files, otherwise it will show duplicated pictures. Unfortunately, Finder on macOS doesn't let me negate the search, so I can't easily remove all the original (non -edited) files. Is that clearer?

TheLastGimbus commented 10 months ago

Ahhh, i see your issue now

by "checksum being different", i wanted to point out that those are not the same photo - one is the original and one is edited

i see why you would want to delete non-edited one... hmm...

well, i thiiiink that with the current complexity of how gpth matches/searches for stuff, it wouldn't be easy for it to do this safely either - without, you know, accidentally removing random photos (because of the whole photo.jpg, photo(1).jpg, photo.jpg(1) etc stuff google made)
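To make that risk concrete, here is a rough sketch. The file names are hypothetical, and where Google actually places the (1) relative to -edited is an assumption here. The point is only that stripping -edited can make two edited files point at the same original, so only unambiguous matches should be touched.

```python
from collections import defaultdict

SUFFIX = '-edited'


def group_claims(names):
    """Map each would-be original name to the edited files that point at it."""
    present = set(names)
    claims = defaultdict(list)
    for name in names:
        if SUFFIX in name:
            # naive guess: the original is the same name without the suffix
            original = name.replace(SUFFIX, '', 1)
            if original in present:
                claims[original].append(name)
    return dict(claims)


def safe_and_ambiguous(names):
    """Split originals into ones claimed by exactly one edited file and the rest."""
    claims = group_claims(names)
    safe = sorted(o for o, edits in claims.items() if len(edits) == 1)
    ambiguous = sorted(o for o, edits in claims.items() if len(edits) > 1)
    return safe, ambiguous


# hypothetical folder listing, not taken from a real Takeout export
names = [
    'photo.jpg',
    'photo-edited.jpg',
    'photo(1).jpg',
    'photo(1)-edited.jpg',
    'photo-edited(1).jpg',
]
safe, ambiguous = safe_and_ambiguous(names)
print('safe to handle:', safe)
print('needs a human look:', ambiguous)
```

In this made-up listing, photo.jpg has exactly one edited counterpart, while photo(1).jpg is claimed by two differently named edited files, which is the kind of case where automatic deletion could hit the wrong photo.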

i think we could write a simple python script to resolve your problem tho...

TheLastGimbus commented 10 months ago

here you go buddy

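import os
import shutil

# run this from inside the folder that holds the photos (e.g. ALL_PHOTOS)
for f in os.listdir():
    if '-edited' in f:
        # name the original would have: same name without the '-edited' suffix
        og = f.replace('-edited', '')
        # skip if there is no matching original next to the edited file
        if not os.path.isfile(og):
            continue
        print('to rm:', og)
        # uncomment this to actually run
        # os.remove(og)
        # or, safer one - create a "to-rm" directory beforehand
        # shutil.move(og, 'to-rm')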
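If the photos are spread across subfolders, a slightly more defensive variant could look like the sketch below. It has not been run against a real Takeout export. The function names are made up, the to-rm folder idea is carried over from the script above, and it defaults to a dry run that only prints what it would do.

```python
import shutil
from pathlib import Path

SUFFIX = '-edited'


def originals_with_edits(root):
    """Return originals under root that have an '-edited' sibling file."""
    found = []
    for edited in sorted(Path(root).rglob('*' + SUFFIX + '*')):
        if not edited.is_file():
            continue
        name = edited.name.replace(SUFFIX, '', 1)
        if not name:
            continue  # file is literally named '-edited', nothing to match
        original = edited.with_name(name)
        if original.is_file() and original not in found:
            found.append(original)
    return found


def quarantine(paths, dest, dry_run=True):
    """Move paths into dest instead of deleting them; only print when dry_run."""
    dest = Path(dest)
    moved = []
    for path in paths:
        target = dest / path.name
        if target.exists():
            # never overwrite something already sitting in the quarantine folder
            print('skipping (name already taken):', path)
            continue
        print('would move:' if dry_run else 'moving:', path, '->', target)
        if not dry_run:
            dest.mkdir(parents=True, exist_ok=True)
            shutil.move(str(path), str(target))
            moved.append(target)
    return moved
```

A cautious way to use it would be to call `quarantine(originals_with_edits('ALL_PHOTOS'), 'to-rm')` first, read the printed list, and only then repeat the call with `dry_run=False`. Moving instead of deleting keeps the originals recoverable if a match turns out to be wrong.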

would be grateful for a small donation :wink:

TheLastGimbus commented 10 months ago

may re-open in the future in case people really want to skip the original non-edited photos...