TheLastGimbus / GooglePhotosTakeoutHelper

Script that organizes the Google Takeout archive into one big chronological folder
https://aur.archlinux.org/packages/gpth-bin
Apache License 2.0
3.88k stars 191 forks

Photos date not detected correctly in v3 #175

Closed denouche closed 1 year ago

denouche commented 1 year ago

Hello,

I'm still playing with v3 and comparing its results with v2, and I see some differences in the date detection. I use the divide-by-date option. I ran both v2 and v3 on the same input takeout, and for some photos v2 detects the correct month while v3 puts the photo in the wrong month folder.

Let me give you the full example. In my folder "Photos from 2010" I have these 3 files: image

Here are the 3 JSON files from takeout : jsons.zip

  • IMG_1005.JPG was taken on 2010-12-25 and is in the folder 2010/12 in both v2 and v3. This one is okay.
  • IMG_1005(1).JPG was taken on 2010-05-05 and is in the folder 2010/05 in v2, but in the folder 2010/12 in v3. This one is not okay.
  • IMG_1005(2).JPG was taken on 2010-04-19 and is in the folder 2010/04 in v2, but in the folder 2010/12 in v3. This one is not okay.

Without having read the code, I think the bug may come from the fact that all 3 JSON files contain "title": "IMG_1005.JPG", so v3 reads the JSON file, searches for the IMG_1005.JPG file, and extracts the wrong date.

Thank you in advance for your help!

TheLastGimbus commented 1 year ago

The best bug report i could ask for 🤤

Do the jpgs have exif data? If so, i could split the current json date extractor into one that looks for exact names and another that searches for ones with the extra (1) etc. - and put the second one after the exif extractor
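A rough sketch of that ordering idea, written in Python for illustration only. The extractor names and the hard-coded dates are hypothetical stand-ins (the dates come from the report above), not the project's real code. It only shows why placing a looser JSON match after the exif step would let the exif date win for IMG_1005(1).JPG.

```python
import re
from datetime import datetime
from typing import Callable, List, Optional

# An extractor takes a media file name and returns a date, or None if it cannot tell.
Extractor = Callable[[str], Optional[datetime]]


def first_date(name: str, extractors: List[Extractor]) -> Optional[datetime]:
    """Try the extractors in order and return the first date found."""
    for extract in extractors:
        date = extract(name)
        if date is not None:
            return date
    return None


# Hypothetical stand-ins; real extractors would read files from disk.
def json_exact(name: str) -> Optional[datetime]:
    # Pretend only IMG_1005.JPG.json exists under an exactly matching name.
    return {"IMG_1005.JPG": datetime(2010, 12, 25)}.get(name)


def exif(name: str) -> Optional[datetime]:
    # Pretend the exif data of the "(1)" copy holds the date from the report.
    return {"IMG_1005(1).JPG": datetime(2010, 5, 5)}.get(name)


def json_fuzzy(name: str) -> Optional[datetime]:
    # Looser match: drop "(n)" from the name, which can hit the wrong JSON.
    return json_exact(re.sub(r"\(\d+\)", "", name))


# Strict JSON first, then exif, and the looser JSON guess last.
ORDER: List[Extractor] = [json_exact, exif, json_fuzzy]
```

With this order, `first_date("IMG_1005(1).JPG", ORDER)` returns the exif date, while putting `json_fuzzy` before `exif` would return the date of the other photo.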

denouche commented 1 year ago

Yes, the photos have exif data. I can extract the exif data of the 3 photos for you if you want.

TheLastGimbus commented 1 year ago

No need to - you can just confirm that the exif has the correct date (if we want to be sure) 👍 (leave a like reaction)

TheLastGimbus commented 1 year ago

Looking closer at this

bug may come from the fact that in the 3 JSON files we have: "title": "IMG_1005.JPG"

the current json extractor doesn't read this title at any point - it just takes the image filename and tries to strip it down to find the matching json

  • IMG_1005.JPG -> IMG_1005.JPG.json
  • IMG_1005(1).JPG -> IMG_1005.JPG(1).json
  • IMG_1005(2).JPG -> IMG_1005.JPG(2).json

see this? images have the (n) before the .JPG extension, but the jsons have it after
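To make the naming pattern in the list above concrete, here is a minimal Python sketch of the mapping from an image name to the JSON name Takeout appears to use in this report. The function name `takeout_json_name` is made up for illustration, and the rule is inferred only from the three examples in this thread, so it may not hold for other Takeout exports.

```python
import re

# Matches a "(n)" counter sitting right before the extension,
# e.g. "IMG_1005(1).JPG" -> stem "IMG_1005", n "1", ext ".JPG".
_COUNTER = re.compile(r"^(?P<stem>.*)\((?P<n>\d+)\)(?P<ext>\.[^.]+)$")


def takeout_json_name(media_name: str) -> str:
    """Guess the JSON file name for a media file (hypothetical helper)."""
    m = _COUNTER.match(media_name)
    if m is None:
        # No counter: the JSON name is just the media name plus ".json".
        return media_name + ".json"
    # Counter present: it moves from before the extension to after it.
    return f"{m['stem']}{m['ext']}({m['n']}).json"


for name in ("IMG_1005.JPG", "IMG_1005(1).JPG", "IMG_1005(2).JPG"):
    print(name, "->", takeout_json_name(name))
```

Simply stripping the `(n)` from the image name would instead point all three images at IMG_1005.JPG.json, which would match the wrong-month symptom described in the report.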

these are the reasons this repo is still active after 3 years ;_;

i think this is such an edge case that i won't try to implement this (for finding the json) - this could lead to other bugs, and falling back to the built-in exif/name-guessing is more than enough :+1:

tl;dr should be fixed in a short time

TheLastGimbus commented 1 year ago

:tada:

thanks again for reporting this so nicely!

denouche commented 1 year ago

Oh yes, I see the problem, good catch! Sorry for the delayed answer, I moved house last week and was away from the keyboard. Anyway, thank you so much for the fix and, more generally, for this project.