mholt / timeliner

All your digital life on a single timeline, stored locally -- DEPRECATED, SEE TIMELINIZE (link below)
https://timelinize.com
GNU Affero General Public License v3.0

Change in Google Takeout format? #78

Closed lewurm closed 8 months ago

lewurm commented 2 years ago

Quoting from https://github.com/mholt/timeliner/wiki/Data-Source:-Google-Photos :

  • Google can change the Takeout archive format at any time, breaking this implementation. Please help maintain this feature if you use it!

Has this happened now? Exports larger than 50 GB are now split into multiple archives:

[Two screenshots: gtakeout1, gtakeout2]

While the first archive of a split seems to be accepted fine by timeliner import, the remaining archives do not print anything (even with -v) and exit after a few seconds.

I also tried to unpack all the files and repackage them into a single large one, but timeliner import fails right away:

2022/01/08 20:57:38 [ERROR][google_photos/me@gmail.com] Importing: importing: walking metadata.json: walking ._IMG_1337.HEIC.json: decoding item metadata file Takeout/Google Photos/Album2021/._IMG_1337.HEIC.json: invalid character '\x00' looking for beginning of value

Maybe that's related to the way I repackage it? The file headers look like this:

$ file takeout-20220106T172751Z-001.tgz takeout-20220106T172751Z-all.tgz
takeout-20220106T172751Z-001.tgz: gzip compressed data, from FAT filesystem (MS-DOS, OS/2, NT), original size modulo 2^32 2273460736
takeout-20220106T172751Z-all.tgz: gzip compressed data, last modified: Fri Jan  7 20:52:46 2022, from Unix, original size modulo 2^32 395496448

where takeout-20220106T172751Z-all.tgz is my repackaged archive (on macOS).

Anyway, that would merely be a workaround; it would be great if timeliner import supported the split archives generated by Google.

mholt commented 2 years ago

Good question. I haven't tried with split takeout files yet. Am mobile right now but want to get this working. Contributions / proposals welcome here 🙂

lewurm commented 2 years ago

From a quick look, it seems like all .json files are in the first split archive only. I might dig into the source code a bit tomorrow 🙂
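One way to check that observation is to count the entries in each part. The following is a minimal Go sketch and not part of timeliner: `summarize`, `isJSONName`, and `partSummary` are hypothetical names, and it assumes each part is a gzip-compressed tar that can be read on its own.

```go
package main

import (
	"archive/tar"
	"compress/gzip"
	"errors"
	"fmt"
	"io"
	"os"
	"path"
	"strings"
)

// partSummary holds entry counts for one archive part.
type partSummary struct {
	JSON  int // regular files whose name ends in .json
	Other int // all other regular files
}

// isJSONName reports whether an entry name has a .json extension
// (case-insensitive).
func isJSONName(name string) bool {
	return strings.EqualFold(path.Ext(name), ".json")
}

// summarize reads a gzip-compressed tar stream and counts its regular files.
func summarize(r io.Reader) (partSummary, error) {
	var s partSummary
	gz, err := gzip.NewReader(r)
	if err != nil {
		return s, err
	}
	defer gz.Close()
	tr := tar.NewReader(gz)
	for {
		hdr, err := tr.Next()
		if errors.Is(err, io.EOF) {
			return s, nil // end of archive
		}
		if err != nil {
			return s, err
		}
		if hdr.Typeflag != tar.TypeReg {
			continue // skip directories, links, and other special entries
		}
		if isJSONName(hdr.Name) {
			s.JSON++
		} else {
			s.Other++
		}
	}
}

func main() {
	// Each command-line argument is treated as the path of one archive part.
	for _, name := range os.Args[1:] {
		f, err := os.Open(name)
		if err != nil {
			fmt.Fprintln(os.Stderr, err)
			continue
		}
		s, err := summarize(f)
		f.Close()
		if err != nil {
			fmt.Fprintf(os.Stderr, "%s: %v\n", name, err)
			continue
		}
		fmt.Printf("%s: %d .json, %d other\n", name, s.JSON, s.Other)
	}
}
```

Running it as `go run . takeout-20220106T172751Z-0*.tgz` would print one line per part, which makes it easy to see whether only the first part reports any .json entries.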

mholt commented 2 years ago

Ohh that's interesting... hmm, and somewhat problematic. Will think on this. Let me know if you think of something!

lewurm commented 2 years ago

So I tried my repackaging idea again, but this time using GNU tar on macOS (brew install gnu-tar), and then timeliner at least doesn't trip:

$ cat takeout-20220106T172751Z-0*.tgz | gtar xzivf -
$ gtar -cvzf takeout-20220106T172751Z-all.tgz Takeout/

However, I still do not see GPS info in most pictures when doing timeliner import ... with the combined archive. Not sure what's going on, but it's definitely quite slow and does a lot of disk reading.

I was looking a bit at takeoutarchive.go regarding supporting multiple archives, but I think it would be easier and more performant if it operated on the unpacked Takeout folder instead. It even looks like with archiver v4 that should be rather easy to do, while also keeping support for a single archive file?
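To illustrate the idea of working on the unpacked folder, here is a minimal Go sketch that walks a directory through the standard `io/fs` interface and lists the metadata files it finds. It is not timeliner's code: `collectMetadata` and `isMetadataName` are hypothetical names, and treating every `.json` file as metadata is an assumption of the sketch. Because the walker only needs an `fs.FS`, any archive library that can present an archive through that interface could presumably be plugged in without changing it.

```go
package main

import (
	"fmt"
	"io/fs"
	"os"
	"path"
	"strings"
)

// isMetadataName reports whether a file name looks like a Takeout metadata
// file, judged only by its .json extension (an assumption of this sketch).
func isMetadataName(name string) bool {
	return strings.EqualFold(path.Ext(name), ".json")
}

// collectMetadata walks fsys from its root and returns the slash-separated
// paths of all regular files that look like metadata.
func collectMetadata(fsys fs.FS) ([]string, error) {
	var found []string
	err := fs.WalkDir(fsys, ".", func(p string, d fs.DirEntry, err error) error {
		if err != nil {
			return err // abort the walk on the first I/O error
		}
		if d.Type().IsRegular() && isMetadataName(d.Name()) {
			found = append(found, p)
		}
		return nil
	})
	return found, err
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: pass the path of an unpacked Takeout directory")
		return
	}
	// os.DirFS exposes a directory on disk as an fs.FS.
	paths, err := collectMetadata(os.DirFS(os.Args[1]))
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		return
	}
	for _, p := range paths {
		fmt.Println(p)
	}
}
```

With all split parts extracted into the same `Takeout/` directory, a single walk sees the metadata and the media files together, regardless of which part each one came from.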

mholt commented 2 years ago

Nice find with the gnu-tar fix. I also wonder if filenames like ._* are macOS-only or something weird.
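For context, `._*` entries are most likely AppleDouble sidecar files, which the tar shipped with macOS adds to archives to carry extended attributes. They are binary and begin with the bytes 00 05 16 07, which fits the `invalid character '\x00'` error reported above. A minimal Go sketch of how an importer could recognize and skip them follows; `looksLikeAppleDouble` is a hypothetical helper and not something timeliner provides.

```go
package main

import (
	"bytes"
	"fmt"
	"path"
	"strings"
)

// appleDoubleMagic is the 4-byte magic number at the start of AppleDouble files.
var appleDoubleMagic = []byte{0x00, 0x05, 0x16, 0x07}

// looksLikeAppleDouble reports whether a file is probably a macOS AppleDouble
// sidecar: its base name starts with "._" and its content starts with the
// AppleDouble magic number. head holds the first bytes of the file.
func looksLikeAppleDouble(name string, head []byte) bool {
	return strings.HasPrefix(path.Base(name), "._") &&
		bytes.HasPrefix(head, appleDoubleMagic)
}

func main() {
	// First bytes of a typical AppleDouble file: magic number, then version 2.
	sidecar := []byte{0x00, 0x05, 0x16, 0x07, 0x00, 0x02, 0x00, 0x00}
	realJSON := []byte(`{"title": "IMG_1337.HEIC"}`)

	fmt.Println(looksLikeAppleDouble("Takeout/Google Photos/Album2021/._IMG_1337.HEIC.json", sidecar))
	fmt.Println(looksLikeAppleDouble("Takeout/Google Photos/Album2021/IMG_1337.HEIC.json", realJSON))
}
```

Checking the content as well as the name avoids skipping a genuine file that merely happens to start with `._`.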

> However, I still do not see GPS info in most pictures when doing timeliner import ... with the combined archive. Not sure what's going on, but it's definitely quite slow and does a lot of disk reading.

One thought... if they already existed in your timeline, it's possible that timeliner is skipping those ones entirely. Or maybe our EXIF reader just isn't finding the data in some files for some reason.

> It even looks like with archiver v4 that should be rather easy to do, while also keeping support for a single archive file?

Yep, exactly, and I've already got that working locally in Timeliner's successor, Timelinize. That was also the primary motivation for writing archiver v4.

It's my nights-and-weekends project so I still have a lot to do before it's polished enough to share, but I'm making progress 💪

mholt commented 8 months ago

I now have more info about Timelinize, as well as a Discord community if you want to help try it out and offer feedback. https://timelinize.com (also updated this project's README).