garzj / google-photos-migrate

A tool to fix EXIF data and recover filenames from a Google Photos takeout, preserving albums within the directory structure.
https://npmjs.com/package/google-photos-migrate
MIT License

Implement full Takeout folder migration #7

Closed lukehmcc closed 9 months ago

lukehmcc commented 9 months ago

This MR adds the flag fullMigrate (as opposed to folderMigrate, which implements the old functionality of folder -> folder migration). It automatically migrates an entire takeout file into AlbumsProcessed and PhotosProcessed, which have all metadata sorted and can easily be migrated to any platform.

This addresses #6

Things to keep in mind:

lukehmcc commented 9 months ago

Ah I messed up removing the build/ dir. I'll fix that and then update this comment.

UPDATE: fixed it

lukehmcc commented 9 months ago

Wow that made it way worse lol

mtalexan commented 9 months ago

@lukehmcc The README in your branch has the wrong instructions for your new command. It seems to need fullMigrate instead of migrateFolder, but I only figured that out by reading here.

The help text of the tool itself also makes no mention of the subcommands, it only prints the help text for the old method, which now also needs the migrateFolder subcommand.

EDIT: I was accidentally running help on a subcommand and didn't realize it.

mtalexan commented 9 months ago

It looks like your cleanup error handling still has some bugs too. After the initial processing is done, I'm getting output like this on my photo set:

Done! Processed 11256 files.
Files migrated: 11146
Files failed: 110
Rewriting all tags from /mnt/Google Photos/PhotosError/original_8785a13f-e6d2-45ce-a096-0c237dad625e_2(1).jpg, to  /mnt/Google Photos/Photos/cleaned-original_8785a13f-e6d2-45ce-a096-0c237dad625e_2(1).jpg
Cannot fix metadata for /mnt/Google Photos/PhotosError/iMarkup_20190704_122554.png.json as .json is an unsupported file type.
Rewriting all tags from /mnt/Google Photos/PhotosError/iMarkup_20190704_122554.png, to  /mnt/Google Photos/Photos/cleaned-iMarkup_20190704_122554.png
Rewriting all tags from /mnt/Google Photos/PhotosError/duplicates-2, to  /mnt/Google Photos/Photos/cleaned-duplicates-2
/app/node_modules/exiftool-vendored/dist/ExifToolTask.js:41
                    error = new Error(errMsg);
                            ^

Error: '/mnt/Google Photos/Photos/cleaned-duplicates-2' already exists - /mnt/Google Photos/PhotosError/duplicates-2/20230224_131215.jpg
    at RewriteAllTagsTask.parser (/app/node_modules/exiftool-vendored/dist/ExifToolTask.js:41:29)
    at RewriteAllTagsTask._Task_resolve (/app/node_modules/batch-cluster/dist/Task.js:146:40)

It kind of looks like the rewriteAllTags that you automatically run after the restructuring and renaming isn't properly filtering out the *.json files. Keep in mind that takeout can end up with *.json files named with or without the file extension of the file they're associated with, so you can't just look for .png, you have to look for .png$ (it has to be at the end of the file name).
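
A minimal sketch of the end-of-name check described above, assuming a hypothetical `SUPPORTED_EXTENSIONS` set and `isRewritableMedia` helper (neither is taken from this project's code). `path.extname` only looks at the final extension, so `photo.png.json` is classified as `.json` rather than `.png`:

```typescript
import * as path from "node:path";

// Hypothetical allow-list for illustration; the real tool's list may differ.
const SUPPORTED_EXTENSIONS = new Set([".jpg", ".jpeg", ".png", ".heic", ".mp4"]);

// Returns true only when the name ENDS in a supported media extension.
// Sidecars such as "foo.png.json" and extensionless names return false.
function isRewritableMedia(filePath: string): boolean {
  const ext = path.extname(filePath).toLowerCase();
  return SUPPORTED_EXTENSIONS.has(ext);
}

console.log(isRewritableMedia("iMarkup_20190704_122554.png.json"));
console.log(isRewritableMedia("iMarkup_20190704_122554.png"));
```

An anchored regular expression such as `/\.png$/i` expresses the same idea for a single extension.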

It also looks like there's something with the duplicates to cleaned-duplicates that's either running more times than it should on the same file, isn't picking properly unique file names, or is erroneously blocking overwrites of the cleaned-duplicates file.
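
One way to rule out the name clash, sketched under the assumption that `duplicates-2` is a directory being handed to the tag rewrite as if it were a single file (the trace names a file inside it, but that reading is a guess). The helpers `listFiles` and `cleanedTarget` are hypothetical and not part of this project. The idea is to descend into subdirectories first, so every source file gets its own `cleaned-` destination:

```typescript
import * as fs from "node:fs";
import * as os from "node:os";
import * as path from "node:path";

// Recursively collect regular files under `dir`; directories are descended
// into and never returned themselves.
function listFiles(dir: string): string[] {
  const files: string[] = [];
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const full = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      files.push(...listFiles(full));
    } else if (entry.isFile()) {
      files.push(full);
    }
  }
  return files;
}

// Map a source file to a destination that keeps its relative subdirectory
// and prefixes only the file name with "cleaned-".
function cleanedTarget(srcRoot: string, destRoot: string, file: string): string {
  const rel = path.relative(srcRoot, file);
  return path.join(destRoot, path.dirname(rel), `cleaned-${path.basename(rel)}`);
}

// Demo on a throwaway directory that mimics the layout from the log above.
const root = fs.mkdtempSync(path.join(os.tmpdir(), "takeout-"));
const errorDir = path.join(root, "PhotosError");
fs.mkdirSync(path.join(errorDir, "duplicates-2"), { recursive: true });
fs.writeFileSync(path.join(errorDir, "a.png"), "");
fs.writeFileSync(path.join(errorDir, "duplicates-2", "20230224_131215.jpg"), "");
fs.writeFileSync(path.join(errorDir, "duplicates-2", "b.jpg"), "");

const targets = listFiles(errorDir).map((f) =>
  cleanedTarget(errorDir, path.join(root, "Photos"), f)
);
console.log(targets);
```

Because each destination is derived from a distinct source path, two files can only collide if they already share the same relative path.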


EDIT:

FYI, I cloned your exif-wrapper repo, changed the submodule to use your fork on this branch, built the Dockerfile, and then manually entered the resulting container with my takeout folder host-mounted at /mnt. I'm running all these commands from within the container, so the tool code being run is at /app.

mtalexan commented 9 months ago

Also fun fact, you seem to have accidentally messed up what folder you're putting Albums and Photos in when you restructure the folder. It's putting it into /path/to/takeout/Google Photos/{Albums,Photos} instead of the /path/to/takeout/{Albums,Photos} you error check at the start.

[...snip...]
Copying /mnt/Google Photos/Photos from 2023 to /mnt/Google Photos/Photos/Photos from 2023
[...snip...]
Copying /mnt/Google Photos/Wedding to /mnt/Google Photos/Albums/Wedding

Though based on your first comment in this PR, maybe you just messed up which folders you're error checking at the start of the command instead.

lukehmcc commented 9 months ago

> @lukehmcc The README in your branch has the wrong instructions for your new command. It seems to need fullMigrate instead of migrateFolder, but I only figured that out by reading here.
>
> ~The help text of the tool itself also makes no mention of the subcommands, it only prints the help text for the old method, which now also needs the migrateFolder subcommand.~
>
> EDIT: I was accidentally running help on a subcommand and didn't realize it.

Good catch on the documentation issue.

lukehmcc commented 9 months ago

It already checks for json endings before running rewriteAllTags. https://github.com/lukehmcc/google-photos-migrate/blob/master/src/cli.ts#L84 I'm not sure why you're running into this.

Also please do not use exif-wrapper, I'm migrating all of its functionality to this repo with the fullMigrate command. exif-wrapper is currently broken.

lukehmcc commented 9 months ago

> Also fun fact, you seem to have accidentally messed up what folder you're putting Albums and Photos in when you restructure the folder. It's putting it into /path/to/takeout/Google Photos/{Albums,Photos} instead of the /path/to/takeout/{Albums,Photos} you error check at the start.
>
> [...snip...]
> Copying /mnt/Google Photos/Photos from 2023 to /mnt/Google Photos/Photos/Photos from 2023
> [...snip...]
> Copying /mnt/Google Photos/Wedding to /mnt/Google Photos/Albums/Wedding
>
> Though based on your first comment in this PR, maybe you just messed up which folders you're error checking at the start of the command instead.

I don't see how this is a mistake? It moved the album to the album directory and the photos to the photo directory...

mtalexan commented 9 months ago

> Also please do not use exif-wrapper, I'm migrating all of its functionality to this repo with the fullMigrate command. exif-wrapper is currently broken.

Yeah, I realized that. It would probably be more accurate to say I used your Dockerfile to create an npm environment with this tool in it. I didn't try your python script.

mtalexan commented 9 months ago

>> Also fun fact, you seem to have accidentally messed up what folder you're putting Albums and Photos in when you restructure the folder. It's putting it into /path/to/takeout/Google Photos/{Albums,Photos} instead of the /path/to/takeout/{Albums,Photos} you error check at the start.
>>
>> [...snip...]
>> Copying /mnt/Google Photos/Photos from 2023 to /mnt/Google Photos/Photos/Photos from 2023
>> [...snip...]
>> Copying /mnt/Google Photos/Wedding to /mnt/Google Photos/Albums/Wedding
>>
>> Though based on your first comment in this PR, maybe you just messed up which folders you're error checking at the start of the command instead.
>
> I don't see how this is a mistake? It moved the album to the album directory and the photos to the photo directory...

I realized the mistake is actually which directory you were error checking for earlier. I misunderstood which was correct and which wasn't.
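
A small sketch of how the up-front check and the restructuring step could share one path definition so they cannot drift apart. It assumes, per the log output above, that Albums and Photos are meant to live inside the takeout's `Google Photos` directory. The `takeoutLayout` helper and `TakeoutLayout` type are hypothetical and not taken from this project:

```typescript
import * as path from "node:path";

// Hypothetical description of where the restructured folders live.
interface TakeoutLayout {
  googlePhotosDir: string;
  albumsDir: string;
  photosDir: string;
}

// Derive every directory from the takeout root in one place, so both the
// error check and the copy step read the same values.
function takeoutLayout(takeoutDir: string): TakeoutLayout {
  const googlePhotosDir = path.join(takeoutDir, "Google Photos");
  return {
    googlePhotosDir,
    albumsDir: path.join(googlePhotosDir, "Albums"),
    photosDir: path.join(googlePhotosDir, "Photos"),
  };
}

const layout = takeoutLayout("/mnt");
console.log(layout.albumsDir, layout.photosDir);
```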

lukehmcc commented 9 months ago

@mtalexan Okay that should be better. I just did a test and it successfully processed & organized ~10k assets. Lmk how it goes for you.

(sorry testing took so long, I run mine on a network mounted drive which is quite slow)