inthreedee / photoprism-transfer-album

Transfer a Google Photos album to a new Photoprism album
GNU General Public License v3.0

Look up files by SHA hash, plus a bunch of usability/speed improvements #1

Closed cincodenada closed 2 years ago

cincodenada commented 2 years ago

Hello! I don't know if it really makes sense to merge this into this repo, since it adds a bunch of stuff that I don't want to put you on the hook to maintain or answer questions about, but I figured it was only polite to open a PR anyway. If you did want to merge we'd need to update the README a bit to reflect that it's an updated version of the original rather than a fork.

And I'm of course happy to just maintain my fork on its own, just wanted to properly appreciate the work I was building on. Thanks!

inthreedee commented 2 years ago

Thanks for the pull request! I’m glad someone else took the time to improve on this. I’m traveling at the moment and won’t be back for at least another month, but once I am, I’ll give it a proper look over.

If I do end up merging it, maybe I’ll add you to the project so we can have one source to maintain for everyone who finds this tool useful. Would you be okay with that or would you prefer to just maintain your own fork?

cincodenada commented 2 years ago

I would totally be up for being added to this project to help maintain it here, I think that makes a lot of sense. We can circle back on that when you're back 👍 Thanks for checking in, and I hope you enjoy your travels!

inthreedee commented 2 years ago

I'm back from my trip and am finally looking into this, now that I have more photos to import into my Photoprism.

@cincodenada am I correct that this takes a hash of the image from the Google Takeout and then sends that up to Photoprism in a request to add the photo with a matching hash to the album?

That's where my needs differ from most people's. I still have the original photos on my phone and upload them to both Google Photos (because it's a faster and cleaner UX for sharing with friends and family) and Photoprism (for my personal archive). Since Google Photos compresses images, I want to upload the originals to Photoprism. This means the hashes won't match, which is why I originally chose to search by file name.
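To illustrate why a hash lookup fails once a photo has been recompressed, here is a minimal Python sketch. It assumes a SHA-1 digest over the raw file bytes (the PR title only says "SHA hash"), and the byte strings are stand-ins rather than real image data. The script in this repo is not written in Python; this only demonstrates the principle.

```python
import hashlib


def sha1_hex(data: bytes) -> str:
    """Return the hex SHA-1 digest of the given bytes."""
    return hashlib.sha1(data).hexdigest()


# Stand-in for the bytes of an original photo (not real image data).
original = bytes(range(256)) * 4

# A byte-for-byte copy, e.g. the same file synced to a second location.
exact_copy = bytes(original)

# Stand-in for a recompressed version: even one changed byte is enough.
recompressed = original[:-1] + b"\x00"

print("copy matches original:        ", sha1_hex(exact_copy) == sha1_hex(original))
print("recompressed matches original:", sha1_hex(recompressed) == sha1_hex(original))
```

A hash only identifies identical bytes, so it cannot tell that two differently encoded files show the same picture. Matching by file name does not have that limitation, at the cost of being ambiguous when names repeat.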

I'm not sure if it's worth trying to handle both scenarios in one script or if we should keep yours as a separate fork. Thoughts?

shayaknyc commented 2 years ago

My photoprism albums were initially scanned from the giant folder on my NAS where I sync all the original, high-res photos from my phone. Google Photos, likewise, uploads a reduced resolution version to their servers, where I grouped them into albums. I can confirm that this properly matches the hashes in google photos to the OG photos from my NAS, and it properly grouped the OG photos into the Albums created in photoprism, reflective of the photos that are in Google.

inthreedee commented 2 years ago

> I can confirm that this properly matches the hashes in google photos to the OG photos from my NAS

Maybe I'm missing something, but if the takeout downloads Google's compressed versions of the images, the hashes shouldn't match. The only way that makes sense to me is if you're paying for, or are grandfathered into, the option to store the original quality files on Google Photos.

shayaknyc commented 2 years ago

I am grandfathered into the unlimited "high res" backup, and only for photos that were uploaded before the change-over. All the images in Google Photos are the reduced "high res" photos. Where I think you're mistaken is that it's not comparing file hashes from Google to your OG photos. What this script does is compare the Google Photos filename to the original photo filename. Once it finds a match, it looks up the Photoprism UUID for that photo and then pulls it into the Photoprism album, based on Photoprism's UUID/hash.

See here from the README:

1. A new Photoprism album is created.
2. It scans the json files in the Google Takeout directory, pulling out the title field.
3. It scans the yml files in the Photoprism sidecar directory, attempting to find a matching filename.
4. Once it finds a match, it pulls the photo's UID from the yml file.
5. An API request is sent to the server to add that UID to the album.
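The README steps above can be sketched roughly as follows. This is a hedged Python illustration, not the script's actual code: it assumes each sidecar file is named after its photo (for example `IMG_0001.yml`) and contains a line of the form `UID: <value>`, and the function names, sample file names, and UID value are all hypothetical. It stops at collecting UIDs and does not send the API request from step 5.

```python
import json
import re
import tempfile
from pathlib import Path

# Assumed sidecar layout: a top-level "UID: <value>" line (an assumption).
UID_LINE = re.compile(r"^UID:\s*(\S+)", re.MULTILINE)


def takeout_titles(album_dir: Path) -> list:
    """Step 2: collect the "title" field from each Takeout json file."""
    titles = []
    for json_file in sorted(album_dir.glob("*.json")):
        data = json.loads(json_file.read_text(encoding="utf-8"))
        if isinstance(data, dict) and data.get("title"):
            titles.append(data["title"])
    return titles


def sidecar_uids(sidecar_dir: Path) -> dict:
    """Steps 3-4: map each sidecar's base name to the UID found inside it."""
    index = {}
    for yml_file in sidecar_dir.rglob("*.yml"):
        match = UID_LINE.search(yml_file.read_text(encoding="utf-8"))
        if match:
            index[yml_file.stem.lower()] = match.group(1)
    return index


def match_uids(album_dir: Path, sidecar_dir: Path):
    """Return (matched UIDs, titles with no sidecar), compared by base name."""
    index = sidecar_uids(sidecar_dir)
    matched, missing = [], []
    for title in takeout_titles(album_dir):
        uid = index.get(Path(title).stem.lower())
        if uid:
            matched.append(uid)
        else:
            missing.append(title)
    return matched, missing


def demo():
    """Build a tiny fake Takeout album and sidecar tree, then match them."""
    with tempfile.TemporaryDirectory() as tmp:
        album = Path(tmp) / "takeout" / "Trip"
        sidecars = Path(tmp) / "sidecar" / "2023"
        album.mkdir(parents=True)
        sidecars.mkdir(parents=True)
        for name in ("IMG_0001.jpg", "IMG_0002.jpg"):
            (album / f"{name}.json").write_text(json.dumps({"title": name}))
        # Only the first photo has been indexed in this made-up example.
        (sidecars / "IMG_0001.yml").write_text("UID: pexample0001\nTitle: Demo\n")
        return match_uids(album, sidecars.parent)


matched, missing = demo()
print("matched UIDs:", matched)
print("no sidecar found for:", missing)
```

Titles that end up in `missing` correspond to photos that have not been indexed yet, which is the situation described in the next paragraph.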

This will only work if Photoprism has already scanned the original photos and stored them in its database, so that it has the filenames to begin with. If Photoprism doesn't already have a filename in its database, there is nothing to match against.

Hope this helps clarify....

inthreedee commented 2 years ago

@shayaknyc I think we're talking about two different things here. Yes, my original version of this script does rely on file names to do the comparisons and find the image's UUID. This pull request is a proposed change to use file hashes instead of file names, and I was asking about how that might affect my specific use-case.

It's helpful though, to know that I'm not the only one who uploads originals to both google and Photoprism and needs a way to handle that scenario.

shayaknyc commented 2 years ago

🤦‍♂️ duuur....you're right, I got my threads mixed up! LOL Yeah, I don't see how the hashes would work in the case you're describing unless it's the exact same filename, and filesize as your original photos.

inthreedee commented 2 years ago

I'm going to merge this into a new branch where I can iterate on it a bit as I go through and understand how it all works.

inthreedee commented 2 years ago

@cincodenada I've made a bunch of changes/improvements over in the hashes branch. There are now command line options for stuff like verbose mode, disabling batching, and using name matching instead of hash matching. I also restructured and reorganized a lot of it, quoted unquoted variables, tried to make the style more consistent, etc.

I'd really appreciate it if you could take a look and make sure I didn't break any of your code in the process. Consider this a rough draft for now, though; I haven't done much testing at all yet. Since my photos need to be matched by name, I'll only be able to test that mode.

I also haven't updated the readme at all yet.

Also also, let me know if you'd still like to be added as a maintainer on the project. It's totally fine if you'd prefer to maintain your own fork instead.

https://github.com/inthreedee/photoprism-transfer-album/tree/hashes