gilesknap / gphotos-sync

Google Photos and Albums backup with Google Photos Library API
Apache License 2.0

Using compare-folder: Download only extra_files and albums #218

Closed Cyber1000 closed 4 years ago

Cyber1000 commented 4 years ago

I just want to compare my local-folder with google, since I'm syncing with another mechanism. I'm using the docker-container if that matters ...

I don't want to download the whole google-photos, so I did this: --index-only --compare-folder /mylocalphotos /storage

This seems to work fine. I even get a comparison folder that lists the extra_files (the links point nowhere because the files aren't downloaded; extra_files in my case would be things like animations etc. that are created by Google).

It would be nice to get 2 things:

Thanks for this project!

gilesknap commented 4 years ago

Hi @Cyber1000.

First, I did not know compare-folder would work without downloading the files. One of the things it does is look in the EXIF of every file and use the unique identifier to match the comparison files. It does have fallback approaches when no unique ID is found, and I guess that is what is happening if it is only getting information from the database.

The effect of this is that I would not expect the comparison to be very reliable. However most modern cameras are OK at creating unique filenames and so the inaccuracies would only occur on files that predate this (I'm not sure when this became the norm, maybe around 2012, but it would vary by manufacturer).

Your requested features do make sense as long as you were happy that the comparison is good for your set of images.

They are non-trivial features that won't be easy to test. I go to lots of trouble to make sure that gphotos-sync does really reflect the contents of the Google Photos library (with the exception of listed known issues). I don't think it would be easy to confirm that this was the case for your requested features.

I'll keep this on the todo list and have a think about how to implement.

Cyber1000 commented 4 years ago

Thanks for your fast answer. My use case won't be the default use case for most people here, so I'll try to explain a little further:

I sync my phone with both my local server and Google's servers, so the names of the files should be equal. I regularly clean media from my phone and Google Photos, but not from my local server. Before I clean media from my phone I want to make sure that I have everything saved in Nextcloud.

My approach was to get the extra_files with your app. Since they came from the same source, a "name-matching" should be the only thing that needs to be done. I could also download everything with gphotos-sync again, but it would be a waste of disk space and download time if I just use this for comparison. So I would only need to download a few normal photos (which aren't synced by Nextcloud for whatever reason), some media Google creates automatically from time to time, and the album links.

As I said, this is perhaps not a normal use case, but it is a solution to my problem. Perhaps I'll go with a full download for now; it's somewhere around 10 GB on Google, and at least for now I have no local space problems.

Thanks for writing this program 👍

gilesknap commented 4 years ago

Thanks for the additional info. Just to be clear, your use case would still be likely to have duplicate filenames. The comparison feature ignores folder structure, because it is designed to be used against a different download service. So it still needs to distinguish all of the photos called by the same filename. For example, the google generated movies are all called Movie.mp4 so need additional metadata to distinguish them.
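
As a rough illustration of the duplicate-filename problem, the shell sketch below lists the file names that occur more than once under a directory tree once folder structure is ignored. The function name `list_duplicate_names` is made up for this example and is not part of gphotos-sync, and the sketch assumes file names contain no newline characters.

```shell
# Hypothetical helper, not part of gphotos-sync: print every file name that
# appears more than once under the given directory when folders are ignored.
# Assumes file names contain no newline characters.
list_duplicate_names() {
    find "$1" -type f |
        sed 's|.*/||' |   # strip the directory part, keep only the file name
        sort |
        uniq -d           # print each repeated name once
}
```

Running `list_duplicate_names /storage` would print names such as Movie.mp4 if several files share them; any name it prints cannot be matched reliably by name alone.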

Cyber1000 commented 4 years ago

OK, thanks for the clarification. I saw the Movie.mp4 files yesterday; I hadn't thought about them earlier.

I tried something different today:

docker-compose run --rm gphotos-sync --archived --start-date 2020-01-01 --albums-path google-albums --compare-folder /downloaded /storage

What happens here:

To clarify:

Is this intended behaviour, or is combining --start-date and --compare-folder not a valid use case?

gilesknap commented 4 years ago

I don't think compare-folder looks at start date.

Links to non-existent files is a bug (and a feature!). It would be easy for me to check for the file before making the link but I do like the fact that it provides an additional warning if there are missing files.

I agree that there should not be missing links to files when you have used a date range, but it would be hard to do that in a robust way given my current approach (rebuild all albums on any change) and limitations of the API (no date filters on album listings).

I'm afraid that the main focus of gphotos-sync is to get a full backup and it does that reasonably well. The date filters etc. were there for testing and recovering from partial failure. What you are trying to do is pushing the limits of what can be achieved.

The comparison feature is something I built to convince myself that I had a complete backup after a very messy transition from Google Drive Sync to Google Photos only. It is a fiddly process and required quite a bit of manual checking from me. The duplicate file name problem is really troublesome when trying to replicate a database in a file system.

I could help you out by providing an option to not link files that have not been synced in both comparison and albums because I think this is an easy addition. But I'm not convinced that you will have a robust solution that you can trust.

Cyber1000 commented 4 years ago

OK, I have found another solution. I'm using the start date now, which gives me a lot of dangling links in comparison and albums, and I delete them in a step afterwards. Since Nextcloud stores images as 2020/04/imagefilename.jpg, files and folders should have the same structure on both sides, so that shouldn't be a problem for me. And I'm only going back a month or two, so a clean sync with a start date shouldn't be a problem. As I said, Nextcloud loses some photos (1-3 per month), and the Google-created photos and albums are missing too.

So my solution was to extend the docker container with some additional scripts like this (deleting links which are pointing to nowhere and removing dirs which are empty then):

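# delete symlinks whose target does not exist (-xtype is a GNU find extension)
find "$cleandir" -xtype l -exec rm {} \;
# then remove directories left empty, deepest first
find "$cleandir" -depth -type d -empty -exec rmdir {} \;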
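
Before deleting anything, it may be worth listing what would be removed. The following is a minimal sketch of such a dry run. The function name `list_broken_links` is made up for this example, and it uses `test -e` instead of `-xtype` so that it does not depend on GNU find.

```shell
# Hypothetical dry-run helper: print symlinks under the given directory
# whose target does not exist, without deleting anything.
list_broken_links() {
    # test -e follows symlinks, so it fails for links with a missing target
    find "$1" -type l ! -exec test -e {} \; -print
}
```

Calling `list_broken_links "$cleandir"` prints one path per dangling link, which can be reviewed before running the deleting commands.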

I fully agree that I'm using this tool in a different way than intended. Perhaps it would be valid for the tool to do something like what I did as a post-step, since a link pointing to nowhere is pointless (wordplay intended :-) )

Perhaps that or closing this issue, since there might not be enough people using it this way. I found a workaround/poststep, so it works for me.

Thanks for your help!

gilesknap commented 4 years ago

@Cyber1000 I have concluded that you are correct and creating broken links is of no benefit.

I have pushed a change to master which should not create links when the source file does not exist, both in albums and comparisons.
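
The idea behind the change can be sketched in shell as follows. This is only an illustration of "check that the source exists before linking"; the actual change lives in gphotos-sync's Python code and may differ in detail, and the function name `link_if_source_exists` is made up for this example.

```shell
# Hypothetical sketch, not the gphotos-sync implementation: create a symlink
# only when the file it would point to actually exists.
link_if_source_exists() {
    src=$1
    dest=$2
    if [ -e "$src" ]; then
        ln -s "$src" "$dest"
    else
        # report the skipped link on stderr instead of creating a dangling one
        echo "skipping $dest: $src does not exist" >&2
    fi
}
```

With a check like this, an album or comparison folder only contains links to files that have actually been downloaded.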

Give it a try.

gilesknap commented 4 years ago

Closing this for now, feel free to continue the conversation in this thread.