gilesknap / gphotos-sync

Google Photos and Albums backup with Google Photos Library API
Apache License 2.0

Duplicate Photos from Shared Album Downloads #436

Closed Doug411 closed 1 year ago

Doug411 commented 1 year ago

First, I want to thank you for creating this. I installed it yesterday and am highly impressed with how well it works, as well as with the richness of its features.

However, I am getting a lot of duplicate photos in my backup. I believe it is because, within Google Photos online, I mistakenly "downloaded" thousands of photos from a shared album into my Google Photos library.

Now, when gphotos-sync runs, it first downloads my Google Photos library (which contains photos from the shared albums that I do not own). Then it downloads the shared album photos, and I get duplicates.

Is there a current way of avoiding the duplication? If not, would it be possible to add a flag to NOT download photos from shared albums where the name and file size are the same as a photo already downloaded? Or do you have a better idea?

FYI, my setup is rather complex. I'm running gphotos-sync on a Raspberry Pi (which does double duty for home automation and data transformation). My media storage is an 18 TB NTFS drive attached to an Nvidia Shield. After some struggling (and looking at this forum), I was able to get symbolic links to work, but only on the Raspberry Pi (not on my Nvidia Shield, or within Plex on the Nvidia Shield, for obvious reasons). I did this by using the NTFS flag within gphotos-sync and by mounting the NTFS share on my Raspberry Pi as a CIFS 3.0 drive with the mfsymlinks option (and removing nounix). I spun up another Plex server on my Raspberry Pi for photos only, since it can see the symbolic links, and linked this second Plex server into my Plex account. It's working great! All my media is within Plex, it's seamless across both Plex servers, and I have full functionality, including photo albums! If I could solve the duplicates, it would be perfect!

gilesknap commented 1 year ago

Hi @Doug411 in theory Google Photos itself should not store duplicates - it recognizes identical images uploaded twice.

If you really are getting duplicates then I would expect that you somehow have two copies of a photo with minor edits / different metadata in your library.

gphotos-sync should not download the same photo twice, even if it is referenced in multiple albums (including shared). It uses the guids that Google attaches to photos to determine that it already has a given item.
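As an illustrative model of that behaviour (not gphotos-sync's actual code), the sketch below keys downloads on an item id, so a repeated id is skipped. The function name `plan_downloads` and the `(media_id, filename)` input shape are assumptions made for this example. It also shows one way a `(2)` suffix could arise: two items with different ids that happen to share a filename.

```python
import os


def plan_downloads(items):
    """Return (media_id, local_name) pairs for an iterable of (media_id, filename).

    Hypothetical model: an id that was already seen is skipped, and a
    filename that was already used by a different id gets a ' (N)' suffix.
    """
    seen_ids = set()
    name_counts = {}
    plan = []
    for media_id, filename in items:
        if media_id in seen_ids:
            continue  # same item referenced again, e.g. via a second album
        seen_ids.add(media_id)
        count = name_counts.get(filename, 0) + 1
        name_counts[filename] = count
        if count == 1:
            local_name = filename
        else:
            stem, ext = os.path.splitext(filename)
            local_name = f"{stem} ({count}){ext}"
        plan.append((media_id, local_name))
    return plan
```

In this model, an item listed twice with the same id produces one file, while two distinct ids with the same filename produce `name.jpg` and `name (2).jpg`.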

When you say you get duplicates, do you mean in the photos folder itself - are they appearing with a (2) suffix?

gilesknap commented 1 year ago

I should also say that I have exactly your situation. I have shared albums but have also added the photos to my own library so I get to keep them if the owner takes down the shared album. This does not result in local duplicates for me. (at least that's what I assume you mean by "downloaded" - maybe you could elaborate on the process you used to get those shared album photos)

Doug411 commented 1 year ago

Yes, my situation is the same as yours! The photos in my local albums created by gphotos-sync aren't duplicated. However, I have a lot of photos with a (2) suffix in the local copy of my photos that gphotos-sync creates. I assumed that's because gphotos-sync first downloads my photos and then merges in (by month) the photos from my shared albums. Is that a correct understanding of why I have so many (2) photos in the local copy?

Also, I noticed today that a few new photos were added to the shared album (I don't own the album). However, gphotos-sync isn't triggering another sync. It claims it is up to date. What do I need to do to get it to recognize when new photos are posted to the shared album? (FYI, the shared album has 8K+ photos. I haven't counted how many (2) files I have, but it's significant.)

gilesknap commented 1 year ago

@Doug411 please can you check your local duplicate files and look to see if they are identical sizes?
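One way to run that check over a whole photos folder is sketched below. It is a minimal example, not part of gphotos-sync: the function name `find_suffixed_duplicates` is made up for illustration, and it assumes duplicates are named `name (N).ext` next to a `name.ext` in the same folder.

```python
import re
from pathlib import Path

# Matches names such as "20210703_171352 (2).mp4": stem, " (N)", extension.
SUFFIX_RE = re.compile(r"^(?P<stem>.+) \((?P<n>\d+)\)(?P<ext>\..*)?$")


def find_suffixed_duplicates(root):
    """Yield (original, copy, same_size) for each 'name (N).ext' under root
    that has a sibling 'name.ext' in the same folder."""
    for copy in sorted(Path(root).rglob("*")):
        if not copy.is_file():
            continue
        match = SUFFIX_RE.match(copy.name)
        if match is None:
            continue
        original = copy.with_name(match.group("stem") + (match.group("ext") or ""))
        if original.is_file():
            same_size = original.stat().st_size == copy.stat().st_size
            yield original, copy, same_size
```

Calling `find_suffixed_duplicates` on the photos root and counting the results where `same_size` is true gives a rough total of suffixed files whose size matches the unsuffixed file. Equal size alone does not prove identical content; comparing hashes would be a stricter test.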

gphotos-sync should not merge in the photos from shared albums, because it should see that they are identical files and that it already has them. The /albums/ folders are just soft links into /photos/, so the same photo can appear in multiple albums with no issues.
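Since the album entries are links, it is possible to list which of them point at a suffixed file. The sketch below is a hedged example that only uses the standard library; the function names are hypothetical, and it assumes the album tree contains ordinary symbolic links that the local system can read.

```python
import os
import re
from pathlib import Path

# A target name that ends in " (N)" before its extension, e.g. "x (2).jpg".
SUFFIXED_NAME_RE = re.compile(r" \(\d+\)(\.[^/]*)?$")


def album_link_targets(albums_root):
    """Yield (link_path, target_string) for every symbolic link under albums_root."""
    for entry in sorted(Path(albums_root).rglob("*")):
        if entry.is_symlink():
            yield entry, os.readlink(entry)


def links_to_suffixed_copies(albums_root):
    """Return the album links whose target file name carries a ' (N)' suffix."""
    return [
        (link, target)
        for link, target in album_link_targets(albums_root)
        if SUFFIXED_NAME_RE.search(Path(target).name)
    ]
```

Comparing the length of `links_to_suffixed_copies(...)` with the total number of links shows how many album entries resolve to a `(2)` file rather than to the unsuffixed one.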

Doug411 commented 1 year ago

Yes, identical sizes. My recollection is that when it started syncing the shared album, it first downloaded the (missing) photos to Photos-Doug. Then it created the symbolic links in Albums-Doug. It was during that initial download to Photos-Doug that it started creating (2) files, because many of them were already included in my photos. Most of the symbolic links it created for albums point to the (2) files. See the sample extract below. Also, it has not downloaded anything new from the shared album since the VERY FIRST RUN, even though new photos have been posted almost every day. I get the EXACT same log file every day. I'll post my log file, as well as the cron script that runs daily, below. I'm encouraged by what you stated (that something is wrong, since it is not intended to create redundant files when the filename and size are identical).

/mnt/MediaServer/Photos/Photos-Doug/2021/07 $ ls -l
total 1429504
-rwxrwxrwx 1 docker docker   728745 Jul 15 2021 '1626416385564 (2).jpg'
-rwxrwxrwx 1 docker docker   728745 Jul 15 2021 1626416385564.jpg
-rwxrwxrwx 1 docker docker  5184797 Jul  3 2021 '20210703_171352 (2).mp4'
-rwxrwxrwx 1 docker docker  5184797 Jul  3 2021 20210703_171352.mp4
-rwxrwxrwx 1 docker docker  6167278 Jul  3 2021 '20210703_171425 (2).mp4'
-rwxrwxrwx 1 docker docker  6167278 Jul  3 2021 20210703_171425.mp4
-rwxrwxrwx 1 docker docker  1670505 Jul  3 2021 '20210703_173545 (2).jpg'
-rwxrwxrwx 1 docker docker  1670505 Jul  3 2021 20210703_173545.jpg
-rwxrwxrwx 1 docker docker 12132103 Jul  3 2021 '20210703_173546 (2).mp4'
-rwxrwxrwx 1 docker docker 12132103 Jul  3 2021 20210703_173546.mp4
-rwxrwxrwx 1 docker docker  1447316 Jul  3 2021 '20210703_175821 (2).jpg'
-rwxrwxrwx 1 docker docker  1447316 Jul  3 2021 20210703_175821.jpg
-rwxrwxrwx 1 docker docker  1597525 Jul  3 2021 '20210703_175829 (2).jpg'
-rwxrwxrwx 1 docker docker  1597525 Jul  3 2021 20210703_175829.jpg
-rwxrwxrwx 1 docker docker  1912070 Jul  3 2021 '20210703_175831 (2).jpg'
-rwxrwxrwx 1 docker docker  1912070 Jul  3 2021 20210703_175831.jpg
-rwxrwxrwx 1 docker docker  1572560 Jul  3 2021 '20210703_175832 (2).mp4'
-rwxrwxrwx 1 docker docker  1572560 Jul  3 2021 20210703_175832.mp4
-rwxrwxrwx 1 docker docker  1686191 Jul  3 2021 '20210703_180015 (2).jpg'
-rwxrwxrwx 1 docker docker  1686191 Jul  3 2021 20210703_180015.jpg
-rwxrwxrwx 1 docker docker  1552236 Jul  4 2021 '20210704_200643 (2).jpg'
-rwxrwxrwx 1 docker docker  1552236 Jul  4 2021 20210704_200643.jpg
-rwxrwxrwx 1 docker docker  1707663 Jul  4 2021 '20210704_200646 (2).jpg'
-rwxrwxrwx 1 docker docker  1707663 Jul  4 2021 20210704_200646.jpg
-rwxrwxrwx 1 docker docker  1881334 Jul  4 2021 '20210704_200651 (2).jpg'
-rwxrwxrwx 1 docker docker  1881334 Jul  4 2021 20210704_200651.jpg
-rwxrwxrwx 1 docker docker  1337189 Jul  4 2021 '20210704_203128 (2).jpg'
-rwxrwxrwx 1 docker docker  1337189 Jul  4 2021 20210704_203128.jpg
-rwxrwxrwx 1 docker docker  1392022 Jul  4 2021 '20210704_203132 (2).jpg'

Dougie@pivpn:/opt/docker/gphotos $ sudo cat start-gphotos.sh && cat gphotos.log

docker run --rm -v /opt/docker/gphotos:/config -v /mnt/MediaServer/Photos:/storage -it ghcr.io/gilesknap/gphotos-sync --ntfs --db-path=/config --logfile=/config/gphotos.log --albums-path=/storage/Albums-Doug --photos-path=/storage/Photos-Doug /storage

06-23 05:22:36 gphotos_sync.Main WARNING gphotos-sync 3.1.2 2023-06-23 05:22:36.549688
06-23 05:22:36 gphotos_sync.Utils DEBUG MINIMUM_DATE = 1800-01-01 00:00:00
06-23 05:22:36 gphotos_sync.Checks DEBUG Checking if is filesystem supports symbolic links...
06-23 05:22:36 gphotos_sync.Checks DEBUG attempting to symlink /storage/test_src_2663519609 to /storage/test_dst_1785921432
06-23 05:22:36 gphotos_sync.Checks DEBUG Checking if File system supports unicode filenames...
06-23 05:22:36 gphotos_sync.Checks INFO Filesystem supports Unicode filenames
06-23 05:22:36 gphotos_sync.Checks DEBUG Checking if File system is case insensitive...
06-23 05:22:36 gphotos_sync.Checks INFO Case insensitive file system found
06-23 05:22:37 gphotos_sync.Checks INFO Max Path Length: 4096
06-23 05:22:37 gphotos_sync.Checks INFO Max filename length: 255
06-23 05:22:37 gphotos_sync.Main INFO version: 3.1.2, database schema version 5.7
06-23 05:22:37 gphotos_sync.BadIds DEBUG bad_ids file, loaded 0 bad ids
06-23 05:22:37 gphotos_sync.GooglePhotosIndex WARNING Indexing Google Photos Files ...
06-23 05:22:37 gphotos_sync.GooglePhotosIndex INFO searching for media start=2023-05-31 18:09:34, end=None, videos=True
06-23 05:22:37 gphotos_sync.GooglePhotosIndex DEBUG mediaItems.search with body: {'pageToken': None, 'pageSize': 100, 'filters': {'dateFilter': {'ranges': [{'startDate': {'year': 2023, 'month': 5, 'day': 31}, 'endDate': {'year': 3000, 'month': 1, 'day': 1}}]}, 'mediaTypeFilter': {'mediaTypes': ['ALL_MEDIA']}, 'featureFilter': {'includedFeatures': ['NONE']}, 'includeArchivedMedia': False}}
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG Skipped Index (already indexed) 1 /storage/Photos-Doug/2023/05/pxl_20230531_180934295.jpg
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG Skipped Index (already indexed) 2 /storage/Photos-Doug/2023/05/pxl_20230531_180857968.jpg
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG Skipped Index (already indexed) 3 /storage/Photos-Doug/2023/05/pxl_20230531_180848589.mp.jpg
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG Skipped Index (already indexed) 4 /storage/Photos-Doug/2023/05/pxl_20230531_180838275.mp.jpg
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG Skipped Index (already indexed) 5 /storage/Photos-Doug/2023/05/pxl_20230531_180745712.jpg
06-23 05:22:38 gphotos_sync.GooglePhotosIndex DEBUG search_media parsed 5 media_items with 100 PAGE_SIZE
06-23 05:22:38 gphotos_sync.GooglePhotosIndex WARNING indexed 0 items
06-23 05:22:38 gphotos_sync.GooglePhotosDownload WARNING Downloading Photos ...
06-23 05:22:38 gphotos_sync.GooglePhotosDownload WARNING Downloaded 0 Items, Failed 0, Already Downloaded 24249
06-23 05:22:38 gphotos_sync.LocalData INFO Saving Database ...
06-23 05:22:38 gphotos_sync.LocalData INFO Database Saved.
06-23 05:22:38 gphotos_sync.Main WARNING Done.
06-23 05:22:38 gphotos_sync.Main INFO Elapsed time = 0:00:02.039999

gilesknap commented 1 year ago

This does sound like a bug. As I said, I have the same scenario and I don't get this issue. For me to look into this in detail, I would need a copy of your database file, gphotos-sync.sqlite.

The data you would be sharing is just the file names of all of your photos and the names of your albums. I can't use it to access your library, as I'd need your token for that.
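Before sharing the file, it can be opened read-only to see what it holds. The sketch below makes no assumption about the gphotos-sync schema: it only queries SQLite's built-in `sqlite_master` catalogue and counts rows per table. The function name `list_tables` is made up for this example, and the path in the usage note is inferred from the `--db-path=/config` mapping in the command above.

```python
import sqlite3
from pathlib import Path


def list_tables(db_path):
    """Return {table_name: row_count} for every table in a SQLite file.

    The file is opened read-only, so the database is never modified and a
    missing file raises an error instead of being created.
    """
    uri = Path(db_path).resolve().as_uri() + "?mode=ro"
    con = sqlite3.connect(uri, uri=True)
    try:
        names = [
            row[0]
            for row in con.execute(
                "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
            )
        ]
        counts = {}
        for name in names:
            quoted = '"' + name.replace('"', '""') + '"'  # quote the identifier safely
            counts[name] = con.execute(f"SELECT COUNT(*) FROM {quoted}").fetchone()[0]
        return counts
    finally:
        con.close()
```

With the setup in this thread, the call would presumably be `list_tables("/opt/docker/gphotos/gphotos-sync.sqlite")` on the host.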

If you are OK with that please can you put it on a shared file system like dropbox or google docs and email the link to me at gilesknap@gmail.com.

gilesknap commented 1 year ago

If you would like to investigate further then I'd need the log file.