gilesknap / gphotos-sync

Google Photos and Albums backup with Google Photos Library API
Apache License 2.0
2.03k stars 167 forks source link

No longer downloading new content unless "--rescan" #230

Closed tdimarzio closed 1 year ago

tdimarzio commented 4 years ago

I guess this started happening the past week or so. Gphotos-sync has been running happily for months via a daily docker job. Never any issues. Now, all of the sudden, it's not downloading new photos unless I run with --rescan.

Here is the most recent output.

Mon May 4 04:00:10 UTC 2020: Starting SSH daemon ...

... and now, with a --rescan, it brings in the missing photos from yesterday:

root@docker_ubuntu:~/gphotos-sync# pipenv run gphotos-sync /gphotos-sync --rescan 05-04 16:59:27 WARNING gphotos-sync 2.14.0 2020-05-04 16:59:27.803469 05-04 16:59:28 WARNING Indexing Google Photos Files ... 05-04 17:09:15 WARNING indexed 15 items 05-04 17:09:18 WARNING Downloading Photos ... 05-04 17:09:19 WARNING Null response in mediaItems.batchGet dict_keys(['##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##', '##really_long_string##']) 05-04 17:09:28 WARNING Downloaded 15 Items, Failed 0, Already Downloaded 57972 05-04 17:09:28 WARNING Indexing Shared (titled) Albums ... 05-04 17:13:25 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-04 17:15:46 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-04 17:19:42 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-04 17:20:10 WARNING Indexed 516 Shared (titled) Albums 05-04 17:20:10 WARNING Indexing Albums ... 05-04 17:20:33 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-04 17:23:37 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-04 17:27:51 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/albums?pageToken=##really_long_string##&pageSize=50 05-04 17:41:49 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/albums?pageToken=##really_long_string##&pageSize=50 05-04 17:41:59 WARNING Retrying (Retry(total=3, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/albums?pageToken=##really_long_string##&pageSize=50 05-04 17:43:19 WARNING Indexed 611 Albums 05-04 17:43:21 WARNING Downloading Photos ... 05-04 17:43:22 WARNING Null response in mediaItems.batchGet dict_keys(['##really_long_string##']) 05-04 17:43:22 WARNING Downloaded 15 Items, Failed 0, Already Downloaded 57987 05-04 17:43:22 WARNING Creating album folder links to media ... 05-04 17:45:09 ERROR bad link to /gphotos-sync/photos/2008/08/IMG_1218.JPG 05-04 17:46:17 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_184828.mp4 05-04 17:46:17 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_185041.mp4 05-04 17:46:18 WARNING Created 53749 new album folder links 05-04 17:46:18 WARNING Done.

gilesknap commented 4 years ago

The dreaded Null Response in batchGet was an old issue with the Google API which was fixed but appears to have resurfaced. It means you have one corrupted item in your library and because I use batches to download it is affecting other items.

This was already reported and I have been working on restoring the old fix. If you would like to test it for me then please try the branch 'restore-bad-ids' and let me know the results. I'm pretty sure this is working, but its failing one of the CI tests on Windows, but this is only to do with the --archive feature.

You should end up with a file in the root of your backup called bad_ids.json, it will list the items that failed to download and will not try to download them again. The file contains links to the PhotosWeb items so you can see if they are available through a browser.

Thanks, giles.

tdimarzio commented 4 years ago

Hi Giles. Just ran a fresh --rescan using the restore-bad-ids branch and the output is definitely cleaner, though I don't have a bad_ids.json in the root of the backup: root@docker_ubuntu:~/gphotos-sync# pipenv run gphotos-sync /gphotos-sync --rescan 05-05 01:28:48 WARNING gphotos-sync 2.14.1b1+22.gc8b4c48.dirty 2020-05-05 01:28:48.046436 05-05 01:28:48 WARNING Indexing Google Photos Files ... 05-05 01:40:53 WARNING indexed 1 items 05-05 01:40:54 WARNING Downloading Photos ... 05-05 01:40:55 WARNING Null response in mediaItems.batchGet dict_keys(['AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw', 'AAg2MJHr-4OcGO7k52N1jcGwQrtP32UyCNtM1nwZOz9a4BulaW9AhhjiD_kOZB424Fl6ehJqA07auqF6ESHlt4HtTcjUy2dTzA']) 05-05 01:40:57 WARNING Downloaded 1 Items, Failed 0, Already Downloaded 57987 05-05 01:40:57 WARNING Indexing Shared (titled) Albums ... 05-05 01:45:18 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-05 01:51:19 WARNING Indexed 516 Shared (titled) Albums 05-05 01:51:19 WARNING Indexing Albums ... 05-05 02:00:19 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/mediaItems:search 05-05 02:13:03 WARNING Indexed 611 Albums 05-05 02:13:03 WARNING Downloading Photos ... 05-05 02:13:03 WARNING Null response in mediaItems.batchGet dict_keys(['AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw']) 05-05 02:13:03 WARNING Downloaded 1 Items, Failed 0, Already Downloaded 57988 05-05 02:13:04 WARNING Creating album folder links to media ... 05-05 02:14:49 ERROR bad link to /gphotos-sync/photos/2008/08/IMG_1218.JPG 05-05 02:15:30 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_184828.mp4 05-05 02:15:30 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_185041.mp4 05-05 02:15:30 WARNING Created 53749 new album folder links 05-05 02:15:30 WARNING Done.

... will see if the issue of new content not being found/downloaded is now gone. Thanks - Tony

gilesknap commented 4 years ago

The file should be called gphotos.bad_ids.yaml in fact. But the log shows 0 failed items so I would not expect it to be there. This may be a different issue to the one that the bad ids code is trying to fix.

I'll investigate further when you report back.

tdimarzio commented 4 years ago

The file gphotos.bad_ids.yaml is present, but it is empty (just contains two curly braces). I can confirm the issue that I am having is still present. It is not downloading any new content. Here is the most recent run, from restore-bad-ids branch: 05-05 15:57:54 WARNING gphotos-sync 2.14.1b1+22.gc8b4c48.dirty 2020-05-05 15:57:54.871930 05-05 15:57:55 WARNING Indexing Google Photos Files ... 05-05 15:57:56 WARNING indexed 0 items 05-05 15:58:21 WARNING Downloading Photos ... 05-05 15:58:22 WARNING Null response in mediaItems.batchGet dict_keys(['AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw']) 05-05 15:58:22 WARNING Downloaded 0 Items, Failed 0, Already Downloaded 57988 05-05 15:58:22 WARNING Done.

It should have downloaded 1 new picture. I'm sure if I run a --rescan it would get it though. What can I provide to help troubleshoot this further? Thanks - Tony

gilesknap commented 4 years ago

please can you run with --loglevel TRACE and send me the log file and the trace file. You can send direct to gilesknap@gmail.com if you pefer.

Does the failure to download persist after two days? I'm wondering if this is a timezone issue, also let me know which timezone you are in, please?

tdimarzio commented 4 years ago

I just emailed you the trace output from this command (pipenv run gphotos-sync /gphotos-sync --log-level TRACE --logfile /tmp/gphotos-sync_trace.log). Regarding the failure persisting for two days, I will try again tomorrow and let you know. I am in EDT, New York.

gilesknap commented 4 years ago

Thanks, I'll look at this ASAP, but it might have to wait to the weekend.

tdimarzio commented 4 years ago

No worries! Much appreciated in advance!

tdimarzio commented 4 years ago

Hi Giles. With the last nightly / scheduled execution of gphotos-sync (still using restore-bad-ids branch) it did identify and download a new file. So, perhaps this is indeed related to some time delay. I just took another test photo and will see if it comes through tonight, or the following night. Will update here.

gilesknap commented 4 years ago

That is useful info. Please can you report the file date and time of the backup of the photo and also the date and time reported by Google when you open it in Google photos Web.

tdimarzio commented 4 years ago

Date/time for download (Thurs, May 7, 12:00 AM EDT): Thu May 7 04:00:16 UTC 2020: Running gphotos-sync ... 05-07 04:00:17 WARNING gphotos-sync 2.14.1b1+22.gc8b4c48.dirty 2020-05-07 04:00:17.443524 05-07 04:00:17 WARNING Indexing Google Photos Files ... 05-07 04:00:19 WARNING indexed 1 items 05-07 04:00:46 WARNING Downloading Photos ... 05-07 04:00:47 WARNING Null response in mediaItems.batchGet dict_keys(['#removed#', '#removed#']) 05-07 04:00:48 WARNING Downloaded 1 Items, Failed 0, Already Downloaded 57988

Date/time from Google Photos Web: Monday, May 4, 2020, 10:47 PM (EDT)

Date/time in file system after gphotos-sync pulled the file down: Monday, May 4, 2020, 10:47 PM (EDT)

The previous gphotos-sync run at 12:00 AM EDT on May 6 did not pick it up. So, between May 6 midnight and May 7 midnight, something changed.

tdimarzio commented 4 years ago

Latest update: Last night's execution behaved perfectly normally. All 6 items (1 photo I took, 1 video I took, 4 pictures pulled in from shared library) downloaded: 05-08 04:00:17 WARNING gphotos-sync 2.14.1b1+22.gc8b4c48.dirty 2020-05-08 04:00:17.673604 05-08 04:00:18 WARNING Indexing Google Photos Files ... 05-08 04:00:21 WARNING indexed 6 items 05-08 04:00:49 WARNING Downloading Photos ... 05-08 04:00:50 WARNING Null response in mediaItems.batchGet dict_keys ##truncated for readability## 05-08 04:00:54 WARNING Downloaded 6 Items, Failed 0, Already Downloaded 57989 05-08 04:00:54 WARNING Indexing Shared (titled) Albums ... 05-08 04:01:13 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/sharedAlbums?pageSize=50&pageToken=##truncated for readability## 05-08 04:01:32 WARNING Indexed 517 Shared (titled) Albums 05-08 04:01:32 WARNING Indexing Albums ... 05-08 04:02:06 WARNING Indexed 611 Albums 05-08 04:02:07 WARNING Downloading Photos ... 05-08 04:02:07 WARNING Null response in mediaItems.batchGet dict_keys(['##long_string##']) 05-08 04:02:07 WARNING Downloaded 6 Items, Failed 0, Already Downloaded 57995 05-08 04:02:07 WARNING Creating album folder links to media ... 05-08 04:04:14 ERROR bad link to /gphotos-sync/photos/2008/08/IMG_1218.JPG 05-08 04:04:59 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_184828.mp4 05-08 04:04:59 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_185041.mp4 05-08 04:05:00 WARNING Created 53749 new album folder links 05-08 04:05:00 WARNING Done. ... I've changed nothing. So, not sure why it's behaving now.

gilesknap commented 4 years ago

This still may be timezone related, did you sync at a different time of day to previously?

On Fri, 8 May 2020 at 17:29, tdimarzio notifications@github.com wrote:

Latest update: Last night's execution behaved perfectly normally. All 6 items (1 photo I took, 1 video I took, 4 pictures pulled in from shared library) downloaded: 05-08 04:00:17 WARNING gphotos-sync 2.14.1b1+22.gc8b4c48.dirty 2020-05-08 04:00:17.673604 05-08 04:00:18 WARNING Indexing Google Photos Files ... 05-08 04:00:21 WARNING indexed 6 items 05-08 04:00:49 WARNING Downloading Photos ... 05-08 04:00:50 WARNING Null response in mediaItems.batchGet dict_keys

truncated for readability

05-08 04:00:54 WARNING Downloaded 6 Items, Failed 0, Already Downloaded 57989 05-08 04:00:54 WARNING Indexing Shared (titled) Albums ... 05-08 04:01:13 WARNING Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='photoslibrary.googleapis.com', port=443): Read timed out. (read timeout=10)",)': /v1/sharedAlbums?pageSize=50&pageToken=##truncated for readability## 05-08 04:01:32 WARNING Indexed 517 Shared (titled) Albums 05-08 04:01:32 WARNING Indexing Albums ... 05-08 04:02:06 WARNING Indexed 611 Albums 05-08 04:02:07 WARNING Downloading Photos ... 05-08 04:02:07 WARNING Null response in mediaItems.batchGet dict_keys(['##long_string##']) 05-08 04:02:07 WARNING Downloaded 6 Items, Failed 0, Already Downloaded 57995 05-08 04:02:07 WARNING Creating album folder links to media ... 05-08 04:04:14 ERROR bad link to /gphotos-sync/photos/2008/08/IMG_1218.JPG 05-08 04:04:59 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_184828.mp4 05-08 04:04:59 ERROR bad link to /gphotos-sync/photos/2015/12/VID_20151213_185041.mp4 05-08 04:05:00 WARNING Created 53749 new album folder links 05-08 04:05:00 WARNING Done. ... I've changed nothing. So, not sure why it's behaving now.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/gilesknap/gphotos-sync/issues/230#issuecomment-625898838, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHLRW2Z54BQ74IOTL23H5DRQQXQBANCNFSM4MY6P3HQ .

tdimarzio commented 4 years ago

Nope. The script runs every night at the same time via a fresh docker execution.

tdimarzio commented 4 years ago

Tonight's nightly execution succeeded as well. It's as if the problem magically disappeared. If I can't reproduce the issue any more I guess it will be pretty hard to root-cause, until/unless it returns.

gilesknap commented 4 years ago

I've taken a look at the info you sent.

It seems we do not have a timezone issue, which is interesting because I have always assumed that there would be one and have never got around to looking into it. So thanks for clearing that up.

You do have one corrupted item in your library and that is causing the null media item message. It is happening at a different place to where the 'BadIds' code operates which is why it is not appearing in the bad ids file. There is a good chance that this corrupt item was causing the other issue that went away.

We can have a quick go at finding and maybe removing this item with a bit of sql.

You will need to install sqlite3 on your system and then go to your photos download folder. Then:


[giles@gklinux GilesPhotos]$ sqlite3 gphotos.sqlite 
SQLite version 3.31.1 2020-01-27 19:55:54
Enter ".help" for usage hints.
sqlite> select Path, FileName, CreateDate, Url from SyncFiles where RemoteId = "AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw";

This should tell you the name of the offending file and give a URL linking to the entry in your photos library. I have before seen this and been able to link to the item which shows as having an error and then delete it via the photos API.

tdimarzio commented 4 years ago

This was the output of sqlite; root@docker_ubuntu:/gphotos-sync# sqlite3 gphotos.sqlite SQLite version 3.22.0 2018-01-22 18:45:57 Enter ".help" for usage hints. sqlite> select Path, FileName, CreateDate, Url from SyncFiles where RemoteId = "AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw";

photos/2020/05|IMG_20200502_212322.jpg|2020-05-03 01:23:22|https://photos.google.com/lr/photo/AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw

... I cannot find any photo in my Google Photos library taken at 2020-05-03 01:23:22.

gilesknap commented 4 years ago

try going to the URL that you have pasted above it should take you to google photos and open the offending item.

On Sat, 9 May 2020 at 17:50, tdimarzio notifications@github.com wrote:

This was the output of sqlite; root@docker_ubuntu:/gphotos-sync# sqlite3 gphotos.sqlite SQLite version 3.22.0 2018-01-22 18:45:57 Enter ".help" for usage hints. sqlite> select Path, FileName, CreateDate, Url from SyncFiles where RemoteId = "AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw";

photos/2020/05|IMG_20200502_212322.jpg|2020-05-03 01:23:22| https://photos.google.com/lr/photo/AAg2MJGcPI5j-1GLM5D-kCBV6Y75dU0iJi8KZm-JOUjTAISy3_COuztQUYFHxhSH56mboTGwle4xlExwF9sscMKFkDShJ7rmmw

... I cannot find any photo in my Google Photos library taken at 2020-05-03 01:23:22.

— You are receiving this because you were assigned. Reply to this email directly, view it on GitHub https://github.com/gilesknap/gphotos-sync/issues/230#issuecomment-626204500, or unsubscribe https://github.com/notifications/unsubscribe-auth/AAHLRW2FL645WFXE75KVSNLRQWCUXANCNFSM4MY6P3HQ .

tdimarzio commented 4 years ago

Unfortunately, the link simply gives me a Google error page "Can’t access photo This photo doesn’t exist or your current account doesn’t have permission to view it."

gilesknap commented 4 years ago

In that case I'm afraid you are stuck with it. But at least the affects are now benign.

I'll close the issue but feel free to continue the thread if you have any further questions.

tdimarzio commented 4 years ago

Thanks, Giles. Yes, at least the error is mostly noise at this point and, after what amounted to a few day hiatus, the nightly gphotos-sync is working well again. Thanks again.

JPnux commented 1 year ago

Hello I have a similar problem. No longer downloading new content unless "--rescan"

I used the script via Docker for months without any issues.

I wanted to start over since I reorganized my Google Photos by deleting everything and reuploading my photos. I deleted the gphotos-sync files except for the secret file.

I reinstalled the Docker container.

docker run -d \ --name="gphotos-sync" \ -v /volume1/docker/gphotos-sync/:/config:rw \ -v /volume1/Photos/:/storage:rw \ gilesknap/gphotos-sync:latest \ /storage

I launched the container and it indexed the files properly.

But now when I launch the container just with /storage as an argument, it doesn't see the new photos. I have to do a --rescan for it to work.

Strange.

Was deleting the gphotos files... a good way to start over from scratch?

Docker with Synology DSM 7.2.

A made a screenshot of the logs : there is no error but gphotos-sync doesn't see the new photos uploaded after the first scan at 14:15...

image

Thank you

gilesknap commented 1 year ago

@JPnux can you run the above again and try adding the option --log-level trace and send me the output, plus the file ending in .trace in your root folder.

JPnux commented 1 year ago

@gilesknap Last night gphotos-sync ran normally (I had scheduled a task) and it downloaded the added photo...

Maybe there is a minimum delay that needs to be respected between when a photo is added to Google Photos and when it becomes visible via the API?

I will do another test. I will add a photo to Google Photos this morning and we'll see if tonight gphotos-sync uploads it.

gilesknap commented 1 year ago

@JPnux perhaps you are experiencing the same issue as the OP above.

JPnux commented 1 year ago

it's working normally image

gilesknap commented 1 year ago

OK thanks for letting me know. I can't guess what is going on other than to say transient issues are probably related to Google servers.

Closing for now. Feel free to post here again if issues re-surface.