Closed: karan closed this issue 2 years ago.
And, 10 mins after creating this issue it failed:
05-30 04:56:24 ERROR Request failed with status 429: b'{
  "error": {
    "code": 429,
    "message": "Quota exceeded for quota metric 'All requests' and limit 'All requests per day' of service 'photoslibrary.googleapis.com' for consumer 'project_number:$PROJECT_NUM'.",
    "status": "RESOURCE_EXHAUSTED",
    "details": [
      {
        "@type": "type.googleapis.com/google.rpc.ErrorInfo",
        "reason": "RATE_LIMIT_EXCEEDED",
        "domain": "googleapis.com",
        "metadata": {
          "consumer": "projects/$PROJECT_NUM",
          "quota_metric": "photoslibrary.googleapis.com/all_requests",
          "quota_limit": "ApiCallsPerProjectPerDay",
          "quota_location": "global",
          "service": "photoslibrary.googleapis.com",
          "quota_limit_value": "10000"
        }
      }
    ]
  }
}'
05-30 04:56:24 ERROR
Process failed.
Traceback (most recent call last):
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/Main.py", line 493, in main
    self.start(args)
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/Main.py", line 433, in start
    self.do_sync(args)
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/Main.py", line 397, in do_sync
    self.google_photos_idx.index_photos_media()
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/GooglePhotosIndex.py", line 148, in index_photos_media
    items_json = self.search_media(
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/GooglePhotosIndex.py", line 116, in search_media
    return self._api.mediaItems.list.execute(  # type: ignore
  File "/root/.local/lib/python3.10/site-packages/gphotos_sync/restclient.py", line 106, in execute
    result.raise_for_status()
  File "/root/.local/lib/python3.10/site-packages/requests/models.py", line 960, in raise_for_status
    raise HTTPError(http_error_msg, response=self)
requests.exceptions.HTTPError: 429 Client Error: Too Many Requests for url: https://photoslibrary.googleapis.com/v1/mediaItems?pageSize=100
Some more notes: --threads 2 (the default is 20 I think, so combined this should result in 40x fewer calls per 24 hours). Maybe that will get around the rate limit. Do you have other suggestions?
Hi @karan, thanks for the report.
This update includes more aggressive backoff when encountering 429 errors (API quota exceeded). Unfortunately it seems that it's not enough to help for those with a decent connection!
I think the long delays without debug output are caused by this backoff happening inside the urllib library, which eventually gives up. Hence there is no chance for gphotos-sync to do any logging. We could probably turn on logging in the URL layer, but it would be pretty verbose.
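To illustrate what that would look like, here is a minimal sketch of switching on HTTP-layer logging with the standard requests/urllib3 stack (generic Python, not gphotos-sync code):

```python
# Minimal sketch: make urllib3's connection/retry activity visible.
import logging
import http.client

# Root logger at DEBUG so urllib3's messages are not filtered out.
logging.basicConfig(level=logging.DEBUG)

# urllib3 logs each connection, request and retry decision at DEBUG.
logging.getLogger("urllib3").setLevel(logging.DEBUG)

# Optionally dump the raw HTTP exchange as well (very verbose).
http.client.HTTPConnection.debuglevel = 1
```

Every connection, request and retry decision then shows up, which is exactly why it would be so noisy in normal runs.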
So there are two levels of quota: a calls-per-second limit and a maximum per day. My backoff strategy does help with the first, at least when I'm running multiple simultaneous downloads in the GitHub Actions - that is the only place I get to see the error, since I only have 100 Mbit/s at home.
With the daily quota, I'm not sure how often this rolls over. If it is a daily rollover then there is no way to fix the issue except to wait for the next rollover time.
You can mitigate this by dropping the thread count to 1 (--threads 1).
I'm open to suggestions on how to cope with this, but like I said, if there is a daily reset then there is not much that can be done. The graph you posted could imply this, as there is only one spike per day, but you'd need to run a scan more than once a day to prove it.
I just noticed that you are running once an hour. Now I'm not sure I understand the graph you posted. I'll have to think about this...
FYI: there is debug logging in the indexing, but when we make a call to the REST API it is a single call that keeps retrying inside the URL library, so our debug output does not show.
It may be that I have made this too aggressive, as I think it can end up spending over an hour in one call. I did this to try to cope with the daily quota being exceeded - but that does not seem to have worked.
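For context, this kind of backoff is usually wired up roughly like the sketch below (a generic requests/urllib3 example; the retry counts and backoff factor are illustrative, not the exact gphotos-sync settings). From the caller's side it is still one blocking call, which is why nothing gets logged while it waits:

```python
# Generic requests/urllib3 retry sketch; numbers are illustrative only.
import requests
from requests.adapters import HTTPAdapter
from urllib3.util.retry import Retry

retries = Retry(
    total=8,                          # give up after this many attempts
    backoff_factor=4,                 # exponential backoff between attempts
    status_forcelist=[429],           # only back off on "Too Many Requests"
    respect_retry_after_header=True,  # honour a Retry-After header if present
)

session = requests.Session()
session.mount("https://", HTTPAdapter(max_retries=retries))

# From gphotos-sync's point of view this would still be one blocking call;
# all of the sleeping happens inside urllib3. (Auth headers omitted here.)
response = session.get(
    "https://photoslibrary.googleapis.com/v1/mediaItems", params={"pageSize": 100}
)
response.raise_for_status()
```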
But wait. I have an answer for you:
you are using --flush-index.
This causes it to fully search your entire library each time, so there are many more API calls in the indexing phase.
If you leave this out and use 1 thread you might be OK. The default behaviour is to index only from the date of the previous index.
If you really want to re-index, then just do it fewer times a day.
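To make the difference concrete: a full re-index pages through the whole library, whereas an incremental index only needs the items newer than the previous run. Roughly, in terms of the public Photos Library REST API (field names come from that API, not from gphotos-sync's internals):

```python
# Rough sketch of an incremental index using the Photos Library API's
# date filter; gphotos-sync's real index logic is more involved than this.
from datetime import date
import requests

SEARCH_URL = "https://photoslibrary.googleapis.com/v1/mediaItems:search"

def index_since(session: requests.Session, last_index: date) -> list:
    """Fetch only media items created since the previous index run.

    `session` is assumed to already carry a valid OAuth bearer token.
    """
    today = date.today()
    body = {
        "pageSize": 100,
        "filters": {
            "dateFilter": {
                "ranges": [{
                    "startDate": {"year": last_index.year, "month": last_index.month, "day": last_index.day},
                    "endDate": {"year": today.year, "month": today.month, "day": today.day},
                }]
            }
        },
    }
    items = []
    while True:
        resp = session.post(SEARCH_URL, json=body)
        resp.raise_for_status()
        data = resp.json()
        items.extend(data.get("mediaItems", []))
        token = data.get("nextPageToken")
        if not token:
            return items
        body["pageToken"] = token
```

With --flush-index the date range is effectively the whole library, so the number of pages (and API calls) scales with the library size instead of with what is new since the last run.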
Yeah, I suppose there isn't a way around the 24-hour rate limit. I can confirm though that listing the files works now, after leaving the script not running overnight (side note: the arg is --max-threads and not --threads).
As for the logging, I think it would be useful to log within urllib at maybe trace level?
you are using --flush-index.
Yeah, it seems like I need to because I use --do-delete.
It may be that I have made this too aggressive, as I think it can end up spending over an hour in one call. I did this to try to cope with the daily quota being exceeded - but that does not seem to have worked.
Maybe the backoff itself could log something? I had no idea what was going on with the script until about 30 mins later.
Re logging in the backoff - we need urllib logging for that, as it's a single call from my code. I'll look into it.
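One option (just a sketch, not something that exists in gphotos-sync today) would be a Retry subclass that logs each time it decides to back off, so the long waits are at least visible without full urllib3 debug output:

```python
# Hypothetical LoggingRetry: logs each backoff decision. Not part of
# gphotos-sync; shown only to illustrate the idea.
import logging
from urllib3.util.retry import Retry

log = logging.getLogger("gphotos.backoff")  # hypothetical logger name

class LoggingRetry(Retry):
    def increment(self, *args, **kwargs):
        new_retry = super().increment(*args, **kwargs)
        log.warning(
            "Retrying after HTTP error: %s attempts left, next sleep ~%.0fs",
            new_retry.total,
            new_retry.get_backoff_time(),
        )
        return new_retry
```

It would slot in wherever the plain Retry object is configured, e.g. HTTPAdapter(max_retries=LoggingRetry(...)).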
For --do-delete, you could always schedule that only once a day and do a normal sync for the more frequent updates. You can schedule as often as you like because the lock file handles overlapping instances.
That's reasonable. I'm actually running the script twice a day now, and so far it looks okay.
Ubuntu 20.04.4.
$BASE below is a local SSD, and $STORAGE is an NFS mount. Not sure exactly when this started happening, but it seems like I just can't get the script to get past indexing:
And command output:
I've let it run for 20+ minutes and it's just stuck.
I upgraded to the latest v3 image (and generated a token just fine) and deleted the lock file, but the command output doesn't progress. I can't even get any more useful logs with --log-level debug.