Richard-Weiss / Bing-Creator-Image-Downloader

Downloads all Bing Creator images from a collection
MIT License

Enhancement: Allow retrying of downloads with the clipboard method and optionally make retries configurable #34

Status: Closed (closed by rc-gr 10 months ago)

rc-gr commented 10 months ago

With the clipboard method, I noticed that a small portion of images often fails to download. For example, in my last 3 consecutive runs on one of my collections, which contains 157 images, I got these results:

First run (144/157):

2024-01-13 18:32:07,483 INFO Fetching metadata of images...
2024-01-13 18:32:09,041 ERROR Failed to get detailed information for image: {'image_set_id': '...885c', 'image_id': 'ExYd...'} for Reason: API response is missing data.
2024-01-13 18:32:09,173 ERROR Failed to get detailed information for image: {'image_set_id': '...67cc', 'image_id': '%2bcc...'} for Reason: API response is missing data.
2024-01-13 18:32:10,031 ERROR Failed to get detailed information for image: {'image_set_id': '...d3d9', 'image_id': 'g4A4...'} for Reason: API response is missing data.
2024-01-13 18:32:10,065 ERROR Failed to get detailed information for image: {'image_set_id': '...9d0d', 'image_id': 'kQG9...'} for Reason: API response is missing data.
2024-01-13 18:32:10,115 ERROR Failed to get detailed information for image: {'image_set_id': '...bbf5', 'image_id': 'TVL%2...'} for Reason: API response is missing data.
2024-01-13 18:32:10,155 ERROR Failed to get detailed information for image: {'image_set_id': '...b857', 'image_id': 'uFgm...'} for Reason: API response is missing data.
2024-01-13 18:32:10,188 ERROR Failed to get detailed information for image: {'image_set_id': '...b93a', 'image_id': 'JWXd...'} for Reason: API response is missing data.
2024-01-13 18:32:10,204 ERROR Failed to get detailed information for image: {'image_set_id': '...0a60', 'image_id': 'sDIV...'} for Reason: API response is missing data.
2024-01-13 18:32:10,253 ERROR Failed to get detailed information for image: {'image_set_id': '...4c1e', 'image_id': 'CpKh...'} for Reason: API response is missing data.
2024-01-13 18:32:10,276 ERROR Failed to get detailed information for image: {'image_set_id': '...e797', 'image_id': '3I8U...'} for Reason: API response is missing data.
2024-01-13 18:32:11,151 ERROR Failed to get detailed information for image: {'image_set_id': '...aa04', 'image_id': 'Rrwd...'} for Reason: API response is missing data.
2024-01-13 18:32:11,305 ERROR Failed to get detailed information for image: {'image_set_id': '...fc20', 'image_id': 'lRgX...'} for Reason: API response is missing data.
2024-01-13 18:32:11,707 ERROR Failed to get detailed information for image: {'image_set_id': '...a8e8', 'image_id': '6du8...'} for Reason: API response is missing data.
2024-01-13 18:32:14,118 INFO Starting download of 144 images.
...

Second run (150/157):

2024-01-13 18:32:39,399 INFO Fetching metadata of images...
2024-01-13 18:32:46,667 ERROR Failed to get detailed information for image: {'image_set_id': '...60c6', 'image_id': 'uM%2f...'} for Reason: API response is missing data.
2024-01-13 18:32:46,793 ERROR Failed to get detailed information for image: {'image_set_id': '...24fb', 'image_id': 'YZCo...'} for Reason: API response is missing data.
2024-01-13 18:32:47,106 ERROR Failed to get detailed information for image: {'image_set_id': '...296e', 'image_id': 'N8v8...'} for Reason: API response is missing data.
2024-01-13 18:32:49,198 ERROR Failed to get detailed information for image: {'image_set_id': '...aa5e', 'image_id': 'jIGo...'} for Reason: API response is missing data.
2024-01-13 18:32:49,945 ERROR Failed to get detailed information for image: {'image_set_id': '...02d9', 'image_id': 'Yfxu...'} for Reason: API response is missing data.
2024-01-13 18:32:50,189 ERROR Failed to get detailed information for image: {'image_set_id': '...eb25', 'image_id': 'CghK...'} for Reason: API response is missing data.
2024-01-13 18:32:50,232 ERROR Failed to get detailed information for image: {'image_set_id': '...aa04', 'image_id': 'Rrwd...'} for Reason: API response is missing data.
2024-01-13 18:32:51,791 INFO Starting download of 150 images.
...

Third run (147/157):

2024-01-13 18:51:14,532 INFO Fetching metadata of images...
2024-01-13 18:51:16,316 ERROR Failed to get detailed information for image: {'image_set_id': '...deff', 'image_id': 'AzG4...'} for Reason: API response is missing data.
2024-01-13 18:51:16,806 ERROR Failed to get detailed information for image: {'image_set_id': '...60f5', 'image_id': 'rwQd...'} for Reason: API response is missing data.
2024-01-13 18:51:16,891 ERROR Failed to get detailed information for image: {'image_set_id': '...e1fb', 'image_id': 'LByG...'} for Reason: API response is missing data.
2024-01-13 18:51:17,123 ERROR Failed to get detailed information for image: {'image_set_id': '...641d', 'image_id': 'wuZo...'} for Reason: API response is missing data.
2024-01-13 18:51:17,299 ERROR Failed to get detailed information for image: {'image_set_id': '...885c', 'image_id': 'ExYd...'} for Reason: API response is missing data.
2024-01-13 18:51:17,900 ERROR Failed to get detailed information for image: {'image_set_id': '...fcf3', 'image_id': 'iZnR...'} for Reason: API response is missing data.
2024-01-13 18:51:17,929 ERROR Failed to get detailed information for image: {'image_set_id': '...2dd9', 'image_id': 'AHa3...'} for Reason: API response is missing data.
2024-01-13 18:51:18,179 ERROR Failed to get detailed information for image: {'image_set_id': '...199f', 'image_id': 'zmX3...'} for Reason: API response is missing data.
2024-01-13 18:51:18,988 ERROR Failed to get detailed information for image: {'image_set_id': '...b2c1', 'image_id': 'znuX...'} for Reason: API response is missing data.
2024-01-13 18:51:19,116 ERROR Failed to get detailed information for image: {'image_set_id': '...e5ac', 'image_id': 'AoV%2...'} for Reason: API response is missing data.
2024-01-13 18:51:19,277 INFO Starting download of 147 images.
...

I believe incomplete runs like this are common, because even when I save images to my collection in Edge with the "Save" button on the image's page, I occasionally run into the "page not found" page instead of the usual page shown while the images have yet to expire (screenshot: ImageCreatorPageNotFound).

However, after I switched to the API method, with #33 implemented, the program downloaded all the images in one try:

2024-01-13 19:06:30,676 INFO Fetching metadata of collections...
2024-01-13 19:06:38,019 INFO Starting download of 157 images.
...
2024-01-13 19:06:50,112 INFO Successfully downloaded 157 of 157 images in 19.44 seconds.

What I'm suggesting is to let the program re-request the images whose responses are malformed like the ones above, while retaining their index (i.e. a re-request should not move an image with index 0123, for example, to the highest index + 1).

Then, since each user's tolerance differs, the config file could also provide a setting for the maximum number of retries to perform for each of those images.
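
The suggestion above could be sketched roughly as follows. This is only an illustration under assumptions, not the project's actual code: `fetch_with_retries`, the `fetch_detail` callback and the `max_retries` parameter are hypothetical names, and a missing-data response is modelled as `None`.

```python
import asyncio
from typing import Awaitable, Callable, Optional

# Hypothetical callback: takes an image descriptor and returns its detail
# data, or None when the API response is missing data.
FetchDetail = Callable[[dict], Awaitable[Optional[dict]]]


async def fetch_with_retries(
    images: list[dict],
    fetch_detail: FetchDetail,
    max_retries: int = 3,  # assumed to come from the config file
) -> dict[int, dict]:
    """Return {original_index: detail} for every image that eventually succeeded."""
    results: dict[int, dict] = {}
    pending = list(range(len(images)))  # indices still lacking valid data

    # One initial pass plus up to max_retries further passes.
    for _ in range(max_retries + 1):
        if not pending:
            break
        responses = await asyncio.gather(
            *(fetch_detail(images[i]) for i in pending)
        )
        still_failing = []
        for index, detail in zip(pending, responses):
            if detail is None:
                still_failing.append(index)
            else:
                results[index] = detail  # keyed by the original position
        pending = still_failing
    return results
```

Because results are keyed by the image's original position, an image that only succeeds on a later pass keeps the index it would have had on the first pass.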

Richard-Weiss commented 10 months ago

@rc-gr The reason it works with the collection API is that it can fall back to the collection thumbnail, which the clipboard method doesn't have. Both methods now use the detail API in the same way through #33, and they did essentially the same before that.
Each request is attempted 3 times with an exponential timeout.
The detail API is just a bit finicky and doesn't handle large loads well. Because aiohttp-retry returns the last response regardless of its HTTP status code, you see these logs. I'm not sure how to prevent that other than by increasing the timeout even more.

Richard-Weiss commented 10 months ago

@rc-gr I've increased the max_timeout to 128 seconds and attempts to 8 for the detail API requests. Fetching the metadata using the file method for 1600 images takes 6 to 60 seconds for me and there are no errors even when doing it multiple times in a short time span.
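
To give a rough sense of what the change means for waiting time, here is a small sketch of a capped exponential backoff schedule. Only the attempt counts (3 and 8) and the 128-second cap come from this thread; the starting delay of 2 seconds, the factor of 2, the cap applied to the old setting and the helper name `backoff_delays` are assumptions for illustration, and the actual formula used by aiohttp-retry may differ.

```python
def backoff_delays(attempts: int, start: float, factor: float, cap: float) -> list[float]:
    """Waits (in seconds) between consecutive attempts, each capped at `cap`."""
    # `attempts` requests leave `attempts - 1` gaps in which to wait.
    return [min(start * factor ** i, cap) for i in range(attempts - 1)]


# Assumed start delay and factor; attempts and the 128 s cap are from the thread.
old = backoff_delays(attempts=3, start=2.0, factor=2.0, cap=128.0)
new = backoff_delays(attempts=8, start=2.0, factor=2.0, cap=128.0)

print(old, sum(old))
print(new, sum(new))
```

Under these assumptions, a request that keeps failing would wait far longer in total with 8 attempts than with 3, which gives the detail API more time to recover under load.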