mikf / gallery-dl

Command-line program to download image galleries and collections from several image hosting sites
GNU General Public License v2.0
10.68k stars 881 forks

V1.27.0 - Too Many Requests Error on Instagram #5677

Open alvvaysu opened 3 weeks ago

alvvaysu commented 3 weeks ago

With gallery-dl v1.27.0, downloading from Instagram often results in a 'too many requests' error, leading to warnings from Instagram about automation. However, with the same configuration file, v1.26.9 rarely encounters this 'too many requests' error.

Below is the config.json:

    {
        "extractor": {
            "instagram": {
                "include": ["stories", "posts", "highlights", "tagged", "reels", "avatar"],
                "videos": true,
                "sleep-request": [9.0, 17.0],
                "sleep": [1.5, 5.0],
                "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/118.0.0.0 Safari/537.36"
            },
            "vk": {
                "user-agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/124.0.0.0 Safari/537.36 Edg/124.0.0.0",
                "include": ["albums", "photos", "tagged_photos"]
            }
        }
    }

mikf commented 3 weeks ago

There weren't any changes to the Instagram code between v1.26.9 and v1.27.0.

The only thing I can think of is the new sleep-429 option. Try disabling it ("sleep-429": null) and see if that helps.
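
For anyone who wants to script that change, here is a minimal Python sketch that sets "sleep-429" to null under the Instagram extractor in an existing config file. The helper names `disable_sleep_429` and `patch_config_file` are hypothetical and not part of gallery-dl; the sketch only assumes the config has the "extractor" layout shown above.

```python
import json
from pathlib import Path


def disable_sleep_429(config: dict, extractor: str = "instagram") -> dict:
    """Set "sleep-429" to null for one extractor, creating keys as needed."""
    section = config.setdefault("extractor", {}).setdefault(extractor, {})
    section["sleep-429"] = None  # None is serialized as JSON null
    return config


def patch_config_file(path: str) -> None:
    """Hypothetical helper: rewrite a config.json in place."""
    file = Path(path)
    config = json.loads(file.read_text(encoding="utf-8"))
    disable_sleep_429(config)
    file.write_text(json.dumps(config, indent=4), encoding="utf-8")
```

Editing the file by hand works just as well; the only change is the extra `"sleep-429": null` entry inside the "instagram" block.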

I'd also be interested to know if downgrading to v1.26.9 helps or if it's just Instagram that is now employing even stricter bot detection.

alvvaysu commented 3 weeks ago

Yes, I am currently using v1.26.9 and I’ve hardly seen any errors with Instagram URLs. I’ve created a Python script, and the key download command is:

command = f'gallery-dl {filename_option} --write-metadata --retries 35 --write-log "{log_file_path}" --download-archive "{archive_file_path}" {directory_option} "{destination_path}" {url}'

One thing that may set my setup apart from many others: I use archive_file_path = os.path.join(destination_path, 'archive_file.dat'), so the archive is a “.dat” file. An AI suggested this to me earlier…
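
As a side note, the same command can be built as an argument list, which avoids quoting problems when paths or URLs contain spaces. This is only a sketch: `build_command` is a hypothetical helper, and it assumes `filename_option` is already split into separate tokens and that `directory_option` is the single flag used in the f-string above.

```python
import os
import subprocess


def build_command(url, destination_path, log_file_path,
                  directory_option, filename_option=()):
    """Build the gallery-dl invocation as a list of arguments."""
    # Same archive location as in the script described above.
    archive_file_path = os.path.join(destination_path, "archive_file.dat")
    return [
        "gallery-dl",
        *filename_option,            # assumed: pre-split tokens, may be empty
        "--write-metadata",
        "--retries", "35",
        "--write-log", log_file_path,
        "--download-archive", archive_file_path,
        directory_option, destination_path,
        url,
    ]


def run_download(*args, **kwargs):
    """Run the command without a shell and return the exit code."""
    return subprocess.run(build_command(*args, **kwargs), check=False).returncode
```

Passing a list to `subprocess.run` means no shell is involved, so the manual double quotes around the paths are no longer needed.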

alvvaysu commented 3 weeks ago

To be precise, the errors with version 1.27.0 seem to be more like:

[instagram][error] HttpError: '400 Bad Request' for 'https://www.instagram.com/api/v1/users/web_profile_info/?username=*someone'
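
To compare how often each status code shows up between versions, the log written by --write-log can be tallied. A rough Python sketch, assuming log lines contain the error text in the form shown above; `count_http_errors` is a hypothetical helper, not something gallery-dl provides.

```python
import re
from collections import Counter

# Matches e.g.: [instagram][error] HttpError: '400 Bad Request' for '...'
HTTP_ERROR = re.compile(
    r"\[(?P<extractor>[^\]]+)\]\[error\] HttpError: '(?P<status>\d{3}) [^']*'"
)


def count_http_errors(lines):
    """Count HttpError lines per (extractor, status code)."""
    counts = Counter()
    for line in lines:
        match = HTTP_ERROR.search(line)
        if match:
            counts[(match["extractor"], int(match["status"]))] += 1
    return counts
```

Running this over one log per version separates '400 Bad Request' from '429 Too Many Requests' instead of lumping them together.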

hashhar commented 4 days ago

I can confirm that this is an Instagram-side change. I've been running both 1.27.0 and 1.26.9 in parallel for some time, and both run into this issue with similar frequency. It is more noticeable on tagged posts than on any other URL, though.

alvvaysu commented 1 day ago

I'm not sure about the specific changes in gallery-dl. I can only access Instagram through a proxy in my area, but it seems that v1.27.1 is more prone to being blocked? That is, when downloading a certain picture from a certain person's page, it pauses for a long time.

hashhar commented 1 day ago

it pauses for a long time

That is expected, because it now retries rate-limit errors; earlier it just failed and did not retry at all. That doesn't mean the client is being blocked more often.
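
To illustrate the difference, here is a generic Python sketch of "sleep and retry on a rate-limit error". It is not gallery-dl's actual code: `RateLimited`, `fetch_with_retry`, and the 60-second default are all assumptions made up for the example.

```python
import time


class RateLimited(Exception):
    """Raised by the hypothetical fetch callable on an HTTP 429 response."""


def fetch_with_retry(fetch, retries=3, wait=60.0, sleep=time.sleep):
    """Call fetch(); on RateLimited, pause for `wait` seconds and try again.

    With retries=0 the first rate-limit error is raised immediately,
    which corresponds to the old fail-without-retrying behavior.
    """
    attempt = 0
    while True:
        try:
            return fetch()
        except RateLimited:
            if attempt >= retries:
                raise
            attempt += 1
            sleep(wait)  # this is the long pause seen from the outside
```

The number of rate-limit responses is the same in both modes; the retrying version simply waits them out instead of exiting with an error.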