ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

instagram: --sleep-interval doesn't work when downloading playlists #15771

Closed: Vrihub closed this issue 6 years ago

Vrihub commented 6 years ago

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.03.03. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

Description of your issue, suggested solution and other information

The --sleep-interval option has no effect when downloading a playlist with the instagram:user extractor. I often get an HTTP 429 error (Too Many Requests) from Instagram while the playlist is being extracted. Maybe related to #4924?

Vrihub commented 6 years ago

If you could explain what makes this report "incomplete" I'll be glad to add more details.

dstftw commented 6 years ago

The issue template explains what makes it incomplete.

Vrihub commented 6 years ago

Sorry, but I still don't understand. I checked all the relevant boxes, and I left out the verbose output because it adds nothing to this issue: the --sleep-interval machinery is applied only while downloading video files, not while downloading JSON data (as the instagram:user extractor does). Rapid consecutive requests for a user's JSON data can therefore trigger the server's HTTP 429 error (Too Many Requests). I tested hard-wiring a time.sleep(1) into the extractor, and that seems to fix the issue.
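
To illustrate the workaround, here is a minimal, self-contained sketch of pausing between paginated JSON requests. It is not the actual extractor code: `fetch_all_pages`, `fetch_page`, and the cursor-based paging are hypothetical names and assumptions chosen for the example.

```python
import time


def fetch_all_pages(fetch_page, pause=1.0, sleep=time.sleep):
    """Collect items from a paginated source, pausing between requests.

    fetch_page(cursor) is a hypothetical callable that returns a tuple
    (items, next_cursor); next_cursor is None on the last page.
    """
    items = []
    cursor = None
    first_request = True
    while True:
        if not first_request:
            # Throttle every request after the first one, so the server
            # does not see a burst of back-to-back calls.
            sleep(pause)
        first_request = False
        page_items, cursor = fetch_page(cursor)
        items.extend(page_items)
        if cursor is None:
            break
    return items
```

The `sleep` parameter exists only so the pause can be replaced in tests; in normal use the default `time.sleep` applies.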

So I took a closer look at the code. I think the logic that implements min_sleep_interval in the download() method of the FileDownloader class in downloader/common.py should be ported to _download_webpage() in extractor/common.py, so that it also applies to HTTP requests made to download JSON/webpage data. It should probably be refactored into a shared method to avoid code duplication, right?
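
A rough sketch of what such a shared helper might look like follows. It is only an illustration of the idea, not the code in downloader/common.py: the function name `sleep_before_request` is hypothetical, and the `sleep_interval` / `max_sleep_interval` keys are assumed to mirror the corresponding command-line options.

```python
import random
import time


def sleep_before_request(params, sleep=time.sleep):
    """Pause for a random time between the configured bounds.

    params is a dict of options; the key names used here are assumptions.
    Returns the number of seconds slept (0.0 if no interval is set).
    """
    min_interval = params.get('sleep_interval')
    if not min_interval:
        return 0.0  # option not set: do not pause
    # Fall back to a fixed pause when no upper bound is given.
    max_interval = params.get('max_sleep_interval') or min_interval
    interval = random.uniform(min_interval, max_interval)
    sleep(interval)
    return interval
```

Both the file downloader and the webpage/JSON download path could then call the same helper before issuing a request, which would avoid duplicating the interval logic.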

If you agree this is the way to go, I can try to come up with a PR for this. (suggestions welcome!)

Vrihub commented 6 years ago

For the record, the changes made in 27b1c73 to reflect the new Instagram API also fix this issue, since the instagram:user extractor now downloads all the relevant JSON data in a single request. Well done!