instagram: --sleep-interval doesn't work when downloading playlists

Vrihub commented 6 years ago

Make sure you are using the latest version: run `youtube-dl --version` and ensure your version is 2018.03.03. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

[x] I've verified and I assure that I'm running youtube-dl 2018.03.03

Before submitting an issue make sure you have:

[x] At least skimmed through the README, most notably the FAQ and BUGS sections
[x] Searched the bugtracker for similar issues including closed ones
[x] Checked that provided video/audio/playlist URLs (if any) are alive and playable in a browser

What is the purpose of your issue?

[x] Bug report (encountered problems with youtube-dl)
[ ] Site support request (request for adding support for a new site)
[ ] Feature request (request for a new functionality)
[ ] Question
[ ] Other

Description of your issue, suggested solution and other information

The --sleep-interval option doesn't work when downloading a playlist using the instagram:user extractor. I often get the HTTP 429 error (too many requests) from Instagram in this phase. Maybe related to #4924?

Vrihub commented 6 years ago

If you could explain what makes this report "incomplete" I'll be glad to add more details.

dstftw commented 6 years ago

Issue template explains what makes it incomplete.

Vrihub commented 6 years ago

Sorry, but I still don't understand. I've checked all the relevant boxes, and I didn't include any verbose output because it wouldn't help with this issue. The --sleep-interval machinery is only used while downloading video files, not while downloading JSON data (as the instagram:user extractor does), so rapid repeated requests for a user's JSON data can trigger the server's HTTP 429 error (too many requests). I've tested that hard-wiring a time.sleep(1) into the extractor seems to fix the issue.

So I had a better look at the code: I think the code that implements min_sleep_interval in the download() method of the FileDownloader class in downloader/common.py should be ported to _download_webpage() in extractor/common.py, so that it also applies to HTTP requests made to download JSON/webpage data. I guess it should be refactored into a shared method to avoid code duplication, right?

If you agree this is the way to go, I can try to come up with a PR for this. (suggestions welcome!)
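
To make the proposal above concrete, here is a minimal standalone sketch of what a shared throttling helper might look like. It is not youtube-dl's actual code: the function names `sleep_before_request` and `fetch_pages` are hypothetical, `params` is a plain dict standing in for the options dictionary, and the key names `sleep_interval` and `max_sleep_interval` are assumptions about how the options are stored.

```python
import random
import time


def sleep_before_request(params, sleep=time.sleep):
    """Pause for a random interval drawn from the sleep options.

    Hypothetical helper: `params` is a plain dict, and the keys
    'sleep_interval' and 'max_sleep_interval' are assumed names.
    Returns the number of seconds slept (0 if throttling is disabled).
    """
    min_interval = params.get('sleep_interval')
    if not min_interval:
        return 0
    # Fall back to a fixed delay when no upper bound is given.
    max_interval = params.get('max_sleep_interval') or min_interval
    interval = random.uniform(min_interval, max_interval)
    sleep(interval)
    return interval


def fetch_pages(urls, fetch, params, sleep=time.sleep):
    """Fetch each URL in turn, throttling before every request but the first.

    `fetch` is any callable that takes a URL and returns its data; it
    stands in for the extractor's page/JSON download call.
    """
    results = []
    for index, url in enumerate(urls):
        if index > 0:
            sleep_before_request(params, sleep)
        results.append(fetch(url))
    return results
```

Passing `sleep` as a parameter keeps the helper testable without real delays. Both the file downloader and the extractor's page download path could call the same function, which is the deduplication suggested above.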

Vrihub commented 6 years ago

For the record, the changes made in 27b1c73 to reflect the new Instagram API also fix this issue, since the instagram:user extractor now downloads all the relevant JSON data in a single request. Well done!
