ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

instagram: HTTP Error 403: Forbidden #16119

Closed Vrihub closed 6 years ago

Vrihub commented 6 years ago

Make sure you are using the latest version: run youtube-dl --version and ensure your version is 2018.04.03. If it's not, read this FAQ entry and update. Issues with outdated version will be rejected.

If the purpose of this issue is a bug report or a site support request, or you are not completely sure, provide the full verbose output as follows:

Add the -v flag to the command line you run youtube-dl with (youtube-dl -v <your command line>), copy the whole output and insert it here. It should look similar to the one below (replace it with your log inserted between triple ```):

```
...
[instagram:user] instagram: Downloading JSON metadata
[instagram:user] Saving request to instagram_https_-_www.instagram.com_instagram_a=1.dump
[download] Downloading playlist: instagram
[instagram:user] 25025320: Downloading JSON page 1
ERROR: Unable to download JSON metadata: HTTP Error 403: Forbidden (caused by <HTTPError 403: 'Forbidden'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/extractor/common.py", line 519, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/local/lib/python3.5/dist-packages/youtube_dl/YoutubeDL.py", line 2199, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.5/urllib/request.py", line 472, in open
    response = meth(req, response)
  File "/usr/lib/python3.5/urllib/request.py", line 582, in http_response
    'http', request, response, code, msg, hdrs)
  File "/usr/lib/python3.5/urllib/request.py", line 510, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 444, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.5/urllib/request.py", line 590, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
```

Description of your issue, suggested solution and other information

As of today, queries to https://instagram.com/graphql/query/... get a 403 error. Maybe related to recent restrictions in other parts of the instagram API? (https://www.instagram.com/developer/changelog/)
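For anyone trying to narrow this down, here is a minimal sketch of how the `HTTPError` at the bottom of the traceback could be inspected with the standard library alone. The helper name `summarize_http_error` and the `example.invalid` URL are made up for illustration; this is not code from youtube-dl, and the error object is built by hand so that no request is sent.

```python
import io
import urllib.error
from email.message import Message


def summarize_http_error(err):
    """Collect the fields of an HTTPError that are useful in a bug report."""
    # HTTPError doubles as a response object, so the body can be read from it.
    body = err.read() if err.fp is not None else b""
    return {
        "url": err.url,
        "code": err.code,
        "reason": err.reason,
        "headers": dict(err.headers.items()),
        "body_preview": body[:200].decode("utf-8", "replace"),
    }


if __name__ == "__main__":
    # Hand-built error (placeholder URL); a real one would be caught with
    # `except urllib.error.HTTPError as err:` around an opener.open() call.
    fake = urllib.error.HTTPError(
        "https://example.invalid/query", 403, "Forbidden", Message(), io.BytesIO(b"")
    )
    print(fake)  # str() is formatted as "HTTP Error <code>: <msg>"
    print(summarize_http_error(fake))
```

The response headers and body sometimes carry a hint about why a request was refused, which is why the sketch keeps them next to the status code.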

Hrxn commented 6 years ago

Damn... that API platform changelog..

I guess that's Facebook post Cambridge Analytica. Company policy changes went into full effect.

They now seem to want to avoid any possibility of scraping like the literal plague. A bit ironic, considering that this is all still public data (this has not changed for private profiles). The world has officially gone crazy, but I guess most could already tell...

Hrxn commented 6 years ago

FYI, seems like this is not working again...

Vrihub commented 6 years ago

> FYI, seems like this is not working again...

Yes, the cookie solution in ff826177 worked only for a couple of days. It seems they have now refined the requirements for the cookies; see the comments in https://github.com/rarcega/instagram-scraper/issues/205 and https://github.com/althonos/InstaLooter/issues/157#issuecomment-379468181

Vrihub commented 6 years ago

Maybe these pages have useful info about the required cookies:

https://kaijento.github.io/2017/05/17/web-scraping-instagram.com/ (search for "csrftoken Cookie")

https://github.com/siongui/instago#obtain-cookies
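As a rough illustration of the cookie handling these pages discuss, the sketch below keeps cookies in a jar across requests and looks one up by name, using only the standard library. The helper names `make_cookie_opener` and `get_cookie` are hypothetical. The cookie name `csrftoken` is taken from the search hint above. Which cookies or headers the site actually requires is not stated in this thread, so treat that part as an open assumption.

```python
import http.cookiejar
import urllib.request


def make_cookie_opener():
    """Return (opener, jar); the opener stores and resends cookies automatically."""
    jar = http.cookiejar.CookieJar()
    opener = urllib.request.build_opener(urllib.request.HTTPCookieProcessor(jar))
    return opener, jar


def get_cookie(jar, name):
    """Return the value of the first cookie called `name`, or None if absent."""
    for cookie in jar:
        if cookie.name == name:
            return cookie.value
    return None


# Possible usage (needs network access, so it is left commented out):
# opener, jar = make_cookie_opener()
# opener.open("https://www.instagram.com/")  # first request fills the jar
# token = get_cookie(jar, "csrftoken")       # None if the site did not set it
```

Reusing the same opener for later requests sends the stored cookies back automatically, which seems to be the general idea behind a cookie-based workaround.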

HTH

LaticePrime commented 6 years ago

Is this open again? Having the same problem after updating to 2018.04.09.

Vrihub commented 6 years ago

Unfortunately, the fix in 315ab3d worked for a couple of days, and now it's HTTP Error 403: Forbidden again :(

Vrihub commented 6 years ago

This is getting silly, 403 again. Here is a suggested fix (tested ok): https://github.com/rarcega/instagram-scraper/issues/205#issuecomment-381866498

Vrihub commented 6 years ago

Thanks again for the fix. As usual, it worked for a while, and now... HTTP Error 400: Bad Request (for a change)