ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

404 error with tweet video #32007

Open jcsouz10 opened 1 year ago

jcsouz10 commented 1 year ago

Checklist

Verbose log

 ERROR: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Description

I'm trying to download metadata for a tweet's video and I'm getting a 404 error, although it was working fine yesterday.

dirkf commented 1 year ago

Provide a verbose log. Read the FAQ section BUGS in the Manual if you're not sure how to generate one. Also, review other active issues for examples.

dirkf commented 1 year ago

Also, very often 404, even for some API URL rather than the page itself, is the site's way of telling us that the content is no longer available, so do check this again:

[x] I've checked that all provided URLs are alive and playable in a browser

PhobosK commented 1 year ago

I can confirm this problem...

But it seems to affect videos from accounts flagged as having explicit content. It applies per account: whether or not a given video is itself explicit, no videos from such an account can be downloaded. Passing a username and password with the youtube-dl -u command-line option makes no difference. Videos from accounts that are not flagged as explicit can be downloaded without a problem.

Not sure if the debug log helps, but here it is. BTW, I chose a not-so-explicit video (an animation) from an explicit account. Since I'm not sure how such links should be treated or reported here, I masked the link in the log and will post the actual user and status below it, so you can construct the URL:

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-v', 'https://twitter.com/XXXXXX/status/XXXXXXXXXXX']
[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.10 (CPython) - Linux-6.2.9-1-default-x86_64-with-glibc2.37
[debug] exe versions: avconv 12.3, avprobe 12.3, ffmpeg 4.4.3, ffprobe 4.4.3
[debug] Proxy map: {}
[twitter] XXXXXXXXXXX: Downloading guest token
[twitter] XXXXXXXXXXX: Downloading JSON metadata
ERROR: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/usr/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 634, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/usr/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 2288, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib64/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib64/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib64/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib64/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib64/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

The user is MommyLinaaa and the video is in status: 1644617705968791552

jcsouz10 commented 1 year ago

Provide a verbose log. Read the FAQ section BUGS in the Manual if you're not sure how to generate one. Also, review other active issues for examples.

Sorry for the delay in answering.

My verbose log is exactly the same as PhobosK's.

jcsouz10 commented 1 year ago

I can confirm this problem...

Hmm, great, I didn't know. Do you have any solution for this?

PhobosK commented 1 year ago

@jcsouz10 - Nope. No solution or workaround... I am not a developer here :) I guess @dirkf will check this and help fix it, now that he has more specific info on the issue :)

dirkf commented 1 year ago

Possibly Twitter is testing cookies against the UA or (worse) TLS fingerprint that was seen when the cookie was issued?

dirkf commented 1 year ago

I also received these comments:

setting a cookie with the --cookies cookies.txt option for an account with access to the content doesn't seem to help either.

cookies.txt:

twitter.com FALSE / TRUE 1696528758 auth_token [auth_token_value]
twitter.com FALSE / TRUE 1696528726 guest_id [guest_id_value]

And:

On line youtube_dl/YoutubeDL.py#2303 https://github.com/ytdl-org/youtube-dl/blob/213d1d91bfc4a00fefc72fa2730555d51060b42d/youtube_dl/YoutubeDL.py#L2303

url = req.get_full_url() is returning 'https://api.twitter.com/1.1/guest/activate.json' rather than the original request URL https://twitter.com/MommyLinaaa/status/1644617705968791552

In incognito, it seems the post requires age verification.

As mentioned, I also get 'https://api.twitter.com/1.1/guest/activate.json' with the --cookies cookies.txt option set, using an auth_token that logs me in, whether set manually from the dev console or imported from the cookies.txt file with the EditThisCookie extension (i.e. in the browser I can see the post and not the age-verification page). However, for some reason youtube-dl doesn't get past the age verification when using the --cookies cookies.txt option.

This explains that the problem occurs when yt-dl fetches a page that has some content flag and the page request redirects to some user verification page (age verification?). There is also some trickery where the authorisation token from that page may need some other headers set to be valid.
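
Since the cookies.txt quoted above is shown flattened, it may be worth ruling out a formatting problem before blaming the site. The following is a minimal sketch, not part of youtube-dl, that uses only Python's standard http.cookiejar module to write and re-read a Netscape-format cookie file. It assumes the file needs the "# Netscape HTTP Cookie File" header line and seven tab-separated fields per cookie, which is what MozillaCookieJar expects. The helper names and the placeholder values are hypothetical.

```python
import http.cookiejar
import os
import tempfile

# Header line that http.cookiejar.MozillaCookieJar looks for on the first line.
HEADER = "# Netscape HTTP Cookie File\n"

# Placeholder rows mirroring the layout quoted in the thread:
# domain, include-subdomains flag, path, secure, expiry, name, value.
ROWS = [
    ("twitter.com", "FALSE", "/", "TRUE", "1696528758", "auth_token", "AUTH_TOKEN_VALUE"),
    ("twitter.com", "FALSE", "/", "TRUE", "1696528726", "guest_id", "GUEST_ID_VALUE"),
]


def write_cookie_file(path, rows):
    """Write the header, then one tab-separated line per cookie."""
    with open(path, "w", encoding="utf-8") as fh:
        fh.write(HEADER)
        for row in rows:
            fh.write("\t".join(row) + "\n")


def load_cookie_names(path):
    """Return the sorted cookie names that the standard library can parse."""
    jar = http.cookiejar.MozillaCookieJar(path)
    # Keep expired and session cookies so that the check is about format only.
    jar.load(ignore_discard=True, ignore_expires=True)
    return sorted(cookie.name for cookie in jar)


def round_trip(rows):
    """Write rows to a temporary cookies.txt and read the names back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cookies.txt")
        write_cookie_file(path, rows)
        return load_cookie_names(path)
```

If load_cookie_names raises http.cookiejar.LoadError on a real file, the file itself is malformed. A file that loads cleanly only rules out formatting; it does not address the redirect to the verification page described in this comment.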

dirkf commented 1 year ago

Probably related: https://github.com/yt-dlp/yt-dlp/issues/5998

dirkf commented 1 year ago

Actually related: https://github.com/yt-dlp/yt-dlp/issues/6763

jasonm23 commented 1 year ago

Use option

--cookies-from-browser $BROWSER_NAME

should fix it

PhobosK commented 1 year ago

Use option

--cookies-from-browser $BROWSER_NAME

should fix it

@jasonm23, you are talking about the youtube-dl fork yt-dlp. There is no such option in the current stable version of youtube-dl, 2021.12.17, so I do not see how this is relevant here!

BTW @dirkf, as additional debugging info: since my account has 2FA enabled, I also tried with that, but the result is the same:

[debug] Command-line args: ['-v', '-u', 'PRIVATE', '-2', 'XXXXXX', 'https://twitter.com/XXXXXXXXXXX/status/XXXXX']
...............
...............
ERROR: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

Gokujo commented 1 year ago

I tried with cookies, but I still get the 404 error.

[debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8
[debug] youtube-dl version 2021.12.17
[debug] Python version 3.10.6 (CPython) - Linux-6.2.6-76060206-generic-x86_64-with-glibc2.35
[debug] exe versions: ffmpeg 4.4.2, ffprobe 4.4.2
[debug] Proxy map: {}
[twitter] 1614746772974501896: Downloading guest token
[twitter] 1614746772974501896: Downloading JSON metadata
ERROR: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
  File "/home/maximharder/Development/Private/Python/env/botPosterEnv310/lib/python3.10/site-packages/youtube_dl/extractor/common.py", line 634, in _request_webpage
    return self._downloader.urlopen(url_or_request)
  File "/home/maximharder/Development/Private/Python/env/botPosterEnv310/lib/python3.10/site-packages/youtube_dl/YoutubeDL.py", line 2288, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
  File "/usr/lib/python3.10/urllib/request.py", line 525, in open
    response = meth(req, response)
  File "/usr/lib/python3.10/urllib/request.py", line 634, in http_response
    response = self.parent.error(
  File "/usr/lib/python3.10/urllib/request.py", line 563, in error
    return self._call_chain(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 496, in _call_chain
    result = func(*args)
  File "/usr/lib/python3.10/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)

These are the options that I am giving to the downloader:

ydl_opts = {
    'format': 'bestaudio/best',
    'outtmpl': file_path,
    'noplaylist': True,
    'cookies': TWITTER_COOKIES,
    'verbose': True,
    'continue': True,
    'dump-user-agent': True,
}

Maybe I am doing something wrong?
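
One thing that stands out in the options above: when youtube-dl is embedded in Python, the option keys are not always the command-line flag names. As far as the YoutubeDL documentation goes, the cookie file is passed as 'cookiefile' (the CLI's --cookies) and resuming is 'continuedl' (the CLI's --continue), while --dump-user-agent is a command-line-only action with no matching key. A minimal sketch under those assumptions follows; build_ydl_opts and download are hypothetical helper names.

```python
def build_ydl_opts(file_path, cookie_file):
    """Build a YoutubeDL options dict using the embedding API's key names."""
    return {
        'format': 'bestaudio/best',
        'outtmpl': file_path,
        'noplaylist': True,
        # Embedding key for the CLI's --cookies FILE option.
        'cookiefile': cookie_file,
        'verbose': True,
        # Embedding key for the CLI's --continue option.
        'continuedl': True,
    }


def download(url, file_path, cookie_file):
    """Run a download with the options above (requires youtube_dl)."""
    # Imported here so that build_ydl_opts can be used without the package.
    import youtube_dl

    with youtube_dl.YoutubeDL(build_ydl_opts(file_path, cookie_file)) as ydl:
        return ydl.download([url])
```

Unrecognised keys such as 'cookies' appear to be ignored silently, so the original dict most likely ran without any cookies at all. Going by the earlier comments in this thread, supplying cookies correctly may still not get past the 404 for age-restricted posts.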

Blood-Turd commented 1 year ago

Use option

--cookies-from-browser $BROWSER_NAME

should fix it

Thanks, this fixed it for me.

jfernandz commented 1 year ago

I cannot use --cookies-from-browser because my youtube-dl version reports: no such option

jasonm23 commented 1 year ago

Update to the latest yt-dlp
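
For anyone calling the fork from Python rather than the command line, the --cookies-from-browser option mentioned above corresponds, as far as the yt-dlp documentation goes, to a 'cookiesfrombrowser' key whose value is a tuple starting with the browser name. A minimal sketch, assuming yt-dlp is installed and that 'firefox' is the browser holding the logged-in session; the helper names are hypothetical.

```python
def build_yt_dlp_opts(browser_name):
    """Options for yt_dlp.YoutubeDL that read cookies from a local browser."""
    return {
        # Tuple form: (browser name, optional profile, keyring, container).
        'cookiesfrombrowser': (browser_name,),
        'verbose': True,
    }


def download_with_browser_cookies(url, browser_name='firefox'):
    """Download url using cookies from the given browser (requires yt_dlp)."""
    # Imported here so that build_yt_dlp_opts can be used without the package.
    import yt_dlp

    with yt_dlp.YoutubeDL(build_yt_dlp_opts(browser_name)) as ydl:
        return ydl.download([url])
```

This applies to yt-dlp only; as noted earlier in the thread, youtube-dl 2021.12.17 has no such option.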