ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

youtube.com 404 error when downloading subtitle. #28802

Open ballsystemlord opened 3 years ago

ballsystemlord commented 3 years ago

Checklist

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--socket-timeout', '60', '-i', '--no-call-home', '--restrict-filenames', '--no-playlist', '--hls-prefer-native', '-f', '22,20,18', '--write-auto-sub', '--write-description', '--user-agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0', '--cookies', '/mnt/md7/to-go-through/cookies.txt', '--verbose', 'https://www.youtube.com/watch?v=rXUafEqpTvU']
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2021.04.17
[debug] Python version 3.5.3 (CPython) - Linux-5.11.12-nopreempt-Radeon-SI-dav11-x86_64-with-debian-9
[debug] exe versions: ffmpeg 3.2.15-0, ffprobe 3.2.15-0, rtmpdump 2.4
[debug] Proxy map: {}
[youtube] rXUafEqpTvU: Downloading webpage
[youtube] Downloading just video rXUafEqpTvU because of --no-playlist
[youtube] rXUafEqpTvU: Downloading MPD manifest
[info] rXUafEqpTvU: downloading video in 2 formats
[info] Writing video description to: You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.description
[info] Writing video subtitles to: You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.en.vtt
WARNING: Unable to download subtitle for "en": Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[debug] Invoking downloader on 'https://r3---sn-5uaeznde.googlevideo.com/videoplayback?expire=1618867984&ei=sKJ9YP3qMI7tgQfstowQ&ip=71.208.112.225&id=o-AFl_MhTlGyV7aKDdCVsLLX4MtjtzTXZAFQp5-jDWKCsm&itag=22&source=youtube&requiressl=yes&mh=LK&mm=31&mn=sn-5uaeznde&ms=au&mv=m&mvi=3&pl=18&initcwndbps=908750&vprv=1&svpuc=1&mime=video%2Fmp4&ns=dYxIDrLEgjg5HrRo3XSeQ1cF&ratebypass=yes&dur=2779.126&lmt=1482936511576053&mt=1618846144&fexp=24001373%2C24007246&c=WEB&n=Fvf51wF5Xdb9rj&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Csvpuc%2Cmime%2Cns%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRQIgcLLQN634M-ZXYDApnzMpf7hbXSnshIC-O7lhZZ78BFQCIQDw3jL9svdaWEU8WICZ6_E1RWmvG2viv73IDS-uiQyJXA%3D%3D&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAImcGXI7MTFPqyNdXpKcdZfO-Ae8vtiJO8xxr5wuNkf_AiEA34iaOv0x98aqAtCPpnsAfNcMb1VK5gDk3jGocxKrXtM%3D'
[download] You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.mp4 has already been downloaded

[download] 100% of 289.24MiB
[info] Writing video description to: You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.description
[info] Writing video subtitles to: You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.en.vtt
WARNING: Unable to download subtitle for "en": Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[debug] Invoking downloader on 'https://r3---sn-5uaeznde.googlevideo.com/videoplayback?expire=1618867984&ei=sKJ9YP3qMI7tgQfstowQ&ip=71.208.112.225&id=o-AFl_MhTlGyV7aKDdCVsLLX4MtjtzTXZAFQp5-jDWKCsm&itag=18&source=youtube&requiressl=yes&mh=LK&mm=31&mn=sn-5uaeznde&ms=au&mv=m&mvi=3&pl=18&initcwndbps=908750&vprv=1&svpuc=1&mime=video%2Fmp4&ns=dYxIDrLEgjg5HrRo3XSeQ1cF&gir=yes&clen=97952227&ratebypass=yes&dur=2779.126&lmt=1482936705634151&mt=1618846144&fexp=24001373%2C24007246&c=WEB&n=Fvf51wF5Xdb9rj&sparams=expire%2Cei%2Cip%2Cid%2Citag%2Csource%2Crequiressl%2Cvprv%2Csvpuc%2Cmime%2Cns%2Cgir%2Cclen%2Cratebypass%2Cdur%2Clmt&sig=AOq0QJ8wRAIgQ-LjNbYDq8p0g02nikvayQO3uO9yuPhYqzi4B17t_H4CIEPDzev4wGA3BSrAocUFXcxsabuXu-b7iPpduybml_aq&lsparams=mh%2Cmm%2Cmn%2Cms%2Cmv%2Cmvi%2Cpl%2Cinitcwndbps&lsig=AG3C_xAwRgIhAImcGXI7MTFPqyNdXpKcdZfO-Ae8vtiJO8xxr5wuNkf_AiEA34iaOv0x98aqAtCPpnsAfNcMb1VK5gDk3jGocxKrXtM%3D'
[download] You_can_-j_REJECT_but_you_can_not_hide_-_Global_scanning_of_the_IPv6_Internet_33c3_-_deutsche_Ubers-rXUafEqpTvU.mp4 has already been downloaded

[download] 100% of 289.24MiB

Description

I waited a few weeks thinking that this would be easily noticed. So far it has not been fixed or reported AFAIK.

I download a youtube video and youtube-dl's request for the subtitle gets a 404 response. I expect that youtube-dl, when it finds a subtitle, will download it. This only happens with some videos on youtube.com. I could probably find more URLs that trigger this if you need them.

Thanks!

gunchev commented 3 years ago

You can try https://github.com/mr700/youtube-dl/commit/ff974a59e07e5b3a01f4c4bbc4f229e7107b8bbf; I am still working out how to properly submit a pull request. Basically, it skips missing subtitles. It looks like the auto-translated ones no longer download.

For https://www.youtube.com/watch?v=GXy__kBVq1M only 11 download successfully; the rest are visible on YouTube (e.g. Bulgarian), but they are in the auto-translated submenu.

Of course if you want the auto-translated subtitles this won't work for you...

ballsystemlord commented 3 years ago

Based on what I understand about youtube, unless someone intentionally translates a video, auto-translated subtitles are all that I can possibly hope to receive. I'm not trying to embed anything. External subtitles are just fine for me. The only difference your diff makes is that ytdl no longer throws a warning. I can deal with an excess warning message.

I just wanted to be able to get the subtitles, if there are any to be gotten. For all I know, youtube.com changed how the request is made to retrieve subs. Or, they changed the processing steps necessary to get the correct URL.

jefferson018200306 commented 3 years ago

There's no error because I have successfully downloaded your auto-subtitle

youtube-dl "https://www.youtube.com/watch?v=rXUafEqpTvU" --write-auto-sub --sub-lang en --skip-download

[info] Writing video subtitles to: You can -j REJECT but you can not hide - Global scanning of the IPv6 Internet (33c3) - deutsche Übers-rXUafEqpTvU.en.vtt

gunchev commented 3 years ago

If the subtitles are being embedded, then the whole operation fails (it attempts to embed the missing file and raises an exception). The patch makes it only print a warning and remove the subtitle file from the list for embedding later, which makes the behavior similar to a download without embedding.
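To illustrate the idea (this is not the actual patch, just a minimal sketch of the same approach): a subtitle that fails to download produces a warning and is left out of the result, so a later embedding step only sees files that exist. The names `download_available_subtitles` and `fake_fetch` and the `example.invalid` URLs are hypothetical and are not part of youtube-dl.

```python
from urllib.error import HTTPError


def download_available_subtitles(requested, fetch, warn=print):
    """Fetch each requested subtitle and keep only the ones that succeed.

    requested: dict mapping language code -> subtitle URL
    fetch:     callable(url) -> bytes (hypothetical stand-in for the real
               HTTP request)
    warn:      callable(str) used to report failures
    """
    downloaded = {}
    for lang, url in requested.items():
        try:
            downloaded[lang] = fetch(url)
        except HTTPError as err:
            # Warn and skip: the language never reaches the embedding step.
            warn('WARNING: Unable to download subtitle for "%s": %s' % (lang, err))
    return downloaded


def fake_fetch(url):
    """Stand-in fetcher (assumption): any URL ending in 'missing' returns 404."""
    if url.endswith("missing"):
        raise HTTPError(url, 404, "Not Found", None, None)
    return b"WEBVTT\n"


subs = download_available_subtitles(
    {"en": "https://example.invalid/en", "bg": "https://example.invalid/missing"},
    fake_fetch,
)
print(sorted(subs))  # only the languages that actually downloaded
```

Anything that later embeds subtitles would iterate over `subs` rather than over the originally requested languages.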

Maybe something changed the way one has to download the auto-translated subtitles, it used to work no more than a year ago. The problem initially was with Vietnamese subtitles only IIRC, then started "spreading" to more languages...

ballsystemlord commented 3 years ago

There's no error because I have successfully downloaded your auto-subtitle

youtube-dl "https://www.youtube.com/watch?v=rXUafEqpTvU" --write-auto-sub --sub-lang en --skip-download

[info] Writing video subtitles to: You can -j REJECT but you can not hide - Global scanning of the IPv6 Internet (33c3) - deutsche Übers-rXUafEqpTvU.en.vtt

I can confirm that your method works correctly. My command, which doesn't request embedding AFAIK, still does not work.

ballsystemlord commented 3 years ago

I have not upgraded youtube-dl, python, or anything else. However, I can no longer reproduce this error with the URL I placed above. This could be a problem on youtube's end of the connection. Therefore, it is important to have youtube-dl adjust to this error and fetch the subtitle via the method I verified to work above (Note: At that time I was still able to reproduce what I originally reported.). https://github.com/ytdl-org/youtube-dl/issues/28802#issuecomment-823374921

ballsystemlord commented 3 years ago

Update: This is really strange. The last version of yt-dl had a problem even using the above method https://github.com/ytdl-org/youtube-dl/issues/28802#issuecomment-823374921 and then it worked with the latest version of yt-dl. But, there are other URLs that don't work with the latest version of yt-dl.

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--socket-timeout', '60', '-i', '--no-call-home', '--restrict-filenames', '--no-playlist', '--hls-prefer-native', '-f', '22,20,18', '--write-auto-sub', '--user-agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0', '--cookies', '/mnt/md7/to-go-through/cookies.txt', '--reject-title', 'traduction.fran.+aise|La.traducci.+n.espa.+ola|deutsche.+bersetzung', '--write-auto-sub', '--sub-lang', 'en', '--skip-download', '--verbose', 'https://www.youtube.com/watch?v=qYHwdPLamOs']
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2021.04.26
[debug] Python version 3.5.3 (CPython) - Linux-5.11.17-nopreempt-Radeon-SI-dav12-x86_64-with-debian-9
[debug] exe versions: ffmpeg 3.2.15-0, ffprobe 3.2.15-0, rtmpdump 2.4
[debug] Proxy map: {}
[youtube] qYHwdPLamOs: Downloading webpage
[youtube] Downloading just video qYHwdPLamOs because of --no-playlist
[info] qYHwdPLamOs: downloading video in 2 formats
[info] Writing video subtitles to: 35C3_-_Wallet_Security-qYHwdPLamOs.en.vtt
WARNING: Unable to download subtitle for "en": Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[info] Writing video subtitles to: 35C3_-_Wallet_Security-qYHwdPLamOs.en.vtt
WARNING: Unable to download subtitle for "en": Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.

I have no idea what is going on. It's strange that youtube would return 404 for a URL (the one in the original post). Then it works later. Then I try some others and some of them fail. So I test with the latest version and then they succeed. But still, other subtitle URLs fail.

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['--socket-timeout', '60', '-i', '--no-call-home', '--restrict-filenames', '--no-playlist', '--hls-prefer-native', '-f', '22,20,18', '--write-auto-sub', '--user-agent', 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:56.0) Gecko/20100101 Firefox/56.0', '--cookies', '/mnt/md7/to-go-through/cookies.txt', '--reject-title', 'traduction.fran.+aise|La.traducci.+n.espa.+ola|deutsche.+bersetzung', '--write-auto-sub', '--sub-lang', 'en', '--skip-download', '--verbose', 'https://www.youtube.com/watch?v=qYHwdPLamOs']
[debug] Encodings: locale ANSI_X3.4-1968, fs ascii, out ANSI_X3.4-1968, pref ANSI_X3.4-1968
[debug] youtube-dl version 2021.04.26
[debug] Python version 3.5.3 (CPython) - Linux-5.11.17-nopreempt-Radeon-SI-dav12-x86_64-with-debian-9
[debug] exe versions: ffmpeg 3.2.15-0, ffprobe 3.2.15-0, rtmpdump 2.4
[debug] Proxy map: {}
[youtube] qYHwdPLamOs: Downloading webpage
[youtube] Downloading just video qYHwdPLamOs because of --no-playlist
[info] qYHwdPLamOs: downloading video in 2 formats
[info] Writing video subtitles to: 35C3_-_Wallet_Security-qYHwdPLamOs.en.vtt
WARNING: Unable to download subtitle for "en": Unable to download webpage: HTTP Error 404: Not Found (caused by <HTTPError 404: 'Not Found'>); please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see  https://yt-dl.org/update  on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.
[info] Writing video subtitles to: 35C3_-_Wallet_Security-qYHwdPLamOs.en.vtt

Now, getting one subtitle of the above 2 succeeds. What is going on here?

ballsystemlord commented 3 years ago

Ok, this looks like it is some sort of time delay type of problem.
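If the 404s really are transient, which is only a guess based on the behavior above, one workaround would be to retry the subtitle request after a short pause. The following is a minimal sketch of that idea and not youtube-dl code: `fetch_with_retry`, its parameters, and the `fetch` callable are all hypothetical.

```python
import time
from urllib.error import HTTPError


def fetch_with_retry(fetch, url, attempts=3, delay=2.0, sleep=time.sleep):
    """Call fetch(url), retrying on HTTP 404 with a growing pause.

    fetch:    callable(url) -> bytes (hypothetical stand-in for the request)
    attempts: total number of tries before giving up
    delay:    base pause in seconds; try N waits delay * N before the next try
    sleep:    injectable so the pause can be skipped in tests
    """
    for attempt in range(1, attempts + 1):
        try:
            return fetch(url)
        except HTTPError as err:
            # Only a 404 is treated as possibly transient here (assumption);
            # any other error, or the final failed try, is re-raised.
            if err.code != 404 or attempt == attempts:
                raise
            sleep(delay * attempt)
```

With the defaults, a subtitle URL that fails twice and then succeeds would be fetched on the third try after pauses of 2 and 4 seconds.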