ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Issue: Broken Site: TVP (extractor partially working) #22160

Closed the-researcher closed 1 year ago

the-researcher commented 5 years ago

Verbose log

[debug] System config: [u'-o', u'/mnt/external/youtube-dl/%(title)s.%(ext)s']
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: [u'-v', u'--format', u'bestvideo+bestaudio/best', u'--merge-output-format', u'mp4', u'https://vod.tvp.pl/video/teleexpress,15082019-1700,43664673']
[debug] Encodings: locale UTF-8, fs UTF-8, out UTF-8, pref UTF-8
[debug] youtube-dl version 2019.08.13
[debug] Python version 2.7.13 (CPython) - Linux-4.14.66+-armv7l-with-debian-9.9
[debug] exe versions: ffmpeg 3.2.12-1, ffprobe 3.2.12-1
[debug] Proxy map: {}
[tvp] 43664673: Downloading webpage
[tvp:embed] 43664673: Downloading webpage
ERROR: tvp:embed said: Transmisja została zakończona lub materiał niedostępny
Traceback (most recent call last):
  File "/usr/local/bin/youtube-dl/youtube_dl/YoutubeDL.py", line 796, in extract_info
    ie_result = ie.extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/common.py", line 530, in extract
    ie_result = self._real_extract(url)
  File "/usr/local/bin/youtube-dl/youtube_dl/extractor/tvp.py", line 129, in _real_extract
    self.IE_NAME, clean_html(error)), expected=True)
ExtractorError: tvp:embed said: Transmisja zosta\u0142a zako\u0144czona lub materia\u0142 niedost\u0119pny

Description

In previous versions of youtube-dl, I was able to download episodes of Teleexpress from TVP. With the updated version, I get an error that translates into English as "The transmission has ended or the material is unavailable."

I can confirm that I can see TVP VOD content through my VPN on my computer. However, when I try to download the content with youtube-dl, I am getting an error.

For what it's worth, I can use the m3u8x extractor to pull the content to my machine, and manually remux to an mp4 format... but youtube-dl was wonderful and easier to use.

Please, please, please fix!

supercoil commented 4 years ago

As far as I can tell, there's a problem with the highest quality audio/video that youtube-dl tries to download automatically when no further options are given. Here's a sample output from a TVP programme with youtube-dl -F (I've removed a few formats for clarity):

[Info] Available formats for 44938430:
format code     extension  resolution  note
mss-audio-175   isma       audio only  175k , AACL (44100Hz)
[...]
mss-video-3199  ismv       1280x720    3199k , H264, video only
mss-video-4489  ismv       1920x1080   4489k , H264, video only
hds-570         flv        398x224     570k
hls-570         mp4        398x224     570k , avc1.64000d, 25.0fps, mp4a.40.2
http-570        mp4        398x224     570k , avc1.64000d, 25.0fps, mp4a.40.2
[...]
hds-4663        flv        1920x1080   4663k
hls-4663        mp4        1920x1080   4663k , avc1.640028, 25.0fps, mp4a.40.2 (best)

youtube-dl tries to download the .isma audio and the highest-quality .ismv video and then merge them, but these downloads fail with the following error:

[download] Got server HTTP error: HTTP Error 400: Bad Request. Retrying fragment 1 (attempt 1 of 10)...

I get the same error when I select these formats manually with youtube-dl -f. When I select the bottom (best) format with youtube-dl -f hls-4663, it works fine.
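For anyone scripting around this, here is a minimal sketch of the selection described above: skip the mss-* entries and take the highest-bitrate hls-* one. The function name and the sample list are hypothetical; the list mirrors the format codes and bitrates from the listing above, using the `format_id` and `tbr` keys that youtube-dl format dictionaries carry.

```python
def pick_hls_format(formats):
    """Return the format_id of the highest-bitrate HLS entry, or None."""
    # Keep only entries whose format code starts with "hls-".
    hls = [f for f in formats if f.get("format_id", "").startswith("hls-")]
    if not hls:
        return None
    # "tbr" is the total bitrate in kbit/s; treat a missing value as 0.
    return max(hls, key=lambda f: f.get("tbr") or 0)["format_id"]


# Hypothetical sample mirroring the format codes listed above.
SAMPLE_FORMATS = [
    {"format_id": "mss-audio-175", "tbr": 175},
    {"format_id": "mss-video-3199", "tbr": 3199},
    {"format_id": "mss-video-4489", "tbr": 4489},
    {"format_id": "hds-570", "tbr": 570},
    {"format_id": "hls-570", "tbr": 570},
    {"format_id": "http-570", "tbr": 570},
    {"format_id": "hds-4663", "tbr": 4663},
    {"format_id": "hls-4663", "tbr": 4663},
]

print(pick_hls_format(SAMPLE_FORMATS))
```

The returned code can then be passed to youtube-dl -f, as in the working command above.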

supercoil commented 4 years ago

Works again with version 2019.11.28

supercoil commented 4 years ago

Well, I thought it worked, but it's not 100%. I tried a few episodes, including ones that didn't work earlier, and they were OK. But then a few other episodes gave the same error I mentioned before.

the-researcher commented 4 years ago

I'm not sure how long this stopgap will hold, but instead of simply providing the URL, I have had better success forcing which format to pick.

So, this doesn't work: youtube-dl https://vod.tvp.pl/sess/player/video/45398537

(I'm taking the video ID (in this case, 45398537) and appending it to the base /sess/player/video/ URL used by the site.)

But this works: youtube-dl https://vod.tvp.pl/sess/player/video/45398537 -f 'hls-4749'
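The URL rewrite described above can be sketched as a small helper. This is only an illustration: the function name is hypothetical, and it assumes the numeric video ID is always the trailing run of digits in the page URL (true for the URLs in this thread, but not verified for every TVP page). URLs with query strings are not handled.

```python
import re

# Base of the player URL used in the commands above.
SESS_PLAYER_BASE = "https://vod.tvp.pl/sess/player/video/"


def to_sess_player_url(page_url):
    """Build the /sess/player/ URL from a vod.tvp.pl page URL."""
    # Assumption: the video ID is the trailing run of digits in the URL.
    match = re.search(r"(\d+)/?$", page_url)
    if match is None:
        raise ValueError("no trailing video ID in %r" % page_url)
    return SESS_PLAYER_BASE + match.group(1)


print(to_sess_player_url(
    "https://vod.tvp.pl/video/teleexpress,15082019-1700,43664673"))
```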

the-researcher commented 3 years ago

Hi everyone,

I've attached debug output for both link scenarios. In this example, both scenarios point to the same content: a broadcast of Teleexpress on 2021-FEB-15 @ 17:00 CET (GMT+1).

I'm using -f 'best[ext=mp4][height=1080]', as that is the most painless format for my media setup: I just download the file, move it to the proper directory, and refresh/rescan the library.

Scenario 1: using the regular site URL (vod.tvp.pl/video/...)

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-o', 'C:/video/TVP/%(title)s.%(ext)s', '-f', 'best[ext=mp4][height=1080]', 'https://vod.tvp.pl/video/teleexpress,15022021-1700,51970528', '--verbose']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2021.02.10
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2
[debug] Proxy map: {}
[tvp] 51970528: Downloading webpage
[tvp:embed] 51970528: Downloading webpage
ERROR: tvp:embed said: Transmisja została zakończona lub materiał niedostępny
Traceback (most recent call last):
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp5n2jym44\build\youtube_dl\YoutubeDL.py", line 806, in wrapper
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp5n2jym44\build\youtube_dl\YoutubeDL.py", line 827, in __extract_info
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp5n2jym44\build\youtube_dl\extractor\common.py", line 532, in extract
  File "C:\Users\dst\AppData\Roaming\Build archive\youtube-dl\ytdl-org\tmp5n2jym44\build\youtube_dl\extractor\tvp.py", line 129, in _real_extract
youtube_dl.utils.ExtractorError: tvp:embed said: Transmisja została zakończona lub materiał niedostępny

Scenario 2: using the sess player URL (vod.tvp.pl/sess/player/video/...)

[debug] System config: []
[debug] User config: []
[debug] Custom config: []
[debug] Command-line args: ['-o', 'C:/video/TVP/%(title)s.%(ext)s', '-f', 'best[ext=mp4][height=1080]', 'https://vod.tvp.pl/sess/player/video/51970528', '--verbose']
[debug] Encodings: locale cp1252, fs mbcs, out cp850, pref cp1252
[debug] youtube-dl version 2021.02.10
[debug] Python version 3.4.4 (CPython) - Windows-10-10.0.18362
[debug] exe versions: ffmpeg 4.2.2, ffprobe 4.2.2
[debug] Proxy map: {}
[tvp] 51970528: Downloading webpage
[tvp:embed] 51970543: Downloading webpage
[tvp:embed] 51970543: Downloading ISM manifest
[tvp:embed] 51970543: Downloading f4m manifest
[tvp:embed] 51970543: Downloading m3u8 information
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: Checking video URL
[tvp:embed] 51970543: video URL is invalid, skipping: HTTP Error 404: Not Found
[debug] Invoking downloader on 'http://rsdt-waw802-229.tvp.pl/token/video/vod/51970543/20210216/2153425654/009526ea-fb00-41e9-984c-6d130e8f7857/video.ism/nv-hls-index-f7-v1-a1.m3u8'
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 228
[download] Destination: C:\video\TVP\Teleexpress, 15.02.2021, 17_00.mp4
[download] 100% of 567.80MiB in 06:56
[debug] ffmpeg command line: ffprobe -show_streams "file:C:\video\TVP\Teleexpress, 15.02.2021, 17_00.mp4"
[ffmpeg] Fixing malformed AAC bitstream in "C:\video\TVP\Teleexpress, 15.02.2021, 17_00.mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:C:\video\TVP\Teleexpress, 15.02.2021, 17_00.mp4" -c copy -f mp4 "-bsf:a" aac_adtstoasc "file:C:\video\TVP\Teleexpress, 15.02.2021, 17_00.temp.mp4"

To be clear: the error thrown during usage of the URL in Scenario 1 is completely inaccurate (bonkers), as the underlying content is available, and successfully downloaded in Scenario 2.
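Until the extractor is fixed, the two scenarios can be combined into a try-then-fall-back wrapper. The sketch below is not youtube-dl code: `extract` is any callable you supply (for example, a function wrapping a youtube-dl call), the helper names are hypothetical, and the ID parsing assumes the video ID is the trailing run of digits in the URL. The `fake_extract` stand-in only imitates the behaviour seen in the two logs.

```python
import re


def candidate_urls(page_url):
    """Yield the original URL, then its /sess/player/ equivalent."""
    yield page_url
    match = re.search(r"(\d+)/?$", page_url)
    if match and "/sess/player/" not in page_url:
        yield "https://vod.tvp.pl/sess/player/video/" + match.group(1)


def extract_with_fallback(extract, page_url):
    """Call extract(url) on each candidate; return (url, result) on success."""
    last_error = None
    for url in candidate_urls(page_url):
        try:
            return url, extract(url)
        except Exception as error:  # broad on purpose: this is a sketch
            last_error = error
    raise last_error


def fake_extract(url):
    """Stand-in extractor: fails on the regular URL, as in Scenario 1."""
    if "/sess/player/" not in url:
        raise RuntimeError("tvp:embed said: material unavailable")
    return {"webpage_url": url}


used_url, info = extract_with_fallback(
    fake_extract,
    "https://vod.tvp.pl/video/teleexpress,15022021-1700,51970528")
print(used_url)
```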

I would be more than happy to test any updated extractors for TVP, as I currently use a VPN to download daily news shows (and other content) through the underlying /sess/ player URL, which currently works. Please flag me with a mention; I check this issue on a near-daily basis to see whether progress is being made. If a beta/test build is compiled for Windows 10, I promise to test it.

Please let me know how I can help!

the-researcher commented 3 years ago

Updated the title of the bug, because the current TVP extractor is kinda like a three-legged dog: it works, you love it to death, it's been with you this entire time, but just imagine if it had four legs.