yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader
The Unlicense

BYUTV appears broken: 'Unsupported URL' error for BYUTV, which is on the list of supported sites #6189

Open 5076722439440 opened 1 year ago

5076722439440 commented 1 year ago

Region

United States of America

Provide a description that is worded well enough to be understood

Byutv is explicitly listed on the project's list of supported sites, and yet when I try to download any video from it (the given one is just an example, I tried dozens), it says 'Unsupported URL', and the download fails. This happens whether or not I use the --cookies-from-browser flag, and regardless of which browser I use it with (I tried Chrome and Firefox). All videos on Byutv play without any issues in both Chrome and Firefox. BYUTV does not require an account to view videos (everything is free) so I didn't have any credentials to pass along to the program.

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

[debug] Command-line config: ['-vU', 'https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode', '--cookies-from-browser', 'firefox']
[debug] User config "C:\Users\micro\AppData\Roaming\yt-dlp\config.txt": ['-o', 'C:/Users/micro/Downloads/,yt-dlp/%(title)s.%(ext)s', '--write-auto-subs', '--sub-langs', 'en.*', '--embed-subs', '--ffmpeg-location', 'C:/FFmpeg/bin', '-S', 'res:1080,fps', '--trim-filenames', '250']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [6becd25] (win_exe)
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k  25 Mar 2021)
[debug] exe versions: ffmpeg 2022-10-02-git-5f02a261a2-full_build-www.gyan.dev (setts), ffprobe 2022-10-02-git-5f02a261a2-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.16.0, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[Cookies] Extracting cookies from firefox
[debug] Extracting cookies from: "C:\Users\micro\AppData\Roaming\Mozilla\Firefox\Profiles\tz85ugn0.default-release\cookies.sqlite"
[Cookies] Extracted 24 cookies from firefox
[debug] Proxy map: {}
[debug] Loaded 1760 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2023.01.06, Current version: 2023.01.06
yt-dlp is up to date (2023.01.06)
[generic] Extracting URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] 2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Downloading webpage
[redirect] Following redirect to https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] Extracting URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Extracting information
[debug] Looking for embeds
ERROR: Unsupported URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 1502, in wrapper
  File "yt_dlp\YoutubeDL.py", line 1578, in __extract_info
  File "yt_dlp\extractor\common.py", line 680, in extract
  File "yt_dlp\extractor\generic.py", line 2523, in _real_extract
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode

dirkf commented 1 year ago

For the moment, use https://www.byutv.org/watch/2d5dd209-882f-4357-a055-a33394d1a40d (i.e., insert /watch after the domain name).
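For anyone scripting this workaround, here is a minimal sketch of the URL rewrite using only the standard library. The helper name `add_watch_prefix` is hypothetical and not part of yt-dlp; it only inserts `/watch` and leaves the query string alone.

```python
from urllib.parse import urlsplit, urlunsplit


def add_watch_prefix(url):
    """Insert /watch after the domain name unless the path already has it.

    Hypothetical helper for illustration; not part of yt-dlp.
    """
    parts = urlsplit(url)
    if parts.path.startswith(('/watch/', '/player/')):
        return url  # already in the form the extractor expects
    return urlunsplit(parts._replace(path='/watch' + parts.path))


fixed = add_watch_prefix(
    'https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d')
print(fixed)
```

The rewritten URL can then be passed to yt-dlp in place of the shared one.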

The /watch or /player path components that the extractor expects are now optional on the site, and URLs that include them redirect to a URL that has the UUID as the first path component, as in the problem URL. yt-dl has the same issue.
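To illustrate what "optional" means for the URL matching, here is a rough sketch of a pattern that accepts all three shapes. This is an assumption for illustration, not the extractor's actual `_VALID_URL`; the names `BYUTV_URL` and `UUID` are made up here.

```python
import re

# A UUID as seen in the URLs in this thread
UUID = r'[0-9a-f]{8}-(?:[0-9a-f]{4}-){3}[0-9a-f]{12}'

# Illustrative pattern only: /watch/ or /player/ may or may not be present,
# and an optional slug may follow the UUID.
BYUTV_URL = re.compile(
    r'https?://(?:www\.)?byutv\.org/(?:(?:watch|player)/)?'
    r'(?P<id>' + UUID + r')(?:/(?P<display_id>[^/?#]+))?')

urls = [
    'https://www.byutv.org/watch/2d5dd209-882f-4357-a055-a33394d1a40d',
    'https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d',
    'https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub',
]
matches = [BYUTV_URL.match(u) for u in urls]
for m in matches:
    print(m.group('id'), m.group('display_id'))
```

Matching the URL is only the first step; as later comments show, the metadata request behind it also has to work.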

5076722439440 commented 1 year ago

@dirkf Thanks for the fast response! That is a functional workaround. I have no software development abilities so I can't contribute, but for what it's worth, it would be great for the end-user if the code was updated so that this workaround wouldn't be required in the future. In other words, ideally the extractor would be updated to understand that the /watch and /player components are now optional for BYUTV.

dirkf commented 1 year ago

It's a simple fix for developers, and I've updated and tested the extractor for yt-dl in my development version; there were actually other bugs in the extraction that I had to fix too.

Some shows at BYUTV are DRM-protected, but yt-dlp apparently gives a useful diagnostic for that case.

mthorpedo commented 1 year ago

@dirkf While the original issue is still present, I'm now having problems with the workaround too:

[debug] Command-line config: ['-vU', 'https://www.byutv.org/watch/b6909478-f2d9-4594-9d63-42c6be64cd3e/']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.09.24 [088add956] (pip)
[debug] Python 3.11.5 (CPython arm64 64bit) - macOS-14.0-arm64-arm-64bit (OpenSSL 3.1.3 19 Sep 2023)
[debug] exe versions: ffmpeg 6.0 (setts), ffprobe 6.0, phantomjs 2.1.1
[debug] Optional libraries: Cryptodome-3.19.0, brotli-1.1.0, certifi-2023.07.22, mutagen-1.47.0, sqlite3-3.43.1, websockets-11.0.3
[debug] Proxy map: {}
[debug] Loaded 1886 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Available version: stable@2023.09.24, Current version: stable@2023.09.24
yt-dlp is up to date (stable@2023.09.24)
[BYUtv] Extracting URL: https://www.byutv.org/watch/b6909478-f2d9-4594-9d63-42c6be64cd3e/
[BYUtv] b6909478-f2d9-4594-9d63-42c6be64cd3e: Downloading JSON metadata
ERROR: [BYUtv] b6909478-f2d9-4594-9d63-42c6be64cd3e: Unable to download JSON metadata: HTTP Error 404: Not Found (caused by <HTTPError 404: Not Found>); please report this issue on  https://github.com/yt-dlp/yt-dlp/issues?q= , filling out the appropriate issue template. Confirm you are on the latest version using  yt-dlp -U
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 715, in extract
    ie_result = self._real_extract(url)
                ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/byutv.py", line 55, in _real_extract
    video = self._download_json(
            ^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 1069, in download_content
    res = getattr(self, download_handle.__name__)(url_or_request, video_id, **kwargs)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 1033, in download_handle
    res = self._download_webpage_handle(
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 903, in _download_webpage_handle
    urlh = self._request_webpage(url_or_request, video_id, note, errnote, fatal, data=data, headers=headers, query=query, expected_status=expected_status)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 860, in _request_webpage
    raise ExtractorError(errmsg, cause=err)

  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/networking/_urllib.py", line 410, in _send
    res = opener.open(urllib_req, timeout=float(request.extensions.get('timeout') or self.timeout))
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.5/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 404: Not Found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 4051, in urlopen
    return self._request_director.send(req)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/networking/common.py", line 114, in send
    response = handler.send(request)
               ^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/networking/_helper.py", line 204, in wrapper
    return func(self, *args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/networking/common.py", line 325, in send
    return self._send(request)
           ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/networking/_urllib.py", line 415, in _send
    raise HTTPError(UrllibResponseAdapter(e.fp), redirect_loop='redirect error' in str(e)) from e
yt_dlp.networking.exceptions.HTTPError: HTTP Error 404: Not Found

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/extractor/common.py", line 847, in _request_webpage
    return self._downloader.urlopen(self._create_request(url_or_request, data, headers, query))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.9.24/libexec/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 4070, in urlopen
    raise _CompatHTTPError(e) from e
yt_dlp.networking.exceptions._CompatHTTPError: HTTP Error 404: Not Found

gappie commented 1 year ago

Having the same issue with BYUTV. Is there an interim fix for this?

dirkf commented 1 year ago

The API URL that the extractor uses to get the media links is no longer valid.

For devs:

dirkf commented 1 year ago

Indeed, this gets the video data:

        media_id = self._search_regex(
            r'''(?s)\bac\s*\.\s*media\s*=\s*\{\s*id\s*:\s*["']([a-f\d]{8}-(?:[a-f\d]{4}-){3}[a-f\d]{12})''',
            webpage, 'media ID')

        api_key = self._search_regex(
            r'''(?s)\bapiKey\s*:\s*["']([a-z\d-]+)''',
            webpage, 'API key', fatal=False) or 'byutv-web-dk94tsvophi'

        video = self._download_json(
            'https://api.byub.org/media/v1/public/media/' + media_id,
            display_id, headers={
                # 401/403 if missing/malformed
                'x-byub-client': api_key,
                # 400 if missing - are other values (eg, us) useful?
                'x-byub-location': 'global',
                # headers used by BYU JS but not currently required
                # 'x-byub-device': '98394a8f-7a5b-48fd-886f-f79bb856c8aa',
                # 'x-byub-clientversion': '5.33.119',
                # 'Referer': url,
                # 'Origin': 'https://www.byutv.org',
            })
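For trying the two regexes and the header set outside the extractor, here is a standalone sketch. The page fragment is a hypothetical stand-in for the real markup, `build_media_request` is a made-up helper name, and nothing is sent over the network; whether the API still answers this way is not checked here.

```python
import re

# Regexes copied from the snippet above
MEDIA_ID_RE = r'''(?s)\bac\s*\.\s*media\s*=\s*\{\s*id\s*:\s*["']([a-f\d]{8}-(?:[a-f\d]{4}-){3}[a-f\d]{12})'''
API_KEY_RE = r'''(?s)\bapiKey\s*:\s*["']([a-z\d-]+)'''
FALLBACK_API_KEY = 'byutv-web-dk94tsvophi'

# Hypothetical page fragment, only shaped to fit the regexes
SAMPLE_PAGE = '''
<script>
  var cfg = {apiKey: "byutv-web-dk94tsvophi"};
  ac.media = {id: "2d5dd209-882f-4357-a055-a33394d1a40d", title: "x"};
</script>
'''


def build_media_request(webpage):
    """Return (url, headers) for the metadata request; sends nothing."""
    media = re.search(MEDIA_ID_RE, webpage)
    if media is None:
        raise ValueError('media ID not found in page')
    key = re.search(API_KEY_RE, webpage)
    api_key = key.group(1) if key else FALLBACK_API_KEY
    url = 'https://api.byub.org/media/v1/public/media/' + media.group(1)
    headers = {
        'x-byub-client': api_key,     # 401/403 if missing/malformed
        'x-byub-location': 'global',  # 400 if missing
    }
    return url, headers


url, headers = build_media_request(SAMPLE_PAGE)
print(url)
print(headers)
```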

praeluceo commented 6 months ago

Indeed, this gets the video data:

        media_id = self._search_regex(
            r'''(?s)\bac\s*\.\s*media\s*=\s*\{\s*id\s*:\s*["']([a-f\d]{8}-(?:[a-f\d]{4}-){3}[a-f\d]{12})''',
            webpage, 'media ID')

        api_key = self._search_regex(
            r'''(?s)\bapiKey\s*:\s*["']([a-z\d-]+)''',
            webpage, 'API key', fatal=False) or 'byutv-web-dk94tsvophi'

        video = self._download_json(
            'https://api.byub.org/media/v1/public/media/' + media_id,
            display_id, headers={
                # 401/403 if missing/malformed
                'x-byub-client': api_key,
                # 400 if missing - are other values (eg, us) useful?
                'x-byub-location': 'global',
                # headers used by BYU JS but not currently required
                # 'x-byub-device': '98394a8f-7a5b-48fd-886f-f79bb856c8aa',
                # 'x-byub-clientversion': '5.33.119',
                # 'Referer': url,
                # 'Origin': 'https://www.byutv.org',
            })

I tried this fix by modifying the byutv.py extractor, however it still throws errors and fails to download the file:

$ yt-dlp -vU "https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode"
[debug] Command-line config: ['-vU', 'https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2024.04.09 from yt-dlp/yt-dlp [ff0779267] (pip)
[debug] Python 3.10.12 (CPython x86_64 64bit) - Linux-5.15.0-101-generic-x86_64-with-glibc2.35 (OpenSSL 3.0.2 15 Mar 2022, glibc 2.35)
[debug] exe versions: ffmpeg 4.4.2 (setts), ffprobe 4.4.2
[debug] Optional libraries: Cryptodome-3.20.0, brotli-1.1.0, certifi-2020.06.20, mutagen-1.45.1, requests-2.31.0, secretstorage-3.3.1, sqlite3-3.37.2, urllib3-2.2.1, websockets-12.0
[debug] Proxy map: {}
[debug] Request Handlers: urllib, requests, websockets
[debug] Loaded 1810 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: stable@2024.04.09 from yt-dlp/yt-dlp
yt-dlp is up to date (stable@2024.04.09 from yt-dlp/yt-dlp)
[generic] Extracting URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] 2d5dd209-882f-4357-a055-a33394d1a40d?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Downloading webpage
[redirect] Following redirect to https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] Extracting URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
[generic] holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode: Extracting information
[debug] Looking for embeds
ERROR: Unsupported URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode
Traceback (most recent call last):
  File "/home/ronald/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 1606, in wrapper
    return func(self, *args, **kwargs)
  File "/home/ronald/.local/lib/python3.10/site-packages/yt_dlp/YoutubeDL.py", line 1741, in __extract_info
    ie_result = ie.extract(url)
  File "/home/ronald/.local/lib/python3.10/site-packages/yt_dlp/extractor/common.py", line 734, in extract
    ie_result = self._real_extract(url)
  File "/home/ronald/.local/lib/python3.10/site-packages/yt_dlp/extractor/generic.py", line 2514, in _real_extract
    raise UnsupportedError(url)
yt_dlp.utils.UnsupportedError: Unsupported URL: https://www.byutv.org/2d5dd209-882f-4357-a055-a33394d1a40d/holly-hobbie-the-show-starter?utm_source=byub&utm_medium=share&utm_campaign=share_2023&utm_content=Episode

I also tried a different URL from byu.tv, with the same results. So something may have changed again; it'd be awesome to have byu.tv working again!

bashonly commented 6 months ago

Looks like the site's video URL format has changed now as well

dirkf commented 6 months ago

Vexingly, the site is using a more sophisticated NUXT.js structure than is currently supported by _search_nuxt_js(), like this:

window.__NUXT__ = (function (args... /* actually a,b,c,...,A,B,C,...,aa,ab,ac,...,aA,aB,aC,... */) {
  /* set some properties of args that are probably bound to {} */
  J.xxx = yyy; /* expression using arg values */
  /* etc */
  return {
    ...,
    data: [
      {
        page: J
      }
    ],
    ...
  };
}(values...));
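To make the difficulty concrete, here is a rough sketch that splits such a wrapper into its parameter bindings and checks whether the body does any setup before the `return`. It assumes the call arguments are JSON-compatible literals, which a real page need not satisfy. `NUXT_IIFE` and `split_nuxt_iife` are hypothetical names, and this is not a description of how yt-dlp's own NUXT helper works.

```python
import json
import re

# Matches: window.__NUXT__ = (function (params) { body }(args));
NUXT_IIFE = re.compile(
    r'window\.__NUXT__\s*=\s*\(function\s*\((?P<params>[^)]*)\)\s*\{'
    r'(?P<body>.*)\}\s*\((?P<args>.*)\)\);?\s*$',
    re.DOTALL)


def split_nuxt_iife(script):
    """Map parameter names to argument values and flag setup statements."""
    m = NUXT_IIFE.search(script)
    if m is None:
        return None
    params = [p.strip() for p in m.group('params').split(',') if p.strip()]
    # Assumption: the arguments parse as a JSON array
    args = json.loads('[' + m.group('args') + ']')
    body = m.group('body').strip()
    return {
        'bindings': dict(zip(params, args)),
        # True when statements run before `return`, as in the structure above
        'has_setup_statements': not body.startswith('return'),
    }


sample = ('window.__NUXT__ = (function (a, J) {J.title = a; '
          'return {data: [{page: J}]}}("Episode 1", {}));')
info = split_nuxt_iife(sample)
print(info)
```

In the sample, `J` is bound to an empty object, so substituting arguments into the returned literal alone would lose the `title` that the setup statement assigns.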

dirkf commented 6 months ago

Also, the show pages are delivered with 404 (! - wasn't this happening on another site?) and the formerly useful ld+json block is no longer being populated. Is this Α-Ω testing rather than A-B testing?
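Since the show pages reportedly arrive with a 404 status but still carry content, a client has to read the body of the error response rather than give up. Here is a minimal standard-library sketch, assuming the 404 body really is the page. `fetch_page_allowing_404` and its injectable `opener` parameter are hypothetical names, and the fake opener exists only so the example runs offline. Within yt-dlp, the download helpers take an `expected_status` argument (visible in the traceback earlier in this thread) that serves a similar purpose.

```python
import io
import urllib.error
import urllib.request
from email.message import Message


def fetch_page_allowing_404(url, opener=urllib.request.urlopen):
    """Return (status, text), treating a 404 response body as usable content."""
    try:
        with opener(url) as resp:
            return resp.status, resp.read().decode('utf-8', 'replace')
    except urllib.error.HTTPError as err:
        if err.code != 404:
            raise  # other HTTP errors are still failures
        # HTTPError is file-like, so the response body can still be read
        return err.code, err.read().decode('utf-8', 'replace')


def fake_opener(url):
    """Offline stand-in that behaves like a server answering 404 with a body."""
    raise urllib.error.HTTPError(
        url, 404, 'Not Found', Message(),
        io.BytesIO(b'<html>show page</html>'))


status, body = fetch_page_allowing_404(
    'https://example.invalid/show', opener=fake_opener)
print(status, body)
```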