yt-dlp / yt-dlp

A feature-rich command-line audio/video downloader

AngelStudios.com #5478

Open BenMcLean opened 1 year ago

BenMcLean commented 1 year ago

Region

USA

Provide a description that is worded well enough to be understood

I was told to open a new issue here.

Trying to download The Chosen season 2 from angelstudios.com which is available free to the public without any login.

The visual portion downloads fine but there's no audio after the pre-roll. The logs indicate that Spanish was selected which would be bad, because I want English, but in the final file, there's no Spanish either, just silence after the English pre-roll.

I've tried the basic URL (from the browser's address bar) as well as the approach from this guy's post about finding the m3u8 files with the formats selected, but that doesn't work either. I could provide logs of that if desired, but for this first post I'm just going to use the basic URL. In Windows 10 PowerShell, my command was .\yt-dlp.exe "https://watch.angelstudios.com/thechosen/watch/episodes/season-2-episode-1-thunder" -vU *>log.txt

Ideally, I'd like to ditch the pre-roll if possible.

Provide verbose output that clearly demonstrates the problem

Complete Verbose Output

.\yt-dlp.exe : [debug] Command-line config: ['https://watch.angelstudios.com/thechose
n/watch/episodes/season-2-episode-1-thunder', '-vU']
At line:1 char:1
+ .\yt-dlp.exe "https://watch.angelstudios.com/thechosen/watch/episodes ...
+ ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
    + CategoryInfo          : NotSpecified: ([debug] Command...hunder', '-vU']:Strin 
   g) [], RemoteException
    + FullyQualifiedErrorId : NativeCommandError

[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out cp1252 (No ANSI), error 
cp1252 (No ANSI), screen cp1252 (No ANSI)
[debug] yt-dlp version 2022.10.04 [4e0511f] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.19045-SP0 
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 2022-10-20-git-eb9153b4a7-full_build-www.gyan.dev 
(setts), ffprobe 2022-10-20-git-eb9153b4a7-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.09.24, 
mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] Loaded 1690 extractors
[debug] Fetching release info: 
https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Latest version: 2022.10.04, Current version: 2022.10.04
yt-dlp is up to date (2022.10.04)
[debug] [generic] Extracting URL: 
https://watch.angelstudios.com/thechosen/watch/episodes/season-2-episode-1-thunder
[generic] season-2-episode-1-thunder: Downloading webpage
[redirect] Following redirect to https://www.angel.com/watch/the-chosen/episode/season-2-episode-1-thunder
[debug] [generic] Extracting URL: 
https://www.angel.com/watch/the-chosen/episode/season-2-episode-1-thunder
[generic] season-2-episode-1-thunder: Downloading webpage
WARNING: [generic] Falling back on generic information extractor
[generic] season-2-episode-1-thunder: Extracting information
[debug] Looking for Brightcove embeds
[debug] Looking for embeds
[debug] Identified a JSON LD
[debug] [generic] Extracting URL: https://media.angelstudios.com/copied-from-old-acco
unt/The_Chosen/S02E01_with_CTA/2022-08-17/The_Chosen_S02E01_with_CTA.m3u8#__youtubedl
_smuggle=%7B%22force_videoid%22%3A+%22season-2-episode-1-thunder%22%2C+%22to_generic%
22%3A+true%2C+%22http_headers%22%3A+%7B%22Referer%22%3A+%22https%3A%2F%2Fwww.angel.co
m%2Fwatch%2Fthe-chosen%2Fepisode%2Fseason-2-episode-1-thunder%22%7D%7D
[generic] season-2-episode-1-thunder: Downloading webpage
[debug] Identified a direct video link
[generic] season-2-episode-1-thunder: Downloading m3u8 information
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), 
vcodec:vp9.2(10), channels, acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, 
vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] season-2-episode-1-thunder: Downloading 1 format(s): 5073+aud-Spanish
[debug] Invoking hlsnative downloader on "https://media.angelstudios.com/copied-from-
old-account/The_Chosen/S02E01_with_CTA/2022-08-17/1920x1080_5073890_avc1.640032-mp4a.
40.2.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 658
[download] Destination: Watch The Chosen Season 2 Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].f5073.mp4
[debug] File locking is not supported. Proceeding without locking
...
[download] 100% of    1.92GiB in 00:02:59 at 11.00MiB/s                  
[debug] Invoking hlsnative downloader on "https://media.angelstudios.com/copied-from-
old-account/The_Chosen/S02E01_with_CTA/2022-08-17/AUDIO-Spanish.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 497
[download] Destination: Watch The Chosen Season 2 Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].faud-Spanish.mp4
...
[download] 100% of   98.25MiB in 00:01:22 at 1.20MiB/s                   
[debug] ffmpeg command line: ffprobe -show_streams "file:Watch The Chosen Season 2 
Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].faud-Spanish.mp4"
[Merger] Merging formats into "Watch The Chosen Season 2 Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:Watch The 
Chosen Season 2 Episode 1 Thunder on Angel Studios 
[season-2-episode-1-thunder].f5073.mp4" -i "file:Watch The Chosen Season 2 Episode 1 
Thunder on Angel Studios [season-2-episode-1-thunder].faud-Spanish.mp4" -c copy -map 
"0:v:0" -map "1:a:0" "-bsf:a:0" aac_adtstoasc -movflags "+faststart" "file:Watch The 
Chosen Season 2 Episode 1 Thunder on Angel Studios 
[season-2-episode-1-thunder].temp.mp4"
Deleting original file Watch The Chosen Season 2 Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].f5073.mp4 (pass -k to keep)
Deleting original file Watch The Chosen Season 2 Episode 1 Thunder on Angel Studios [season-2-episode-1-thunder].faud-Spanish.mp4 (pass -k to keep)

pukkandan commented 1 year ago

Workaround: Pass the longer URL. eg - https://www.angel.com/watch/the-chosen/episode/4942ad03-6529-4101-b74c-2f0e331b8a5c/season-2/episode-1/thunder

cc @AxiosDeminence

(If you would like not to be pinged for issues related to this site in future, let me know)

BenMcLean commented 1 year ago

> Workaround: Pass the longer URL. eg - https://www.angel.com/watch/the-chosen/episode/4942ad03-6529-4101-b74c-2f0e331b8a5c/season-2/episode-1/thunder

Unfortunately, that had the same result. The log shows Spanish was selected, but the final file has English audio in the pre-roll and nothing but silence in the main feature. It isn't even good for deaf people, since there are no subtitles either.

dirkf commented 1 year ago

The audio formats get sorted in order of language A<Z and Spanish wins. Try selecting format 5073+aud-English explicitly.
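The behaviour described here can be pictured with a toy model. This is only an illustration of the explanation above, not yt-dlp's actual sorting code; the two-entry ID list and the helper names are assumptions made for the example.

```python
# Toy illustration only: this is NOT yt-dlp's real sorting code.
# It models the explanation above, where audio tracks that are otherwise
# equal end up ordered by language name (A<Z) and the last one wins.

def default_audio_pick(format_ids):
    """Return the ID that sorts last alphabetically (the 'winner')."""
    return sorted(format_ids)[-1]


def explicit_format_spec(video_id, audio_id):
    """Join a video and an audio format ID with '+', as passed to -f."""
    return f"{video_id}+{audio_id}"


# Hypothetical subset of the audio format IDs seen in this thread.
audio_ids = ["aud-English", "aud-Spanish"]
print(default_audio_pick(audio_ids))
print(explicit_format_spec("5073", "aud-English"))
```

Passing the resulting string to -f sidesteps the automatic ordering entirely.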

BenMcLean commented 1 year ago

> The audio formats get sorted in order of language A<Z and Spanish wins. Try selecting format 5073+aud-English explicitly.

OK now it has selected the English audio track, and it plays English during the pre-roll and nothing but silence during the actual feature. But at least it is English silence rather than Spanish silence, which is progress, I guess? :)

BenMcLean commented 1 year ago

Also, thanks for letting me know the secret to combining multiple formats is to use the + character. I had been trying to use a comma before.

Still has the problem with the main feature being silent though ;)

dirkf commented 1 year ago

Maybe the problem is the discontinuity in the audio HLS stream? A similar case came up with yt-dl, where a site delivered its video in three parts and ffmpeg didn't understand the discontinuities; here, though, the native downloader is in use, and that seemed OK. Nonetheless, this is what's in https://media.angelstudios.com/copied-from-old-account/The_Chosen/S02E01_with_CTA/2022-08-17/AUDIO-English.m3u8:

#EXTM3U
#EXT-X-VERSION:3
#EXT-X-PLAYLIST-TYPE:VOD
#EXT-X-MEDIA-SEQUENCE:0
#EXT-X-TARGETDURATION:8
#EXT-X-DISCONTINUITY-SEQUENCE:1
#EXTINF:4.608,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0000.ts
#EXTINF:5.995,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0001.ts
#EXTINF:6.016,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0002.ts
#EXTINF:5.995,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0003.ts
...
#EXTINF:6.016,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0010.ts
#EXTINF:0.917,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0011.ts
#EXT-X-DISCONTINUITY
#EXTINF:8.000,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/S02E01/audio/eng/The_Chosen_S02E01_audio_eng_20210907_152759/0000.ts
#EXTINF:8.000,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/S02E01/audio/eng/The_Chosen_S02E01_audio_eng_20210907_152759/0001.ts
...
#EXTINF:8.000,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/S02E01/audio/eng/The_Chosen_S02E01_audio_eng_20210907_152759/0484.ts
#EXTINF:1.337,
https://media.angelstudios.com/copied-from-old-account/The_Chosen/S02E01/audio/eng/The_Chosen_S02E01_audio_eng_20210907_152759/0485.ts
#EXT-X-ENDLIST

The video stream also has a discontinuity at the start.

Try using the -k option (or go back to 5073,aud-English) and playing the resulting audio file directly. Maybe the main part of the audio is being discarded by the merge. If so, you could use ffmpeg or other media tools to trim the pre-roll segment from the beginning of each separate media file and merge the results manually.
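For readers following along, the manifest structure quoted above can be split into its parts at each #EXT-X-DISCONTINUITY tag. The sketch below is a plain-Python illustration only; it is not how yt-dlp's hlsnative downloader is implemented, and the shortened segment names in the sample are placeholders rather than the real URLs.

```python
def split_at_discontinuities(playlist_text):
    """Group segment URIs into parts separated by #EXT-X-DISCONTINUITY."""
    parts = [[]]
    for raw in playlist_text.splitlines():
        line = raw.strip()
        # Exact match, so that #EXT-X-DISCONTINUITY-SEQUENCE is not
        # mistaken for a discontinuity marker.
        if line == "#EXT-X-DISCONTINUITY":
            if parts[-1]:
                parts.append([])
        elif line and not line.startswith("#"):
            parts[-1].append(line)
    # Drop a trailing empty part, if the playlist ended on a marker.
    return [part for part in parts if part]


# Placeholder playlist with the same shape as the one quoted above.
SAMPLE = """#EXTM3U
#EXT-X-DISCONTINUITY-SEQUENCE:1
#EXTINF:4.608,
preroll/0000.ts
#EXTINF:5.995,
preroll/0001.ts
#EXT-X-DISCONTINUITY
#EXTINF:8.000,
main/0000.ts
#EXT-X-ENDLIST
"""

for number, part in enumerate(split_at_discontinuities(SAMPLE), start=1):
    print(number, part)
```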

BenMcLean commented 1 year ago

Even with -k it is only downloading the audio from the pre-roll. The audio for the main feature isn't being included in the resulting files. I can provide another log of this if that would help.

dirkf commented 1 year ago

There's definitely audio in the https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0nnn.ts URLs. However, as an example, the whole audio manifest plays in mpv with a duration of 1hr 5 but no audio after the pre-roll.

As an extreme work-around you could manually append all those audio segments (from 0 to 485).
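One possible way to script that manual append is to generate the segment names and a list file for ffmpeg's concat demuxer. This is a minimal sketch, assuming the segments have already been downloaded into the current directory under their original names (0000.ts to 0485.ts); the helper names and the list file name are made up for the example.

```python
def segment_names(first, last, width=4, ext=".ts"):
    """Zero-padded segment file names, e.g. 0000.ts ... 0485.ts."""
    return [f"{i:0{width}d}{ext}" for i in range(first, last + 1)]


def concat_list(names):
    """Text of an ffmpeg concat-demuxer list: one "file '<name>'" per line."""
    return "".join(f"file '{name}'\n" for name in names)


names = segment_names(0, 485)
listing = concat_list(names)
print(len(names), "segments; first line:", listing.splitlines()[0])

# The list could then be saved (for example as segments.txt) and used with:
#   ffmpeg -f concat -safe 0 -i segments.txt -c copy audio.ts
```

Whether a stream copy of the joined segments plays correctly in every player is not something this thread confirms.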

BenMcLean commented 1 year ago

> There's definitely audio in the https://media.angelstudios.com/copied-from-old-account/The_Chosen/ctas/S02E01/audio/en/audio_English_0_playlist/0nnn.ts URLs. However, as an example, the whole audio manifest plays in mpv with a duration of 1hr 5 but no audio after the pre-roll.
>
> As an extreme work-around you could manually append all those audio segments (from 0 to 485).

I'd need to make some kind of script to do that I suppose ... but couldn't yt-dlp be fixed to do it?

dirkf commented 1 year ago

Yes, a script would be easier in POSIX shell, but ffmpeg has its own batch function (concat demuxer).

Apparently there's some issue with discontinuities in manifests but I'm not well qualified to say whether the problem is with how yt-dlp (and presumably yt-dl too) handles the manifest, or with the hlsnative downloader, or with ffmpeg or how it's invoked. Depending on whether anyone can find a simple fix and how keen you are to archive the sound of this show, an "extreme work-around" may be your best bet.

Or, better, try the yt-dlp option --hls-split-discontinuity with format selection 5073-2+aud-English-2. Piping yt-dlp -o - --hls-split-discontinuity -f aud-English-2 ... into mpv plays what sounds like the desired soundtrack.

AxiosDeminence commented 1 year ago

So this was actually brought to my attention some time ago, but I got busy and was never able to investigate it fully. What I do know is that, to my knowledge, it is caused by the discontinuity in the stream, which is why it doesn't affect various other episodes. I also know that adding --fixup never saves the entire audio file but misreports the duration. Whether the rest of that file plays depends on the media player: VLC stops at the end of the reported duration, while the default media player on Windows 10 keeps playing past it. Without this option, it appears that ffmpeg and yt-dlp read the .ts payloads after the discontinuity but drop them instead of saving them in the output file.

I don't know enough about the internals of yt-dlp and ffmpeg to figure out how to fix the issue correctly within yt-dlp or to provide a suitable workaround. When downloading the audio stream with ffmpeg, it reads but does not write the .ts payloads after the discontinuity.

dirkf commented 1 year ago

It seems that one could propose or enforce --hls-split-discontinuity in the extractor when a discontinuity is found. As a further option each discontinuity set (-1, -2, ...) could be a separate video in a multi-video playlist.

pukkandan commented 1 year ago

If I add a way for the extractor to split the stream, can the extractor reliably tell which discontinuity marks the start/end of the actual video as opposed to ads?
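One conceivable heuristic, offered purely as an assumption and not as something anyone in this thread has confirmed to be reliable, is to treat the longest discontinuity-separated part as the feature. The sketch below sums the #EXTINF durations per part; its part numbering is its own and may not line up with the -1, -2 suffixes that yt-dlp assigns.

```python
def part_durations(playlist_text):
    """Total #EXTINF duration (seconds) of each discontinuity-separated part."""
    totals = [0.0]
    for raw in playlist_text.splitlines():
        line = raw.strip()
        if line == "#EXT-X-DISCONTINUITY":
            totals.append(0.0)
        elif line.startswith("#EXTINF:"):
            # "#EXTINF:8.000," -> 8.0; any title after the comma is ignored.
            totals[-1] += float(line[len("#EXTINF:"):].split(",")[0])
    return totals


def longest_part(playlist_text):
    """1-based index of the longest part (ASSUMED here to be the feature)."""
    totals = part_durations(playlist_text)
    return totals.index(max(totals)) + 1


# Durations borrowed from the manifest quoted earlier; URIs are placeholders.
SAMPLE = """#EXTM3U
#EXTINF:4.608,
preroll/0000.ts
#EXTINF:5.995,
preroll/0001.ts
#EXT-X-DISCONTINUITY
#EXTINF:8.000,
main/0000.ts
#EXTINF:8.000,
main/0001.ts
#EXTINF:1.337,
main/0002.ts
#EXT-X-ENDLIST
"""

print(part_durations(SAMPLE), longest_part(SAMPLE))
```

A long advert in front of a short clip would defeat this, which is presumably why the question of reliability is being asked.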

BenMcLean commented 1 year ago

> If I add a way for the extractor to split the stream, can the extractor reliably tell which discontinuity marks the start/end of the actual video as opposed to ads?

Why not save both? Making more than one video file could be an option right?

BenMcLean commented 1 year ago

By the way, I'm not the only one having this problem, there's also this issue about the same thing that AFAIK was never really fixed.

dirkf commented 1 year ago

Does --hls-split-discontinuity -f 5073-2+aud-English-2 not solve the immediate problem?

BenMcLean commented 1 year ago

> Does --hls-split-discontinuity -f 5073-2+aud-English-2 not solve the immediate problem?

When I added -k along with those, one of the resulting files had the complete episode with the audio and no pre-roll which was great!

The complete command that worked was .\yt-dlp.exe "https://watch.angelstudios.com/thechosen/watch/episodes/season-2-episode-1-thunder" --hls-split-discontinuity -k -f 5073-2+aud-English-2

I'll give this same procedure a shot with the other episodes.
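To repeat the procedure for several episodes, the argument list can be assembled programmatically. This is a sketch under assumptions: the "-2" part suffix and the 5073 and aud-English IDs are the ones that worked for this particular episode, and other episodes may list different IDs under -F. The helper names are hypothetical.

```python
def split_format(format_id, part):
    """Format ID of one discontinuity part, e.g. '5073' and 2 -> '5073-2'."""
    return f"{format_id}-{part}"


def build_args(url, video_id, audio_id, part=2, keep=True):
    """Argument list mirroring the command that worked above."""
    args = ["yt-dlp", "--hls-split-discontinuity"]
    if keep:
        args.append("-k")  # keep the intermediate files, as in the thread
    spec = f"{split_format(video_id, part)}+{split_format(audio_id, part)}"
    args += ["-f", spec, url]
    return args


url = "https://watch.angelstudios.com/thechosen/watch/episodes/season-2-episode-1-thunder"
print(build_args(url, "5073", "aud-English"))
```

The list can be handed to subprocess.run(args, check=True) once the format IDs for the episode in question have been checked.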

BenjiFrugoni commented 1 year ago

The subtitles are also not working. You gotta get those VTTs manually

Working commands for independent streams. You can combine them like https://github.com/yt-dlp/yt-dlp/issues/5478#issuecomment-1311234248 indicates, which works like a charm for video and for audio

Audio only: yt-dlp https://www.angel.com/watch/the-chosen/episode/61faa5b1-e037-4d92-b69d-e63919a5c3b7/season-1/episode-9/the-shepherd -f aud-Russian -o S00x01/Russian.m4a

Video only (but this website sometimes means "video" as in "English", so be careful if muxing something else): yt-dlp https://www.angel.com/watch/the-chosen/episode/8dfb714d-bca5-4812-8125-24fb9514cd10/season-1/episode-1/i-have-called-you-by-name -f 5161 -o S01x01/S01x01.mp4

BenMcLean commented 1 year ago

I was able to get the other video+English-audio streams from the season 2 episodes using the procedure I mentioned in my last post but as you said, no subtitles.

dirkf commented 1 year ago

This patch brings the extractor up-to-date wrt URL pattern and metadata extraction:

--- old/yt_dlp/extractor/angel.py
+++ new/yt_dlp/extractor/angel.py
@@ -1,11 +1,16 @@
 import re

 from .common import InfoExtractor
-from ..utils import url_or_none, merge_dicts
+from ..utils import (
+    determine_ext,
+    merge_dicts,
+    traverse_obj,
+    url_or_none,
+)

 class AngelIE(InfoExtractor):
-    _VALID_URL = r'https?://(?:www\.)?angel\.com/watch/(?P<series>[^/?#]+)/episode/(?P<id>[\w-]+)/season-(?P<season_number>\d+)/episode-(?P<episode_number>\d+)/(?P<title>[^/?#]+)'
+    _VALID_URL = r'https?://(?:www\.)?angel\.com/watch/(?P<series>[\w-]+)/episode(?:/(?P<id>[a-f\d-]+))?/season-(?P<season_number>\d+)(?(id)/|-)episode-(?P<episode_number>\d+)(?(id)/|-)(?P<title>[\w-]+)'
     _TESTS = [{
         'url': 'https://www.angel.com/watch/tuttle-twins/episode/2f3d0382-ea82-4cdc-958e-84fbadadc710/season-1/episode-1/when-laws-give-you-lemons',
         'md5': '4734e5cfdd64a568e837246aa3eaa524',
@@ -14,9 +19,11 @@
             'ext': 'mp4',
             'title': 'Tuttle Twins Season 1, Episode 1: When Laws Give You Lemons',
             'description': 'md5:73b704897c20ab59c433a9c0a8202d5e',
+            'timestamp': 1634065200,
+            'upload_date': '20211012',
             'thumbnail': r're:^https?://images.angelstudios.com/image/upload/angel-app/.*$',
-            'duration': 1359.0
-        }
+            'duration': 1328.0
+        },
     }, {
         'url': 'https://www.angel.com/watch/the-chosen/episode/8dfb714d-bca5-4812-8125-24fb9514cd10/season-1/episode-1/i-have-called-you-by-name',
         'md5': 'e4774bad0a5f0ad2e90d175cafdb797d',
@@ -25,23 +32,64 @@
             'ext': 'mp4',
             'title': 'The Chosen Season 1, Episode 1: I Have Called You By Name',
             'description': 'md5:aadfb4827a94415de5ff6426e6dee3be',
+            'upload_date': '20170920',
+            'timestamp': 1505934000,
             'thumbnail': r're:^https?://images.angelstudios.com/image/upload/angel-app/.*$',
             'duration': 3276.0
-        }
+        },
+    }, {
+        'url': 'https://www.angel.com/watch/the-chosen/episode/season-1-episode-1-i-have-called-you-by-name',
+        'md5': 'e4774bad0a5f0ad2e90d175cafdb797d',
+        'info_dict': {
+            'id': '8dfb714d-bca5-4812-8125-24fb9514cd10',
+            'ext': 'mp4',
+            'title': 'The Chosen Season 1, Episode 1: I Have Called You By Name',
+            'description': 'md5:aadfb4827a94415de5ff6426e6dee3be',
+            'upload_date': '20170920',
+            'timestamp': 1505934000,
+            'thumbnail': r're:^https?://images.angelstudios.com/image/upload/angel-app/.*$',
+            'duration': 3276.0
+        },
+        'params': {
+            # same download as above
+            'skip_download': True,
+        },
     }]

     def _real_extract(self, url):
-        video_id = self._match_id(url)
-        webpage = self._download_webpage(url, video_id)
+        m = self._match_valid_url(url)
+        video_id, display_id = m.group('id', 'title')

-        json_ld = self._search_json_ld(webpage, video_id)
+        webpage = self._download_webpage(url, video_id or display_id)

+        json_ld = None
+        if not video_id:
+            for json_ld in self._yield_json_ld(webpage, display_id):
+                if 'embedUrl' in json_ld:
+                    video_id = self._match_id(json_ld.pop('embedUrl'))
+                    json_ld = self._json_ld(json_ld, video_id)
+                    break
+            else:
+                video_id = display_id
+        if not json_ld:
+            json_ld = self._search_json_ld(webpage, video_id)
+        video_url = json_ld.pop('url')
+        ext = determine_ext(video_url)
+        if ext != 'm3u8':
+            next_data = self._search_nextjs_data(webpage, video_id)
+            video_url = traverse_obj(
+                next_data,
+                ('props', 'pageProps', 'episode', 'source', 'url'),
+                expected_type=url_or_none)
+            ext = determine_ext(video_url)
+        if ext != 'm3u8':
+            self.raise_no_formats('No manifest found', expected=True)
         formats, subtitles = self._extract_m3u8_formats_and_subtitles(
-            json_ld.pop('url'), video_id, note='Downloading HD m3u8 information')
+            video_url, video_id)

         info_dict = {
             'id': video_id,
-            'title': self._og_search_title(webpage),
+            'title': re.sub(r'^\s*Watch\s+([\w\s,:-]+?)\s+on\s+Angel\s+Studios\s*$', r'\1', self._og_search_title(webpage) or '') or None,
             'description': self._og_search_description(webpage),
             'formats': formats,
             'subtitles': subtitles
@@ -49,7 +97,7 @@

         # Angel uses cloudinary in the background and supports image transformations.
         # We remove these transformations and return the source file
-        base_thumbnail_url = url_or_none(self._og_search_thumbnail(webpage)) or json_ld.pop('thumbnails')
+        base_thumbnail_url = url_or_none(self._og_search_thumbnail(webpage)) or traverse_obj(json_ld.pop('thumbnails'), (0, 'url'), expected_type=url_or_none)
         if base_thumbnail_url:
             info_dict['thumbnail'] = re.sub(r'(/upload)/.+(/angel-app/.+)$', r'\1\2', base_thumbnail_url)
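The updated _VALID_URL in the patch relies on a conditional group, (?(id)/|-), to accept both the long URL (with a UUID and slash-separated parts) and the short one (hyphen-separated, no UUID). The standalone check below copies the pattern from the "+" line of the diff and tries it on two URLs that appear in this thread; it runs outside yt-dlp and only demonstrates the regex.

```python
import re

# Copied from the "+" side of the patch above, split across lines
# for readability only.
VALID_URL = (
    r'https?://(?:www\.)?angel\.com/watch/(?P<series>[\w-]+)'
    r'/episode(?:/(?P<id>[a-f\d-]+))?'
    r'/season-(?P<season_number>\d+)(?(id)/|-)'
    r'episode-(?P<episode_number>\d+)(?(id)/|-)'
    r'(?P<title>[\w-]+)'
)

LONG_URL = ('https://www.angel.com/watch/the-chosen/episode/'
            '4942ad03-6529-4101-b74c-2f0e331b8a5c/season-2/episode-1/thunder')
SHORT_URL = ('https://www.angel.com/watch/the-chosen/episode/'
             'season-2-episode-1-thunder')

for url in (LONG_URL, SHORT_URL):
    m = re.match(VALID_URL, url)
    # 'id' is None for the short form, which is why the patch falls back
    # to the JSON-LD data to find the real video ID.
    print(m.group('id'), m.group('season_number'),
          m.group('episode_number'), m.group('title'))
```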

With --hls-split-discontinuity --write-subs and picking the -2 formats for the desired resolution and language for video 4942ad03-6529-4101-b74c-2f0e331b8a5c, we get the subtitles but the timing needs to be offset to skip the first two parts (ad, intro). The subtitle file should be split to match the AV parts.
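The timing offset mentioned here could be applied with a small script. The following is a rough sketch, not part of the patch: it assumes a simple WebVTT file whose cue blocks are separated by blank lines, and the 60-second offset in the example is invented. The real offset would have to be the combined duration of the skipped parts.

```python
import re

_TS = re.compile(r'(?:(\d+):)?(\d{2}):(\d{2})\.(\d{3})')


def _to_ms(match):
    h, m, s, ms = match.groups()
    return ((int(h or 0) * 60 + int(m)) * 60 + int(s)) * 1000 + int(ms)


def _fmt(ms):
    s, ms = divmod(ms, 1000)
    m, s = divmod(s, 60)
    h, m = divmod(m, 60)
    return f"{h:02d}:{m:02d}:{s:02d}.{ms:03d}"


def shift_cue_line(line, offset_seconds):
    """Shift a 'start --> end' line earlier; None if the cue ends too early."""
    if '-->' not in line:
        return line
    start_txt, rest = line.split('-->', 1)
    start, end = _TS.search(start_txt), _TS.search(rest)
    if not (start and end):
        return line
    offset_ms = round(offset_seconds * 1000)
    new_start = _to_ms(start) - offset_ms
    new_end = _to_ms(end) - offset_ms
    if new_end <= 0:
        return None
    settings = rest[end.end():]  # keep any cue settings after the end time
    return f"{_fmt(max(new_start, 0))} --> {_fmt(new_end)}{settings}"


def shift_vtt(text, offset_seconds):
    """Shift every cue earlier and drop cues that fall before the new zero."""
    kept = []
    for block in text.strip().split("\n\n"):
        lines = []
        for line in block.split("\n"):
            line = shift_cue_line(line, offset_seconds)
            if line is None:
                break
            lines.append(line)
        else:
            kept.append("\n".join(lines))
    return "\n\n".join(kept) + "\n"


SAMPLE = """WEBVTT

00:00:05.000 --> 00:00:08.000
pre-roll line

00:01:10.500 --> 00:01:12.000
feature line
"""

print(shift_vtt(SAMPLE, 60))
```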

BenMcLean commented 1 year ago

For the benefit of anyone else reading this, use the --list-formats command line option (or just -F for short) to find out what formats are available.

zchrykng commented 1 year ago

Unsure if it is related to this issue, but it seems that yt-dlp doesn't select the highest quality video available automatically. I'm consistently getting the lowest quality video unless I use -f 4755 or similar to force it.

This URL for reference: https://www.angel.com/watch/wingfeather-saga/episode/6539a5c7-2a96-4393-bef7-5a5f5811430f/season-1/episode-2/a-mysterious-map

dirkf commented 1 year ago

According to the previous discussion you should list the available formats and manually select the desired format combination for this site anyway, so the automatic sort shouldn't matter.

FWIW yt-dlp 2022.11.22 lists 4755 as the best format for the quoted URL: might you have some conflicting default selection in a config file?

BenMcLean commented 1 year ago

> According to the previous discussion you should list the available formats and manually select the desired format combination for this site anyway, so the automatic sort shouldn't matter.
>
> FWIW yt-dlp 2022.11.22 lists 4755 as the best format for the quoted URL: might you have some conflicting default selection in a config file?

I make a backup of the new episode every week and I have found that I need to list formats each time to make sure I'm getting the best quality version.
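When checking a listing by eye, "best" here simply means the tallest resolution, with bitrate as a tie-breaker. The sketch below encodes that rule for the five rows of a format table posted later in this thread. The dictionary layout is an assumption made for the example, and the rule is a simplification rather than yt-dlp's actual format sort.

```python
# Rows transcribed from a -F listing posted later in this thread.
# The dict keys are hypothetical, chosen for this example only.
FORMATS = [
    {"id": "2516", "width": 1280, "height": 720, "tbr": 2516},
    {"id": "1640", "width": 960, "height": 540, "tbr": 1641},
    {"id": "4755", "width": 1920, "height": 1080, "tbr": 4755},
    {"id": "994", "width": 640, "height": 360, "tbr": 994},
    {"id": "697", "width": 480, "height": 270, "tbr": 697},
]


def best_format_id(formats):
    """ID of the format with the greatest height, ties broken by bitrate."""
    return max(formats, key=lambda f: (f["height"], f["tbr"]))["id"]


print(best_format_id(FORMATS))
```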

BenMcLean commented 1 year ago

Just FYI for people: uppercase -F lists formats, while lowercase -f is for indicating what format(s) you want.

zchrykng commented 1 year ago

@dirkf Yeah, that is what I'm doing now. I'm more puzzled about why it doesn't select the highest-quality option automatically, since it does see it with the -F option. I don't have any top-level config file set up that would be overriding any of these choices, as far as I'm aware.

dirkf commented 1 year ago

In that case you might post your verbose log.

zchrykng commented 1 year ago

@dirkf Like this?

As you can see, the highest quality option is ID 4755, but the downloader picks ID 697, which is the lowest.

~
❯ yt-dlp.exe -F https://www.angel.com/watch/wingfeather-saga/episode/f40462e4-6377-452b-af32-01329b71649e/season-1/episode-1/leeli-the-sea-dragon-song
[Angel] f40462e4-6377-452b-af32-01329b71649e: Downloading webpage
[Angel] f40462e4-6377-452b-af32-01329b71649e: Downloading HD m3u8 information
[info] Available formats for f40462e4-6377-452b-af32-01329b71649e:
ID   EXT RESOLUTION │   FILESIZE   TBR PROTO │ VCODEC      ACODEC
────────────────────────────────────────────────────────────────────
2516 mp4 1280x720   │ ~640.46MiB 2516k m3u8  │ avc1.640020 mp4a.40.2
1640 mp4 960x540    │ ~417.55MiB 1641k m3u8  │ avc1.640020 mp4a.40.2
4755 mp4 1920x1080  │ ~  1.18GiB 4755k m3u8  │ avc1.64002a mp4a.40.2
994  mp4 640x360    │ ~253.01MiB  994k m3u8  │ avc1.64001f mp4a.40.2
697  mp4 480x270    │ ~177.42MiB  697k m3u8  │ avc1.64001e mp4a.40.2

~ took 2s
❯ yt-dlp.exe -v https://www.angel.com/watch/wingfeather-saga/episode/f40462e4-6377-452b-af32-01329b71649e/season-1/episode-1/leeli-the-sea-dragon-song
[debug] Command-line config: ['-v', 'https://www.angel.com/watch/wingfeather-saga/episode/f40462e4-6377-452b-af32-01329b71649e/season-1/episode-1/leeli-the-sea-dragon-song']
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2022.08.19 [48c88e0] (win32_exe)
[debug] Python 3.8.10 (CPython 64bit) - Windows-10-10.0.22621-SP0
[debug] Checking exe version: ffmpeg -bsfs
[debug] Checking exe version: ffprobe -bsfs
[debug] exe versions: ffmpeg 2022-03-03-git-72684d2c2d-full_build-www.gyan.dev (setts), ffprobe 2022-03-03-git-72684d2c2d-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.15.0, brotli-1.0.9, certifi-2022.06.15, mutagen-1.45.1, sqlite3-2.6.0, websockets-10.3
[debug] Proxy map: {}
[debug] [Angel] Extracting URL: https://www.angel.com/watch/wingfeather-saga/episode/f40462e4-6377-452b-af32-01329b71649e/season-1/episode-1/leeli-the-sea-dragon-song
[Angel] f40462e4-6377-452b-af32-01329b71649e: Downloading webpage
[Angel] f40462e4-6377-452b-af32-01329b71649e: Downloading HD m3u8 information
[debug] Default format spec: bestvideo*+bestaudio/best
[info] f40462e4-6377-452b-af32-01329b71649e: Downloading 1 format(s): 697
[debug] Invoking hlsnative downloader on "https://media.angelstudios.com/copied-content/abd704a2-e009-42ad-9319-4a8473461035/playlists/f1624ff9-ab47-47a7-bada-8869ac344a06.m3u8"
[hlsnative] Downloading m3u8 manifest
[hlsnative] Total fragments: 419
[download] Destination: Watch The Wingfeather Saga Season 1, Episode 1: Leeli & The Sea Dragon Song on Angel Studios [f40462e4-6377-452b-af32-01329b71649e].mp4
[debug] File locking is not supported. Proceeding without locking
[download] 100% of 136.41MiB in 00:36 at 3.75MiB/s
[debug] ffprobe command line: ffprobe -hide_banner -show_format -show_streams -print_format json "file:Watch The Wingfeather Saga Season 1, Episode 1: Leeli & The Sea Dragon Song on Angel Studios [f40462e4-6377-452b-af32-01329b71649e].mp4"
[FixupM3u8] Fixing MPEG-TS in MP4 container of "Watch The Wingfeather Saga Season 1, Episode 1: Leeli & The Sea Dragon Song on Angel Studios [f40462e4-6377-452b-af32-01329b71649e].mp4"
[debug] ffmpeg command line: ffmpeg -y -loglevel "repeat+info" -i "file:Watch The Wingfeather Saga Season 1, Episode 1: Leeli & The Sea Dragon Song on Angel Studios [f40462e4-6377-452b-af32-01329b71649e].mp4" -map 0 -dn -ignore_unknown -c copy -f mp4 "-bsf:a" aac_adtstoasc -movflags "+faststart" "file:Watch The Wingfeather Saga Season 1, Episode 1: Leeli & The Sea Dragon Song on Angel Studios [f40462e4-6377-452b-af32-01329b71649e].temp.mp4"

bashonly commented 1 year ago

> [debug] yt-dlp version 2022.08.19 [48c88e0] (win32_exe)

@zchrykng update yt-dlp

zchrykng commented 1 year ago

Huh, could have sworn I was at the latest version for my package manager. Will check in the morning.

sydlexius commented 1 year ago

When I run the following: yt-dlp -F "https://www.angel.com/watch/the-chosen/episode/b97df2cc-f4c5-48ba-904c-3667ca328cf1/season-3/episode-6/intensity-in-tent-city"

I get this output:

[Angel] Extracting URL: https://www.angel.com/watch/the-chosen/episode/b97df2cc-f4c5-48ba-904c-3667ca328cf1/season-3/epis...tensity-in-tent-city
[Angel] b97df2cc-f4c5-48ba-904c-3667ca328cf1: Downloading webpage
[Angel] b97df2cc-f4c5-48ba-904c-3667ca328cf1: Downloading HD m3u8 information
[info] Available formats for b97df2cc-f4c5-48ba-904c-3667ca328cf1:
ID EXT RESOLUTION │ PROTO │ VCODEC  ACODEC
───────────────────────────────────────────
0  mp4 unknown    │ m3u8  │ unknown unknown
1  mp4 unknown    │ m3u8  │ unknown unknown
2  mp4 unknown    │ m3u8  │ unknown unknown
3  mp4 unknown    │ m3u8  │ unknown unknown
4  mp4 unknown    │ m3u8  │ unknown unknown
5  mp4 unknown    │ m3u8  │ unknown unknown

hastyeagle commented 1 year ago

I'm running 2023.01.06 and was able to download Season 3, Episode 6 when it came out ~3 weeks ago. Today, I'm getting the same issue as @sydlexius for that episode and the newest episode (Season 3, Episode 7). So it seems something changed on the site that broke yt-dlp from being able to download.

jbfavre commented 1 year ago

I can confirm the issue. I was able to download Episode 6 without any trouble some weeks ago. Now I'm struggling to download Episode 7, and the previous one now fails as well.

> yt-dlp --verbose --list-formats --list-subs https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear
[debug] Command-line config: ['--verbose', '--list-formats', '--list-subs', 'https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version 2023.01.06 [6becd2508] (debian*)
[debug] Python 3.11.1 (CPython x86_64 64bit) - Linux-6.1.0-3-amd64-x86_64-with-glibc2.36 (OpenSSL 3.0.7 1 Nov 2022, glibc 2.36)
[debug] exe versions: ffmpeg 5.1.2-2 (setts), ffprobe 5.1.2-2, rtmpdump 2.4
[debug] Optional libraries: Cryptodome-3.11.0, brotli-1.0.9, certifi-2022.09.24, mutagen-1.46.0, pyxattr-0.8.0, secretstorage-3.3.3, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1760 extractors
[Angel] Extracting URL: https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear
[Angel] 76331538-ff46-4ef9-b96b-8e20c30ac00e: Downloading webpage
[Angel] 76331538-ff46-4ef9-b96b-8e20c30ac00e: Downloading HD m3u8 information
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
76331538-ff46-4ef9-b96b-8e20c30ac00e has no subtitles
[info] Available formats for 76331538-ff46-4ef9-b96b-8e20c30ac00e:
ID EXT RESOLUTION │ PROTO │ VCODEC  ACODEC
───────────────────────────────────────────
0  mp4 unknown    │ m3u8  │ unknown unknown
1  mp4 unknown    │ m3u8  │ unknown unknown
2  mp4 unknown    │ m3u8  │ unknown unknown
3  mp4 unknown    │ m3u8  │ unknown unknown
4  mp4 unknown    │ m3u8  │ unknown unknown
5  mp4 unknown    │ m3u8  │ unknown unknown

napei commented 1 year ago

Regarding this latest error, it seems that there's a parsing issue with the URL it finds after a redirect. The verbose log is below; in summary, the URL includes the entire page's source, and yt-dlp tries to send that as a request, which results in an HTTP Error 414: Request URI Too Long error, which makes sense. This is currently happening to all videos on this site.

You can see here, I believe it is trying to make a request to this whole thing as one string.

[debug] Invoking hlsnative downloader on "https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/});</script><link rel="preload" href="/_next/static/css/eef98d695473f723.css" as="style"/><link rel="stylesheet" href="/_next/static/css/eef98d695473f723.css" data-n-g=""/><link rel="preload" href="/_next/static/css/1f1584dc20e5f71f.css"..........................

I'm not sure of the root cause, to be honest. It could be some funky minification on the host's end: it had been working fine, so they've obviously changed something.

Verbose Log

```
yt-dlp -v https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear
[debug] Command-line config: ['-v', 'https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear']
[debug] Portable config "C:\Users\natty\scoop\apps\yt-dlp-daily\current\yt-dlp.conf": []
[debug] Encodings: locale cp1252, fs utf-8, pref cp1252, out cp1252 (No ANSI), error utf-8, screen cp1252 (No ANSI)
[debug] ytdl-patched/yt-dlp version 2023.02.06.43044 [6328c6d] (win_exe)
[debug] ** This build is unofficial daily builds, provided for ease of use.
[debug] ** Please do not ask for any support.
[debug] Python 3.8.10 (CPython AMD64 64bit) - Windows-10-10.0.22621-SP0 (OpenSSL 1.1.1k 25 Mar 2021)
[debug] exe versions: ffmpeg 5.1.2-full_build-www.gyan.dev (setts), ffprobe 5.1.2-full_build-www.gyan.dev
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1764 extractors
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[debug] Invoking hlsnative downloader on "https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/});

Episode 7

Ears to Hear

Andrew and Philip return from their trip with desperate news: they need Jesus’ help to solve a huge crisis in the Decapolis. Jesus leads them on a trip to the dangerous region, where they face opposition from all sides. Literally. Meanwhile, John is assigned to bring an angry Simon to Jesus.

Episodes

Our Shows

"
ERROR: unable to download video data: HTTP Error 414: Request URI Too Long
Traceback (most recent call last):
  File "yt_dlp\YoutubeDL.py", line 3236, in process_info
  File "yt_dlp\YoutubeDL.py", line 2959, in dl
  File "yt_dlp\downloader\common.py", line 444, in download
  File "yt_dlp\downloader\hls.py", line 59, in real_download
  File "yt_dlp\YoutubeDL.py", line 3730, in urlopen
  File "urllib\request.py", line 531, in open
  File "urllib\request.py", line 640, in http_response
  File "urllib\request.py", line 569, in error
  File "urllib\request.py", line 502, in _call_chain
  File "urllib\request.py", line 649, in http_error_default
urllib.error.HTTPError: HTTP Error 414: Request URI Too Long
```

dirkf commented 1 year ago

The site is now incorrectly sending the page URL as the contentUrl in the ld+json VideoObject that is used in the extraction, and the extractor didn't verify that it was a manifest URL before trying to use it. The manifest URL is now in the NextJS hydration data. The patch above has been updated to deal with it.
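
For anyone trying to follow the idea, here is a rough sketch of that check in plain Python. It is not the patch referred to above and it does not use yt-dlp's internal helpers. It assumes the hydration data sits in a script tag with id `__NEXT_DATA__` (the usual Next.js convention, not confirmed for this site) and that the manifest URL path ends in `.m3u8`. The function names are made up for illustration.

```python
import json
import re
from urllib.parse import urlparse


def looks_like_manifest_url(url):
    """Return True only for an absolute http(s) URL whose path ends in .m3u8."""
    # Reject non-strings and anything containing whitespace or HTML debris.
    if not isinstance(url, str) or re.search(r'[\s<>"]', url):
        return False
    parsed = urlparse(url)
    return parsed.scheme in ('http', 'https') and parsed.path.endswith('.m3u8')


def iter_strings(node):
    """Yield every string value nested anywhere inside decoded JSON."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from iter_strings(value)
    elif isinstance(node, list):
        for value in node:
            yield from iter_strings(value)


def find_manifest_url(webpage):
    """Prefer a valid ld+json contentUrl, else search the hydration data."""
    for match in re.finditer(
            r'(?s)<script[^>]+type="application/ld\+json"[^>]*>(.*?)</script>',
            webpage):
        try:
            data = json.loads(match.group(1))
        except json.JSONDecodeError:
            continue
        if isinstance(data, dict) and looks_like_manifest_url(data.get('contentUrl')):
            return data['contentUrl']

    # Assumption: Next.js hydration data lives in <script id="__NEXT_DATA__">.
    hydration = re.search(
        r'(?s)<script[^>]+id="__NEXT_DATA__"[^>]*>(.*?)</script>', webpage)
    if not hydration:
        return None
    try:
        blob = json.loads(hydration.group(1))
    except json.JSONDecodeError:
        return None
    # The layout of the blob is unknown here, so take the first matching string.
    return next(filter(looks_like_manifest_url, iter_strings(blob)), None)
```

The point of the validation step is that a page URL (or a page URL with HTML appended, as in the logs in this thread) is rejected before it ever reaches the HLS downloader.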

jakehathaway commented 1 year ago

Sorry for being a beginner. Where can I get the patch so I can try to build my own dist? Also, is there a timeline for the next release, and will it include that patch? Thanks!

BenMcLean commented 1 year ago

The patch above is updated to deal with it.

This sentence suggests that perhaps you forgot to include a link to the patch?

pukkandan commented 1 year ago

https://github.com/yt-dlp/yt-dlp/issues/5478#issuecomment-1325093057

zchrykng commented 1 year ago

@dirkf is there a reason that this patch is only here on the issue rather than in a pull request? At least, I wasn't able to find one. More than happy to look into putting the PR together, but didn't want to step on any toes.

pukkandan commented 1 year ago

The patch doesn't fully fix the issue. Or am I mistaken?

zchrykng commented 1 year ago

It at least appears to fix the issue on my end, but I haven't tested it extensively.

paul1149 commented 1 year ago

Using 2023.01.06 I get no discernible download candidates. At the sister site, thechosen.tv, I get virtually nothing at all.

jakehathaway commented 1 year ago

#5478 (comment)

I am sure I am just an idiot. I can see the diff in the thread, but I see nowhere to download the updated file. Here is what I did instead: I used ffmpeg to download the files without the wrapper of yt-dlp. I am not a real developer and I was getting errors trying to set up a build environment anyway. ffmpeg isn't as straightforward as yt-dlp, but I was able to get through it. I will wait for this patch to get into a release and test when it is updated. Thanks for all the help and for working on such a great product!

zchrykng commented 1 year ago

#5478 (comment)

I am sure I am just an idiot.

No, it just isn't obvious.

You need to save the patch as a file and then apply it to the angel.py file in your site-packages where yt-dlp is installed. You can do that with patch angel.py ~/yt-dlp.patch after you have found the file, if you have a *nix style system.
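
If finding the file is the hard part, one way is to ask Python where the module lives. This is only a sketch: it assumes yt-dlp was installed as a Python package (not the standalone executable) and that the extractor is importable as `yt_dlp.extractor.angel`, which is inferred from the angel.py file name rather than confirmed in this thread. `find_module_file` is a made-up helper name.

```python
import importlib.util


def find_module_file(module_name):
    """Return the file path of an importable module, or None if it is not found."""
    try:
        # find_spec imports parent packages, so a missing package raises here.
        spec = importlib.util.find_spec(module_name)
    except (ImportError, ValueError):
        return None
    return spec.origin if spec else None


# Usage (assumed module name, see above):
#   print(find_module_file('yt_dlp.extractor.angel'))
```

The printed path is the file to pass to patch. Run it with the same Python interpreter that yt-dlp uses, otherwise it may report a different installation or nothing at all.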

jschwalbe commented 1 year ago

@AxiosDeminence Hi, you seem to know what you're doing. Can you look into the HTTP Error 414: Request URI Too Long error? There is a patch above which apparently works, but it would be awesome if it could get into the actual binary. Thoughts? Thanks!

twiclo commented 1 year ago

I'm now getting a 400

[twiclo@toph The Chosen (2017)]$ yt-dlp https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear
[Angel] Extracting URL: https://www.angel.com/watch/the-chosen/episode/76331538-ff46-4ef9-b96b-8e20c30ac00e/season-3/episode-7/ears-to-hear
[Angel] 76331538-ff46-4ef9-b96b-8e20c30ac00e: Downloading webpage
[Angel] 76331538-ff46-4ef9-b96b-8e20c30ac00e: Downloading HD m3u8 information
[info] 76331538-ff46-4ef9-b96b-8e20c30ac00e: Downloading 1 format(s): 5
[hlsnative] Downloading m3u8 manifest
ERROR: unable to download video data: HTTP Error 400: Bad Request
jaded33 commented 1 year ago

I am getting both 400 and 414 errors on different videos on the Angel Studios site, namely "Testament" and "Wingfeather Saga".

lologhi commented 1 year ago

Hi! Same issue here. Here are the details:

$ yt-dlp -vU https://www.angel.com/watch/the-chosen/episode/c853c701-abdb-4075-8fe7-1e29f6f9c6aa/season-1/episode-3/jesus-loves-the-little-children
[debug] Command-line config: ['-vU', 'https://www.angel.com/watch/the-chosen/episode/c853c701-abdb-4075-8fe7-1e29f6f9c6aa/season-1/episode-3/jesus-loves-the-little-children']
[debug] Encodings: locale UTF-8, fs utf-8, pref UTF-8, out utf-8, error utf-8, screen utf-8
[debug] yt-dlp version stable@2023.03.04 [392389b7d] (pip)
[debug] Python 3.11.2 (CPython arm64 64bit) - macOS-13.2.1-arm64-arm-64bit (OpenSSL 1.1.1t  7 Feb 2023)
[debug] exe versions: ffmpeg 5.1.2 (setts), ffprobe 5.1.2
[debug] Optional libraries: Cryptodome-3.17, brotli-1.0.9, certifi-2022.12.07, mutagen-1.46.0, sqlite3-2.6.0, websockets-10.4
[debug] Proxy map: {}
[debug] Loaded 1786 extractors
[debug] Fetching release info: https://api.github.com/repos/yt-dlp/yt-dlp/releases/latest
Available version: stable@2023.03.04, Current version: stable@2023.03.04
yt-dlp is up to date (stable@2023.03.04)
[Angel] Extracting URL: https://www.angel.com/watch/the-chosen/episode/c853c701-abdb-4075-8fe7-1e29f6f9c6aa/season-1/episode-3/jesus-loves-the-little-children
[Angel] c853c701-abdb-4075-8fe7-1e29f6f9c6aa: Downloading webpage
[Angel] c853c701-abdb-4075-8fe7-1e29f6f9c6aa: Downloading HD m3u8 information
[debug] Formats sorted by: hasvid, ie_pref, lang, quality, res, fps, hdr:12(7), vcodec:vp9.2(10), channels, acodec, filesize, fs_approx, tbr, vbr, abr, asr, proto, vext, aext, hasaud, source, id
[debug] Default format spec: bestvideo*+bestaudio/best
[info] c853c701-abdb-4075-8fe7-1e29f6f9c6aa: Downloading 1 format(s): 5
[debug] Invoking hlsnative downloader on "https://www.angel.com/watch/the-chosen/episode/c853c701-abdb-4075-8fe7-1e29f6f9c6aa/season-1/episode-3/});</script><link rel="preload" ## HERE COMES NEARLY ALL THE WEB PAGE ## </script></body></html>"

I've removed a huge chunk (95KB) of page HTML from the "Invoking hlsnative downloader on" line. It then fails:

[hlsnative] Downloading m3u8 manifest
ERROR: unable to download video data: HTTP Error 400: Bad Request
Traceback (most recent call last):
  File "/opt/homebrew/Cellar/yt-dlp/2023.3.4/libexec/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 3247, in process_info
    success, real_download = self.dl(temp_filename, info_dict)
                             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.3.4/libexec/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 2970, in dl
    return fd.download(name, new_info, subtitle)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.3.4/libexec/lib/python3.11/site-packages/yt_dlp/downloader/common.py", line 444, in download
    ret = self.real_download(filename, info_dict)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.3.4/libexec/lib/python3.11/site-packages/yt_dlp/downloader/hls.py", line 66, in real_download
    urlh = self.ydl.urlopen(self._prepare_url(info_dict, man_url))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/yt-dlp/2023.3.4/libexec/lib/python3.11/site-packages/yt_dlp/YoutubeDL.py", line 3742, in urlopen
    return self._opener.open(req, timeout=self._socket_timeout)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 525, in open
    response = meth(req, response)
               ^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 634, in http_response
    response = self.parent.error(
               ^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 563, in error
    return self._call_chain(*args)
           ^^^^^^^^^^^^^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 496, in _call_chain
    result = func(*args)
             ^^^^^^^^^^^
  File "/opt/homebrew/Cellar/python@3.11/3.11.2_1/Frameworks/Python.framework/Versions/3.11/lib/python3.11/urllib/request.py", line 643, in http_error_default
    raise HTTPError(req.full_url, code, msg, hdrs, fp)
urllib.error.HTTPError: HTTP Error 400: Bad Request

Thanks for your help!


Edit: I've tried to get a media URL in the source page and then did a:

yt-dlp https://media.angelstudios.com/copied-content/0c5f8cdd-bca9-46a6-8c5e-113173f74acc/playlists/9860ff3c-df86-4028-8be2-260639b49595.m3u8

And it worked well. Though it might be quite useless…
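
In case it helps anyone repeat this workaround, here is a small sketch of how the media URL could be pulled out of a saved copy of the page source instead of hunting for it by hand. It simply scans for anything that looks like an http(s) URL ending in `.m3u8`. It assumes the URL appears literally in the page (possibly with JSON-escaped slashes) and carries no query string. `find_m3u8_urls` is a name invented for this example.

```python
import re

# Matches an http(s) URL up to and including ".m3u8"; a query string, if any,
# would be cut off.
M3U8_URL_RE = re.compile(r'https?://[^\s"\'<>\\]+\.m3u8')


def find_m3u8_urls(page_source):
    """Return the unique .m3u8 URLs in the text, in order of first appearance."""
    # JSON embedded in HTML may escape slashes as "\/"; undo that first.
    cleaned = page_source.replace('\\/', '/')
    # dict.fromkeys removes duplicates while keeping the original order.
    return list(dict.fromkeys(M3U8_URL_RE.findall(cleaned)))
```

Any URL it returns can then be passed straight to yt-dlp, as in the command above.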

BenMcLean commented 1 year ago

Any word on this?

BenMcLean commented 1 year ago

Does this need to get a new issue?