ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense
131.17k stars 9.93k forks source link

Fatal error messages should remind me about 429 errors. #32706

Open mk-pmb opened 7 months ago

mk-pmb commented 7 months ago

Checklist

Description

I put too much effort into trying to report what seemed to be a bug, only to then discover in the log that there was a "HTTP Error 429: Too Many Requests" earlier. Can we make it so that any non-fatal 429 sets a kind of "tainted" flag and have all subsequent fatal errors start their message with something like "This might be caused by an earlier error: HTTP Error 429: Too Many Requests –"? That way I'd instantly see I can ignore the "please report this bug" part.

dirkf commented 7 months ago

This seems like a case where the non-fatal error shouldn't have been, but you'd have to provide more details, as in the bug report template(s).

mk-pmb commented 7 months ago

Not a bug because the problem is that my proxy's IP is temporarily blacklisted due to too other users' behaviors. It works via another proxy.

Verbose log ```text [debug] System config: [] [debug] User config: [] [debug] Custom config: [] [debug] Command-line args: ['--no-call-home', '--abort-on-error', '--no-overwrites', '--keep-video', '--fixup=warn', '--prefer-ffmpeg', '--verbose', '--', 'http://youtu.be/ZQeSeYNEwHM'] [debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8 [debug] youtube-dl version 2021.12.17 [debug] Git HEAD: dc512e3a8 [debug] Python 3.8.10 (CPython x86_64 64bit) - #### [debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7 [debug] Proxy map: {####} [youtube] ZQeSeYNEwHM: Downloading webpage WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests [youtube] ZQeSeYNEwHM: Downloading API JSON ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output. Traceback (most recent call last): File "/####/youtube_dl/YoutubeDL.py", line 859, in wrapper return func(self, *args, **kwargs) File "/####/youtube_dl/YoutubeDL.py", line 955, in __extract_info ie_result = ie.extract(url) File "/####/youtube_dl/extractor/common.py", line 565, in extract ie_result = self._real_extract(url) File "/####/youtube_dl/extractor/youtube.py", line 2108, in _real_extract self._unthrottle_format_urls(video_id, player_url, dct) File "/####/youtube_dl/extractor/youtube.py", line 1725, in _unthrottle_format_urls n_response = decrypt_nsig(n_param)(n_param, video_id, player_url) File "/####/youtube_dl/extractor/youtube.py", line 1614, in inner raise ret File "/####/youtube_dl/extractor/youtube.py", line 1606, in inner self._player_cache[cache_id] = func(*args, **kwargs) File "/####/youtube_dl/extractor/youtube.py", line 1634, in _decrypt_nsig raise ExtractorError('Cannot decrypt nsig without player_url') youtube_dl.utils.ExtractorError: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output. E: error type: 'DownloadError' msg: 'ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.' (2, 'DownloadError', ('ERROR: Cannot decrypt nsig without player_url; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output.',)) ```

The Unable to download webpage message exists in

youtube_dl/extractor/common.py:687:                errnote = 'Unable to download webpage'
youtube_dl/extractor/itv.py:51:            errnote = 'Unable to download webpage'

so probably common.py should be the place to set the taint flag (probably in self._downloader.report_warning), and the other should be refactored to use a common(.py) annotation function.

dirkf commented 7 months ago

The ITV extractor probably fails anyway, and the default errnote didn't need to be re-specified.

You would have to maintain a list of (host, port) that had experienced 429. A good place to set that up might be in _request_webpage(). But I reiterate the original point. Why is this error being treated as non-fatal, if the result of not receiving a valid response is a later error? Let's see the log.

mk-pmb commented 7 months ago

Let's see the log.

Have you seen the verbose log in the details tag? I wish GitHub would make their "click to open" arrow a bit more visible. Or what more log would you need?

dirkf commented 7 months ago

AHA, sorry. In this case, the YT webpage failed with 429 but yt-dl was able to fetch the JSON which is supposed to be enough to get the video, though not to resolve the n-sig, or get the full metadata.

Which actual code are you running there? The logic was updated in the most recent nightly. One possibility, without checking, is that the failure during n-sig was introduced, or not removed; another is that it more properly proceeds with a warning.

mk-pmb commented 7 months ago

[debug] youtube-dl version 2021.12.17 [debug] Git HEAD: dc512e3a8

iirc I had pulled before reproducing for the log. If it happens again, what version information should I collect?

dirkf commented 7 months ago

Yes, so I think this patch should be applied to treat the first failure in the routine in the same way as the later cases:

--- old/youtube-dl/youtube_dl/extractor/youtube.py
+++ new/youtube-dl/youtube_dl/extractor/youtube.py
@@ -1631,7 +1631,9 @@
     def _decrypt_nsig(self, n, video_id, player_url):
         """Turn the encrypted n field into a working signature"""
         if player_url is None:
-            raise ExtractorError('Cannot decrypt nsig without player_url')
+            self.report_warning(
+                'Cannot decrypt nsig without player_url', video_id)
+            return

         try:
             jsi, player_id, func_code = self._extract_n_function_code(video_id, player_url)
@@ -1646,11 +1648,9 @@
             ret = extract_nsig(jsi, func_code)(n)
         except JSInterpreter.Exception as e:
             self.report_warning(
-                '%s (%s %s)' % (
-                    self.__ie_msg(
-                        'Unable to decode n-parameter: download likely to be throttled'),
-                    error_to_compat_str(e),
-                    traceback.format_exc()))
+                'Unable to decode n-parameter: download likely to be throttled (%s %s)' % (
+                    error_to_compat_str(e), traceback.format_exc()),
+                video_id)
             return

         self.write_debug('Decrypted nsig {0} => {1}'.format(n, ret))

The second hunk is an unrelated clean-up!

mk-pmb commented 7 months ago

Thanks! I will try it. Edit 2024-02-03: Error 429 vanished when applying the patch. Probably coincidence. I'll have to wait until it reappears.

mk-pmb commented 7 months ago

The patch works nicely. It seems to replace the one specific fatal error above with downloading anyway:

[youtube] ZQeSeYNEwHM: Downloading webpage
WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests
[youtube] ZQeSeYNEwHM: Downloading API JSON
WARNING: [youtube] ZQeSeYNEwHM: Cannot decrypt nsig without player_url
WARNING: [youtube] ZQeSeYNEwHM: Cannot decrypt nsig without player_url
[youtube] ZQeSeYNEwHM: Downloading API JSON
[info] Writing video description metadata as JSON to: …
[debug] Invoking downloader on 'https://…'
[download] Destination: Big_Buck_Bunny_animation_student.mp4

Which is good and we should get this into master. Thanks!

However, now without that error, I can't tell whether it would achieve the actual feature requested above. Would it annotate any other fatal errors that may occurr after a 429 error?

dirkf commented 7 months ago

Such cases should be resolved in the same way, if they occur: (a) if the 429 will prevent extraction it should be an error; (b) if not it should be a warning, unless (c) exceptionally, the fallback tactics result in the same quality of extraction, where it might just be ignored or logged for debug. The problem here was poor logic such that although (b) should have applied the extraction failed anyway.

mk-pmb commented 7 months ago

I'm not entirely sure I understand it but I think you mean that even with a 429 error, fatal errors should still be reported as a bug, so the message saying so is actually correct. In that case, my feature request was misguided and you can just close it.

dirkf commented 7 months ago

Yes, that was my point. The issue could be left open until I push the change (actually there's a slicker version now).

mk-pmb commented 7 months ago

I have a new case, which looks a LOT like yet another dupe of 429 + ERROR: No video formats found and as such, not worthy of reporting. However they all seem to have WARNING: unable to extract player URL (e.g. #32635 → #32630) or ERROR: Unable to extract yt initial data on playlist (e.g. #32499) which I don't have, so I'm not 100% sure.

It's not about the patch either, as I was able to download the same video earlier.

Verbose log (click to open) ```text [debug] System config: [] [debug] User config: [] [debug] Custom config: [] [debug] Command-line args: ['--no-call-home', '--abort-on-error', '--no-overwrites', '--keep-video', '--fixup=warn', '--prefer-ffmpeg', '--format=22', '--verbose', '--', '71WCJh7WKrM'] [debug] Encodings: locale UTF-8, fs utf-8, out utf-8, pref UTF-8 [debug] youtube-dl version 2021.12.17 [debug] Git HEAD: dc512e3a8 [debug] Python 3.8.10 (CPython x86_64 64bit) - Linux-…-x86_64-with-glibc… - OpenSSL 1.1.1f 31 Mar 2020 - glibc … [debug] exe versions: ffmpeg 4.2.7, ffprobe 4.2.7 [debug] Proxy map: {…} [youtube] 71WCJh7WKrM: Downloading webpage WARNING: Unable to download webpage: HTTP Error 429: Too Many Requests [youtube] 71WCJh7WKrM: Downloading API JSON ERROR: No video formats found; please report this issue on https://yt-dl.org/bug . Make sure you are using the latest version; see https://yt-dl.org/update on how to update. Be sure to call youtube-dl with the --verbose flag and include its complete output. Traceback (most recent call last): File "/…/youtube_dl/YoutubeDL.py", line 859, in wrapper: return func(self, *args, **kwargs) File "/…/youtube_dl/YoutubeDL.py", line 955, in __extract_info: ie_result = ie.extract(url) File "/…/youtube_dl/extractor/common.py", line 565, in extract: ie_result = self._real_extract(url) File "/…/youtube_dl/extractor/youtube.py", line 2221, in _real_extract: self._sort_formats(formats) File "/…/youtube_dl/extractor/common.py", line 1494, in _sort_formats: raise ExtractorError('No video formats found') youtube_dl.utils.ExtractorError: No video formats found; please report this issue on […] ```
dirkf commented 7 months ago

Well, #23638 applies.

In this case the JSON contains streaming formats but they only contain encrypted media links ("signature cipher"). The extractor needs the player_url to resolve these; that URL in turn is only available from the webpage that could not be downloaded. The code recognises this and skips those formats. If it didn't you would also get warning messages like this:

WARNING: [youtube] 71WCJh7WKrM: Signature extraction failed: Some formats may be missing

Would that be better? We now have the only_once parameter for warnings so the output needn't be spammed as it might have been when the signature processing was implemented.

mk-pmb commented 7 months ago

Well, https://github.com/ytdl-org/youtube-dl/issues/23638 applies.

Yeah I read that, and am rather confident that the issue is redundantly not worth reporting, especially due to lack of reproducible console access. Which means we probably have a case where my original feature request is useful.

Would that be better?

Probably not. "Some formats may be missing" to me sounds like "YouTube doesn't like you(r proxy)", for which my default route of action would be to use the next proxy in my list. Thus, my use is actually twofold:

  1. Avoid the effort for manually re-running verbosely and reporting;
  2. Make it easier to guess from the final error whether I should try the next proxy. I already have a wrapper that communicates the final error message to my download manager, and have tried to use full log capture to also get previous 429 messages, but it's messy and has side effects. Having the indication in the fatal error would make it much easier.

Update 2024-05-15: I investigated how I might be able to monkey-patch a flag into it. It seems there are three separate report_warning functions, with the one in youtube_dl/YoutubeDL.py being the relevant one. (The one in extractors seems to delegate to the downloader, and the one in the downloader seems to delegate to the main file.) Was a good opportunity to refactor it.

Based on the message, the 429 error in my verbose log must be from _request_webpage in extractor/common.py (called without errnote, thus equivalent to errnote=None), which repackages errors caught from self._downloader.urlopen. The latter is a hot candidate for what to monkey-decorate. Another approach would be to configure my own logger and match based on strings.