ytdl-org / youtube-dl

Command-line program to download videos from YouTube.com and other video sites
http://ytdl-org.github.io/youtube-dl/
The Unlicense

Format selection should not prefer unreliable 'tbr' metadata on YouTube #14143

Open glenn-slayden opened 7 years ago

glenn-slayden commented 7 years ago

The "tbr" value in the YouTube JSON information, also reported in the YouTubeDL -F option, contains the boilerplate values 1155 (for format 135) or 2200 and 2310 (for format 136) when the bitrate is actually unknown. These special sentinal values are thus unrelated to the actual bitrate of the video and should not be interpreted as such, especially when comparing to other formats with valid bitrate values.

This pattern can be seen by scrolling through the following files, which aggregate f135/f136 reports for 60,134 YouTube items. The lines are sorted by reported bitrate; note the very large gaps in the rightmost "filesize" column, which correspond to the singular suspicious values mentioned above. The number of files in these gap areas is far too large for the value to be exactly "1155", "2200", or "2310" by coincidence.

http://www.blobule.com/webshare/ytf-135.txt http://www.blobule.com/webshare/ytf-136.txt

Furthermore, many of the items advertising the aforementioned values have been spot-checked after download, and their bitrates were found to be unrelated to the values shown.

YouTubeDL currently does not detect and ignore these special case values. This often causes the -f bestvideo selection, which prioritizes higher bitrate, to erroneously choose the wrong format.

Please adjust the bestvideo format selection heuristic so that it ignores the bitrate when these special numeric values are seen, and defers instead to a secondary mechanism, such as pixel dimension, for the bestvideo determination.
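To make the idea concrete, a rough sketch of the kind of guard I have in mind might look like the following (the value set and function names are purely illustrative, not existing youtube-dl code):

    # Illustrative sketch only: treat the recurring boilerplate tbr values as
    # "bitrate unknown" rather than taking them at face value.
    SUSPECT_TBR_VALUES = {1155, 2200, 2310}  # clusters observed in the f135/f136 data

    def credible_tbr(fmt):
        """Return the format's tbr only if it does not look like a sentinel value."""
        tbr = fmt.get('tbr')
        if tbr is None or round(tbr) in SUSPECT_TBR_VALUES:
            return None  # unknown/unreliable; defer to pixel dimensions instead
        return tbr

    def bestvideo_key(fmt):
        # Pixel dimensions first; a credible tbr only breaks ties.
        return ((fmt.get('width') or 0) * (fmt.get('height') or 0),
                credible_tbr(fmt) or 0)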

At a minimum, bestvideo should corroborate that its final selection is credible vis-a-vis the other candidates. For example, consider YouTube video Ot5jszj_ny8, for which youtube-dl reports the following formats:

format code  extension  resolution note
139          m4a        audio only DASH audio   49k , m4a_dash container, mp4a.40.5@ 48k (22050Hz)
140          m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2@128k (44100Hz)
160          mp4        256x144    DASH video  108k , avc1.4d400b, 30fps, video only
134          mp4        640x360    DASH video  117k , avc1.4d401e, 30fps, video only
133          mp4        426x240    DASH video  242k , avc1.4d400c, 30fps, video only
136          mp4        1280x720   DASH video  365k , avc1.4d401f, 30fps, video only
135          mp4        854x480    DASH video 1155k , avc1.4d4014, 30fps, video only
17           3gp        176x144    small , mp4v.20.3, mp4a.40.2@ 24k
36           3gp        320x180    small , mp4v.20.3, mp4a.40.2
43           webm       640x360    medium , vp8.0, vorbis@128k
18           mp4        640x360    medium , avc1.42001E, mp4a.40.2@ 96k
22           mp4        1280x720   hd720 , avc1.64001F, mp4a.40.2@192k (best)

Here -f bestvideo will download format 135, which, at 854×480, has an actual video bitrate of only 145 kb/s (according to MediaInfo), instead of format 136, which is indeed 1280×720 with a considerably better actual rate of 169 kb/s. Even allowing for different compression levels, it is simply not credible that a format with 55% fewer pixels could have a reported bitrate 3x higher for the exact same source content.
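For the record, here is the arithmetic behind those percentages (Python, just as a sanity check on the numbers above):

    pixels_480p = 854 * 480               # 409,920 pixels
    pixels_720p = 1280 * 720              # 921,600 pixels
    print(1 - pixels_480p / pixels_720p)  # ~0.56 -> format 135 has ~55% fewer pixels than 136
    print(1155 / 365)                     # ~3.2  -> yet its *reported* tbr is over 3x higher
    print(145 / 169)                      # ~0.86 -> the measured bitrates go the other way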

Another case, YouTube MUMlwUe-BCo was reported in #14010:

136          mp4        1280x720   DASH video 1114k , avc1.4d401f, 30fps, video only
135          mp4        854x480    DASH video 1155k , avc1.4d4014, 30fps, video only

Again here, format 136 was dis-preferred by -f bestvideo, and the true bitrates found after downloading again bore no relation to the reported values.

glenn-slayden commented 7 years ago

Though perhaps I should have written it all here, I have outlined further thoughts on this in a comment to #6018.

However, the point I make there is most forcefully relevant to this issue, so I'm providing a direct link to that comment here.

ddawson commented 7 years ago

Responding to what you said in #14010:

Being stuck with these bugs, is there any way to detect these situations where the metadata makes blatantly implausible claims, such as, in this case, a format with 55% fewer pixels supposedly having a bitrate 3x higher?

Well, what about using the resolution instead of the bitrate? Is it ever the case that a larger stream has a lower quality? Within the same format, at least? Doesn't seem likely to me.

Again, is it possible that, for formats 135 and 136 (others?), any JSON tbr values that start with "11xx" should be ignored

Not sure that's a good idea. There are actual streams with such high bitrates, and much higher.

Actually, I went looking for high-bitrate videos. I found IZJ48x7LsHg, which has a stream reported at "25 Mbit" that is actually about 12.5 Mbit (at 3840x2160 resolution), so yeah. Also, the rate reported for format 135 in this case is 1146k (actual avg. rate 552k). A bit different from the supposedly boilerplate value. I think YouTube's encoder is simply giving bad values, which could be for a variety of reasons. You'd think an encoder could give an accurate value for the video it just encoded, but apparently not. Or the value is pulled from something else.

fireattack commented 7 years ago

Agreed (as I mentioned before: https://github.com/rg3/youtube-dl/issues/6018#issuecomment-312967136) that, at least for YouTube, we should use resolution as the first indicator or metric of quality. AFAIK, YouTube never upscales video, so there is no concern that the higher-resolution formats could be less pristine; the resolution information from the API is also very accurate and more reliable (compared to bitrate or filesize).

But from what I have gathered over the years (correct me if I'm wrong), the developers acknowledge this problem and are not against the change; they just lack the manpower. With that said, I think we should have at least one open issue (like this one) for this very problem, instead of continually closing relevant issues from uninformed people or hijacking other issues (like #6018 :D).

glenn-slayden commented 7 years ago

@ddawson Yes, I'm aware that my issue above doesn't pin down a definitive solution. First of all, as I explained here (including some 5th-grade math), bitrate should be summarily disqualified from the -f bestvideo determination at the outset.

But besides that, referring to those tables I linked (135, 136), I had started to suspect that any "bitrate" value whose filesize (the rightmost column) is blank must be considered unreliable. Taken together with the point above, this means bitrate should always be ignored altogether; the blank filesize is simply the signal that raises the alarm bells.

So from there, two options start to emerge:

always ignore bitrate, and use the reported filesize (from the rightmost column) instead

--- or ---

always ignore bitrate, and rank the candidates by width × height × fps.

I favor the latter, and it's actually quite hard to debunk. Note that this formula takes advantage of the like-duration property of the -f bestvideo task to just multiply in the fps (as opposed to fps × duration, since duration cancels across all of the comparison candidates). For comparison amongst alternative versions of the same source material, this formula nails down the notion of "best" pretty irrefutably, leaving only issues of codec efficiency/loss, which are likely to be swamped by the contribution of these three factors.

I challenge anyone to present a -F output for YouTube where the maximal width × height × fps entry does not correlate with the notion of "best." Failing the appearance of such, I suggest that the onus is on any defenders of the current "bitrate" behavior to refute the points above.
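In pseudo-Python, the comparison I'm proposing amounts to something like this (a sketch only; none of these names exist in youtube-dl):

    # Rank video-only candidates of the same title by raw pixel throughput.
    # Duration cancels across candidates, so width * height * fps is sufficient.
    def pixel_rate_key(fmt):
        return (fmt.get('width') or 0) * (fmt.get('height') or 0) * (fmt.get('fps') or 0)

    best = max(video_formats, key=pixel_rate_key)  # video_formats: the -F candidates as dicts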

And do note that this proposal (and the points in the footnote as well) is proposed only for the YouTube platform, as a workaround to address serious and known metadata discrepancies (as discussed) that are specific to that site.


n.b. A prior argument has suggested that the current behavior won't be changed because people will never "reach a consensus... [on] quality." To this, I respectfully point out that the luxury of consensus only applies when the underlying choices are reliable. In this case, the YouTube bug of failing to produce reliable bitrate information--nay, worse, covertly supplying falsified values--has a tangible, calamitous effect on the -f bestvideo result that, as a demonstrable error, falls clearly outside the regime of fair and legitimate opinion or "consensus."

Nobody who uses bestvideo ever wants to receive what is effectively a random file, yet that is part of the current behavior. Bitrates are never intended to be incorrect, and delivering an unintended result is, by definition, a bug. So using the "variety of people's opinions" as an excuse for a mundane technological error with a deterministic cause doesn't seem quite consistent with this app's highest principles of excellence.

glenn-slayden commented 7 years ago

In case anyone wants a temporary workaround for the YouTube-specific bug under discussion here, where specifying -f bestvideo downloads a video format which is obviously not the best available quality, I found that nulling out the tbr value from the format dictionary solves the problem for me:

youtube_dl\extractor\common.py (1074)

for f in formats:
     # Automatically determine tbr when missing based on abr and vbr (improves
     # formats sorting in some cases)

-    if 'tbr' not in f and f.get('abr') is not None and f.get('vbr') is not None:
-        f['tbr'] = f['abr'] + f['vbr']

+    #if 'tbr' not in f and f.get('abr') is not None and f.get('vbr') is not None:
+    #    f['tbr'] = f['abr'] + f['vbr']
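+    # Hack: discard YouTube's unreliable tbr entirely so that format sorting
+    # falls back to the remaining signals (resolution, fps, etc.)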
+    f['tbr'] = None

Before: (current behavior) Errors include ranking 133 above 134 (it's not better) [edit: actually, at 55 vs. 59 kb/s they're much closer than the tbr indicates, and the case is also complicated by variable vs. constant frame rate; technically it's not an error, since the 133 file is 3.7% larger; anyway, the rest still stands...] and ranking 135 above 136 (no chance, and in fact false; details in my initial posting above).

[info] Available formats for Ot5jszj_ny8:
format code  extension  resolution note
139          m4a        audio only DASH audio   49k , m4a_dash container, mp4a.40.5@ 48k (22050Hz)
140          m4a        audio only DASH audio  128k , m4a_dash container, mp4a.40.2@128k (44100Hz)
160          mp4        256x144    DASH video  108k , avc1.4d400b, 30fps, video only
134          mp4        640x360    DASH video  117k , avc1.4d401e, 30fps, video only
133          mp4        426x240    DASH video  242k , avc1.4d400c, 30fps, video only
136          mp4        1280x720   DASH video  365k , avc1.4d401f, 30fps, video only
135          mp4        854x480    DASH video 1155k , avc1.4d4014, 30fps, video only
17           3gp        176x144    small , mp4v.20.3, mp4a.40.2@ 24k
36           3gp        320x180    small , mp4v.20.3, mp4a.40.2
43           webm       640x360    medium , vp8.0, vorbis@128k
18           mp4        640x360    medium , avc1.42001E, mp4a.40.2@ 96k
22           mp4        1280x720   hd720 , avc1.64001F, mp4a.40.2@192k (best)

After: (with the hack shown above)

[info] Available formats for Ot5jszj_ny8:
format code  extension  resolution note
139          m4a        audio only DASH audio , m4a_dash container, mp4a.40.5@ 48k (22050Hz)
140          m4a        audio only DASH audio , m4a_dash container, mp4a.40.2@128k (44100Hz)
160          mp4        256x144    DASH video , avc1.4d400b, 30fps, video only
133          mp4        426x240    DASH video , avc1.4d400c, 30fps, video only
134          mp4        640x360    DASH video , avc1.4d401e, 30fps, video only
135          mp4        854x480    DASH video , avc1.4d4014, 30fps, video only
136          mp4        1280x720   DASH video , avc1.4d401f, 30fps, video only
17           3gp        176x144    small , mp4v.20.3, mp4a.40.2@ 24k
36           3gp        320x180    small , mp4v.20.3, mp4a.40.2
43           webm       640x360    medium , vp8.0, vorbis@128k
18           mp4        640x360    medium , avc1.42001E, mp4a.40.2@ 96k
22           mp4        1280x720   hd720 , avc1.64001F, mp4a.40.2@192k (best)

Notice that now, not only is the worst-to-best ordering more in line with expectations, but those grossly unreliable bitrate values are also gone from the table listing entirely.


[edit:] If the 133-over-134 preference for the smaller frame size is correct, then we would have to count an error against the "after" list. However, with the lower-quality formats we expect the distinctions to be closer, so their ranking is more sensitive to the influence of fixed-size headers and encoder anomalies. For this case, the choice is between 3.7% more content bits (f133) versus 25.3% more pixels (f134). I feel like I know which one I'd prefer.

But as I've tried to stress, these "close calls" in the lower-quality formats are not what the issue is about, and the way we know that is because we're discussing them with certain and reliable knowledge of the tradeoffs. Taking actual stream size as arbiter, the fact that the current code got the 133/134 order "right" (the former is 6.0% bigger) should, in truth, be considered dumb luck, since that decision was solely based on data that claimed it would be 107% bigger. In billiards, pocketing a shot doesn't count if you didn't call it.

So sure, those truly ambiguous cases are the ones that would then start to get into, as I stated earlier, "the regime of fair and legitimate opinion." But the focus instead here for this issue should very clearly remain on the bug which robs us of informed choice, and what we can actually do to mitigate or remediate its effects.

It may turn out that it's actually not possible to derive any 100%-reliable predictor of anything from the vagaries of YouTube metadata. If that's the case, then YouTubeDL's format selection policy, in order to provide maximum added value to its users, should consider adopting a deliberately hostile posture towards that provider's metadata. It would be important to acknowledge this stance explicitly, because the design-decision calculus then turns out differently.

For example, if one admits that neither the (133/134) nor the (135/136) ranking intention will ever be corroborated with 100% reliability after download, it becomes easier to justify the informed sacrifice of the more marginal case in exchange for at least some degree of certainty in the case with the more visible or serious consequences. This largely matches the drastic--but not unreasonable--philosophy of fully ignoring all YouTube tbr values at all times. If doing so means YouTubeDL must fall back to an alternate method that is also known to be unreliable sometimes--but whose unreliability is constrained in certain known ways, or provably less harmful--then this is still a huge improvement over the current YouTubeDL tack of raising no challenge or impediment to even the most blatant contradictions or implausible claims in the metadata, such as those that result in the rejection of 1280×720 at 169 kb/s in favor of 854×480 at 145 kb/s (Ot5jszj_ny8).

Most importantly, by subjugating the spurious external corruption under a conservatively re-imagined, and thus reliably predictable--albeit now necessarily probabilistic--format selection ethos, YouTubeDL can mask the long-standing problem behavior of the provider, earning for itself the added value of restoring bug-free control and predictability for its own users.

Hrxn commented 7 years ago

Agreed.

width × height × fps

Yes, I concur. And true to the old adage "more pixels, more better".

But one issue remains here:

MP4/AVC vs. WEBM/VP9x

rautamiekka commented 7 years ago

^

MP4/AVC vs. WEBM/VP9x

WEBM tends to have worse quality than MP4, but it varies. Sometimes WEBM's quality at the same or even a higher resolution is much worse than MP4's, and the file size could be anything; other times it's not much worse, and the file size could still be anything. I've had many cases where WEBM was superior, in the sense that the quality wasn't lower enough to make a real difference but the file size was smaller; I've also seen the same situation where the file was much bigger.

glenn-slayden commented 7 years ago

I think YouTube's encoder is simply giving bad values, which could be for a variety of reasons. You'd think an encoder could give an accurate value for the video it just encoded, but apparently not.

I think it's more likely that YouTube computes the reported tbr value by using the data size field (which appears in the rightmost column) as a numerator. When the data size is not available for some reason, the tbr computation falls back to some kind of extremely crude approximation, possibly using values from a lookup table.

This would explain the clustering of false/invalid tbr values that I noted (i.e., "1155" or "2200"/"2310"). It would also suggest that whenever filesize is not reported, the 'tbr' value should be ignored, since in those cases we possess just as much information as YouTube does and can make our own approximations, thank you very much.
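In other words, the hypothesis (purely speculative, with invented names; this is not real YouTube or youtube-dl code) would be something like:

    # Speculative illustration of the suspected server-side behavior.
    FALLBACK_TBR_KBPS = {135: 1155, 136: 2200}  # crude per-format lookup table (guessed)

    def reported_tbr(itag, filesize_bytes, duration_s):
        if filesize_bytes is not None and duration_s:
            # the honest computation: total bits divided by duration
            return filesize_bytes * 8 / duration_s / 1000.0
        # filesize unknown -> fall back to a canned per-format value, which is
        # where the suspicious 1155/2200/2310 clusters would come from
        return FALLBACK_TBR_KBPS.get(itag)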

ddawson commented 7 years ago

When the data size is not available for some reason, the tbr computation falls back to some kind of extremely crude approximation, possibly using values from a lookup table.

You might be right about that. But how do you explain this one (from aforementioned IZJ48x7LsHg):

135          mp4        854x480    DASH video 1146k , avc1.4d401e, 25fps, video only, 19.26MiB

It has a stated (and accurate) filesize, but the given tbr is still off by over 2x (it's actually ~552k). Ridiculous.

whenever filesize is not reported, the 'tbr' value should be ignored

That's okay, I suppose. Or rather, per my example, maybe it should just be completely ignored?

glenn-slayden commented 7 years ago

maybe [YouTube-reported 'tbr' values] should just be completely ignored?

I agree 100%.

lilydjwg commented 4 years ago

As you can see in #25925, even when the actual tbr is higher, it doesn't mean the video quality is better: the codec may be more efficient.

For the video https://www.youtube.com/watch?v=iygjJ8M7jnM, format 22 (tbr reported as 362k, measured at 367 kb/s by ffprobe) is better than format 18 (tbr reported as 373k, measured at 377 kb/s by ffprobe). The former is much clearer (just view it fullscreen!).

I suggest that sorting prefer dimensions over tbr/filesize, because video sites provide better-quality videos at higher (or equal) dimensions.

glenn-slayden commented 4 years ago

@lilydjwg wrote:
I suggest that sorting prefer dimensions over tbr/filesize, because video sites provide better-quality videos at higher (or equal) dimensions.

Unfortunately, this might not be a solution either. Videos with larger pixel dimensions can still have much lower visual quality, typically showing huge compression-artifact blocks... (note that I'm just stating this as a general fact; I don't have an actual case from YT).

fireattack commented 4 years ago

Videos with larger pixel dimensions by definition have higher resolution, I have no idea what you're saying.

Can these higher-resolution versions be upscaled? Of course. But they're still "higher resolution" (and I don't think YT does up-scaling).

glenn-slayden commented 4 years ago

@fireattack Thanks, I clarified my post.

And a follow-up question: Is it possible on YT for the uploading user to explicitly supply multiple source video files (corresponding to different YT formats), or does YT always generate those various available formats itself?

If the former, then the uploading user might have provided up-scaled videos. But if it's the latter case, and that automatic process never up-scales, then it's true my original comment would not apply.

ddawson commented 4 years ago

And a follow-up question: Is it possible on YT for the uploading user to explicitly supply multiple source video files (corresponding to different YT formats), or does YT always generate those various available formats itself?

Uploading multiple sources is not something YouTube provides for. If it were, people could upload completely different videos for different resolutions, which would be a mess for a variety of reasons. Even setting that aside, YouTube's policy is to transcode everything you upload, so that it controls the bitrate viewers get (otherwise it might be too high to be watchable on some connections) and can be sure it's delivering satisfactory streams even if there's something wrong with the uploaded video, etc.

glenn-slayden commented 4 years ago

@ddawson thanks for that useful info.

danny-wu commented 3 years ago

Would it be possible to officially build in support for ignoring tbr, perhaps as a format option, or even just a site-specific CLI option like --youtube-ignore-tbr? There is precedent for this with the --youtube-skip-dash-manifest option.

I would like to ignore tbr for my youtube-dl archival systems; however, I do not want to patch youtube-dl and have to keep updating the changes.
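For anyone embedding youtube-dl from Python in the meantime, one stop-gap that avoids patching the source is to subclass YoutubeDL and blank out the YouTube tbr values before format selection runs. This is only a rough sketch: it leans on the internal process_video_result() hook, which is not a public API and may change between releases.

    import youtube_dl

    class IgnoreYoutubeTbrYDL(youtube_dl.YoutubeDL):
        def process_video_result(self, info_dict, download=True):
            # Drop the suspect bitrate metadata for YouTube results so that
            # 'bestvideo' sorting falls back to resolution/fps.
            if info_dict.get('extractor_key') == 'Youtube':
                for f in info_dict.get('formats') or []:
                    f['tbr'] = None
            return super(IgnoreYoutubeTbrYDL, self).process_video_result(
                info_dict, download=download)

    with IgnoreYoutubeTbrYDL({'format': 'bestvideo+bestaudio/best'}) as ydl:
        ydl.download(['https://www.youtube.com/watch?v=Ot5jszj_ny8'])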