crf-search using worst sample only

alexheretic commented 4 months ago

Add a mode where we use the worst sample's VMAF after sample-encode instead of an average across all samples.

If we assume that the sample with the worst VMAF score at a given crf would also be the worst sample at any other crf, we could speed up a crf-search by encoding all N samples only once, then trying only the worst sample at the other crf values.

Perhaps a --sample-aggregate=worst arg for sample-encode/crf-search/auto-encode.
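A minimal sketch of the proposed search, to make the saving concrete. It is not ab-av1's implementation: the function name, the `score(sample, crf)` callback (a stand-in for "encode this sample at this crf and measure VMAF") and the toy quality model are all hypothetical, and a plain linear scan over candidate crf values replaces the real search strategy.

```python
from typing import Callable, Dict, Optional, Sequence, Tuple


def search_crf_worst_sample(
    samples: Sequence[int],
    crfs: Sequence[int],
    score: Callable[[int, int], float],
    min_vmaf: float,
) -> Tuple[Optional[int], int, int]:
    """Return (highest passing crf, worst sample id, number of sample encodes).

    `score` is a hypothetical stand-in for encoding one sample and measuring VMAF.
    """
    encodes = 0

    # First crf: encode every sample once and remember which one scored worst.
    first_crf = crfs[0]
    first_scores: Dict[int, float] = {}
    for sample in samples:
        first_scores[sample] = score(sample, first_crf)
        encodes += 1
    worst = min(first_scores, key=lambda s: first_scores[s])

    # Remaining crfs: assume the same sample stays the worst, so encode only it.
    best_crf = first_crf if first_scores[worst] >= min_vmaf else None
    for crf in crfs[1:]:
        vmaf = score(worst, crf)
        encodes += 1
        if vmaf >= min_vmaf and (best_crf is None or crf > best_crf):
            best_crf = crf
    return best_crf, worst, encodes


# Toy quality model (an assumption for illustration only): VMAF falls linearly
# with crf, and each sample has a fixed "difficulty" penalty.
DIFFICULTY = {0: 1.0, 1: 4.0, 2: 2.0}


def toy_score(sample: int, crf: int) -> float:
    return 100.0 - 0.5 * crf - DIFFICULTY[sample]


result = search_crf_worst_sample([0, 1, 2], [20, 24, 28, 32], toy_score, min_vmaf=83.0)
print(result)
```

With three samples and four crf values this needs 6 sample encodes, where averaging all samples at every crf would need 12. The result is only as good as the assumption that the worst sample is the same at every crf.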

allrobot commented 3 months ago

Good suggestion. You could try opening a PR, or wait for the author to add this feature.

veryprogram commented 3 months ago

Even better would be an auto-encode mode where the CRF is recalculated on either a frame-by-frame or scene-by-scene basis (default would be scene-by-scene but with an option for frame-by-frame) so that no part of the video drops below VMAF 95, but also no part of the video is allocated more bitrate than absolutely necessary to achieve VMAF 95.

But I'm not sure whether the resulting video chunks could then be concatenated without a second transcode. My knowledge doesn't go that far. 🤔

alexheretic commented 3 months ago

What you describe sounds more like what Av1an does. So check out that project.

ab-av1 is aimed at being fast by sampling to find a suitable crf and then purely relying on the encoder to provide consistent quality. This issue is about an option to make the sampling "crf-search" even faster at the cost of a more pessimistic estimate of the full resultant VMAF.

WhitePeter commented 2 months ago

Just passing by while evaluating ab-av1, which is an interesting (accidental) find, BTW. Thanks a lot for the effort!

TL;DR

I would like to offer my two cents and suggest a different approach than the OP's. While it is correct not to bother checking any samples other than the worst from a previous iteration, I think one could use the opportunity to reshuffle the deck: keep that worst sample and (randomly) find replacements for the others. It is not unlikely that there are even worse ones!

Reasoning

Seeing how VMAF came about (10s clips), I think one should take the threshold as the lower bound for anywhere in the video. This means that averaging over multiple samples should not be done. I would not want a final encode with that one scene which is 10 or more points below the set threshold just because there were enough samples with metrics far enough above it. That might just be the fly in the ointment which spoils my video experience; I would rather waste bits elsewhere. FWIW, my initial tests suggest that the final encode tends to be below the desired threshold by a significant margin, but, as I said, my stint so far has been very brief.

Anyway, since I want to find the lower bound, it would be nice if ab-av1 hunted more aggressively for bad samples in subsequent iterations.
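The reshuffling idea from the TL;DR could look roughly like this. It is a sketch under stated assumptions, not anything ab-av1 does: `reshuffle_round`, the integer sample positions and the `score(position, crf)` callback are hypothetical names introduced for illustration.

```python
import random
from typing import Callable, List, Sequence, Tuple


def reshuffle_round(
    current: Sequence[int],
    all_positions: Sequence[int],
    crf: int,
    score: Callable[[int, int], float],
    rng: random.Random,
) -> Tuple[List[int], int, float]:
    """Score `current` at `crf`, keep the worst position, redraw the others.

    Returns (sample set for the next iteration, worst position, worst score).
    `score` is a hypothetical stand-in for encoding a sample and measuring VMAF.
    """
    scores = {pos: score(pos, crf) for pos in current}
    worst = min(scores, key=lambda p: scores[p])

    # Draw replacements only from positions that were not sampled this round,
    # so the next iteration has a chance of finding an even worse scene.
    pool = [pos for pos in all_positions if pos not in current]
    k = min(len(current) - 1, len(pool))
    replacements = rng.sample(pool, k)
    return [worst] + replacements, worst, scores[worst]


# Toy model (assumption): ten candidate positions with fixed difficulty penalties.
DIFFICULTY = [1.0, 2.0, 1.0, 5.0, 1.0, 9.0, 1.0, 2.0, 1.0, 3.0]


def toy_score(pos: int, crf: int) -> float:
    return 100.0 - 0.5 * crf - DIFFICULTY[pos]


current = [0, 3, 6, 9]
next_set, worst, worst_score = reshuffle_round(
    current, range(10), crf=24, score=toy_score, rng=random.Random(0)
)
print(next_set, worst, worst_score)
```

Each iteration still encodes the same number of samples, so the cost per crf is unchanged; what changes is that the search keeps looking for a lower bound instead of trusting the first evenly spaced set.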

alexheretic commented 2 months ago

The current sampling is trying to predict the average overall VMAF, so it is appropriate to take samples and average them. The result should be close to the final VMAF, and it gets closer as you take more samples. So if you believe your input to be quite volatile, you should take more samples. I don't think it is possible to tell automatically without computationally expensive analysis.

You can already configure a full-pass encode during crf-search, though this is quite slow. You could also combine that with harmonic mean VMAF (we should perhaps try to support that better with sample aggregation too).

So this feature is just an additional option more about speeding up the search with a pessimistic result. If you wanted to search for more & worse samples, perhaps you can configure a lower --sample-every setting and combine with this proposed --sample-aggregate=worst. The first crf would analyse more 20s samples and then only the worst would count and other crfs would only encode that single sample.
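A rough cost comparison for that combination. The numbers and helper names are hypothetical, and the sample count is assumed to be simply the duration divided by the `--sample-every` interval, rounded up; the real tool may count samples differently.

```python
import math


def sample_count(duration_s: float, sample_every_s: float) -> int:
    """Assumed model: one sample per `sample_every_s` of input, at least one."""
    return max(1, math.ceil(duration_s / sample_every_s))


def encodes_average(samples: int, crf_attempts: int) -> int:
    """Averaging mode: every sample is encoded at every attempted crf."""
    return samples * crf_attempts


def encodes_worst(samples: int, crf_attempts: int) -> int:
    """Worst-only mode: all samples at the first crf, then one sample per crf."""
    return samples + (crf_attempts - 1)


duration = 2 * 60 * 60  # a 2 hour input, in seconds
attempts = 4            # hypothetical number of crf values tried

coarse = sample_count(duration, 12 * 60)  # 10 samples
dense = sample_count(duration, 6 * 60)    # 20 samples

print(encodes_average(coarse, attempts))  # 40
print(encodes_worst(dense, attempts))     # 23
```

Under these assumptions, doubling the number of samples and switching to worst-only aggregation still takes fewer sample encodes than averaging over the coarser set.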

I don't clearly see a better general strategy than encoding evenly distanced samples though.

WhitePeter commented 2 months ago

Yes and no. Speed would stay the same with my suggestion but the estimate would be more realistic since the approach is even more pessimistic: have not found the worst sample yet, so keep looking.

Maybe have a look at the issue I just opened. It contains some more reasoning about VMAF. In short: I think the idea that it can be used on an entire movie or TV show episode is flawed. The inherent averaging will hide the bad scenes where, locally, VMAF might be way lower (by more than 10 points) than the overall average of the whole piece.
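A small numeric illustration of how averaging can hide one bad scene. The per-sample scores are made up; the three aggregates are the arithmetic mean, the harmonic mean mentioned earlier in the thread, and the minimum.

```python
from statistics import harmonic_mean, mean

# Hypothetical per-sample VMAF scores: four good samples and one "killer scene".
scores = [97.0, 96.0, 98.0, 95.0, 82.0]

avg = mean(scores)           # 93.6
hm = harmonic_mean(scores)   # slightly below the arithmetic mean
worst = min(scores)          # 82.0

print(f"mean={avg:.2f} harmonic={hm:.2f} worst={worst:.2f}")
```

Here the mean sits more than 11 points above the worst sample. The harmonic mean leans towards low scores but moves only a little, so only the minimum reflects the bad scene directly.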

I am also looking for a suitable CRF but I want to avoid "killer scenes" where I might have to suffer bad compression artifacts. Those really spoil it for me. I want to upgrade my toolchain from x264, which basically uses one CRF fits all, to svt-av1-psy and came across ab-av1, thinking it might give me some more dynamic decisions. BTW, I have actually downgraded from x265 because it produced rare killer artifacts I just cannot tolerate. And I only realize when watching, after having spent that precious CPU time.

Anyway, I just stumbled in and saw an opportunity. If you disagree, that is perfectly fine with me. The original idea is a good one. I just thought that there is too much arbitrariness in selecting a sample that was chosen by a fixed interval and just happens to be the worst in a very small subset of the whole. A slight change in --sample-every might find the worst sample at a totally different location, at the other end even. But I won't press the issue any further. Consider these my last words on this matter. ;-)

But I maintain that averaging over multiple samples should not be done in any case.

WhitePeter commented 2 months ago

I am terribly sorry, but I just realized that I replied to the wrong comment https://github.com/alexheretic/ab-av1/issues/202#issuecomment-2179562420. Must have scrolled wrong. I do appreciate the actual reply to my comment very much and will digest your suggestions now.
