fifonik / FFMetrics

Visualizes Video Quality Metrics (PSNR, SSIM & VMAF) calculated by ffmpeg.exe

Feature Request: Upscaling main videos on-the-fly, when necessary #9

Closed e-d-n-a closed 3 years ago

e-d-n-a commented 3 years ago

I'm very confused!

I tested your latest release (v0.8.7) and noticed errors when trying to compare 720p and 480p files to a 1080p ref. After checking the log and the closed issues #5 and #6, it's clear that FFMetrics still does not support this basic use-case, which is probably super-common!

Why is that? Are you planning to add some sort of auto-scaling feature to deal with different frame sizes in the future? At least you could show a message to the user, instead of them having to analyze the log to find out what happened. The "Media Info" clearly shows the differing resolutions after files have been added, but it then just lets you continue and fail.

In #5 you state that up-/downscaling of the test videos is out of the scope of this project, yet in #6 you mention that you were not aware at first that ffmpeg doesn't (up-)scale automatically when using libvmaf, and that you would find it useful to add this feature in that case.

Well, it's not that hard to create an ffmpeg-command with a filter-graph, that upscales the test videos to ref size on-the-fly. See for example here.

Also imho the test videos would always be smaller, when they differ from ref, and should be upscaled for correct results, as Netflix states in their blog entry: https://netflixtechblog.com/vmaf-the-journey-continues-44b51ee9ed12

The upscaling-method should simulate the scaling, that occurs during playback of a low-res file, but usually "bicubic" can be used. So it would be great, if FFMetrics could also provide an interface/option for that, but it's optional.
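A minimal sketch of such a command, assembled in Python. The function name `build_vmaf_command` and the file names are hypothetical; `scale`, `setpts` and `libvmaf` are standard ffmpeg filters, and the first input is treated as the test ("main") video and the second as the reference, as in the templates quoted in this thread. The script only builds the argument list, it does not run ffmpeg.

```python
def build_vmaf_command(distorted, reference, ref_width, ref_height, scaler="bicubic"):
    """Return an ffmpeg argument list that upscales `distorted` to the
    reference size and feeds both streams to libvmaf."""
    filter_graph = (
        # Test video: reset timestamps, then scale to the reference frame size.
        f"[0:v]setpts=PTS-STARTPTS,scale={ref_width}:{ref_height}:flags={scaler}[main];"
        # Reference video: only reset timestamps.
        "[1:v]setpts=PTS-STARTPTS[ref];"
        "[main][ref]libvmaf"
    )
    # "-f null -" discards the output; only the metric is of interest.
    return ["ffmpeg", "-i", distorted, "-i", reference,
            "-lavfi", filter_graph, "-f", "null", "-"]


# Hypothetical file names, for illustration only.
cmd = build_vmaf_command("test_720p.mp4", "ref_1080p.mp4", 1920, 1080)
print(" ".join(cmd))
```

The scaler is a parameter because, as said above, it should ideally match the scaling that happens during playback.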

Again, I think, that comparing files with smaller frame sizes to the ref is a really important use-case and Netflix also mentions, that it's better to lower the resolution (and also maybe the bitrate), instead of just lowering bitrate alone while keeping the same resolution to achieve better compression results!

Well, I could manually use ffmpeg to give me the vmaf-results for "my" use-case, but I liked the idea of a quick-to-use GUI-frontend that also shows all the metrics as graphs right away. Yet, in its current state FFMetrics is practically useless (at least for me)!

I tried altering the filter via "FFMetrics.conf", but I cannot use "json-models" with a recent ffmpeg build.

{
    "Metrics": {
        "VMAF": {
            "LavfiTemplate": "[0:v]setpts=PTS-STARTPTS,scale=2336:1080:flags=bicubic[main];[1:v]setpts=PTS-STARTPTS[ref];[main][ref]libvmaf='eof_action=endall:psnr=1:ssim=1:ms_ssim=1:phone_model={{phone_model}}:pool={{pool}}:model_path={{model_path}}:log_fmt={{log_fmt}}:log_path={{log_path}}'",
            "Models": {
                "vmaf_4k_v0.6.1.pkl": {
                    "height-min": 1081
                },
                "vmaf_v0.6.1.json": {
                    "default": true
                }
            }
        }
    }
}

Also the program config gets restored, but "FFMetrics.exe.config" is not changed and there is nothing in AppData or registry either! Where is the configuration stored?

Hrxn commented 3 years ago

Yes, apparently you are confused?

What are you talking about, which basic use-case? The whole point of VMAF is to automatically (i.e. at scale) assess encodes, that is transcodes made in comparison to a reference/source (which is either a complete feed of full frames straight from the sensor, which would be huge, or rather often some mezzanine format).

If I recall that correctly, the trained models VMAF uses are specifically made for 1080p and 4K. There are no other dimensions. Lower-res upscaled will always look worse, you don't need to research that?

Your linked Netflix Tech blog doesn't even mention any upscaling at all?

e-d-n-a commented 3 years ago

You're even more confused than me? You probably haven't read the article, have you!? Look under "Computing VMAF at the Right Resolution"! They call it "upsampling".

Well, Netflix invented VMAF and they use it that way, so I think it's a valid use-case!

Hrxn commented 3 years ago

No, just skimmed the article. You're right about that paragraph, I must've missed this, Ctrl+F failed me apparently. This blog seems to be hosted on Medium, I blame their shitty site for this.

But my original point still stands. This is about evaluating codec A versus codec B, or to be more exact, codec A (with settings w or x) and codec B (with settings y and z), and so on, including fine-tuning and god knows what else.

This has nothing to do with content that gets upscaled at the end-user, because of a crappy device or bad Internet connection. They seem to have adjusted their models here (or added a new one), because on a mobile device (i.e. high dpi/ppi) the difference between 720p and 1080p is not as pronounced. I could have told them this as well, how much do they pay their folks? Hey Netflix, contact me for my consultancy fees, please.

So, this is also a use-case, yes, fair enough, but I still don't see what kind of useful information you are trying to come up with here..

e-d-n-a commented 3 years ago

As Netflix created VMAF to optimize their encodings, its use is to rate an encoded video with respect to its perceived quality on a fixed scale from 0-100, compared to the reference/original.

So you said it yourself: Compare any encoding with some settings to the reference. Then maybe compare multiple encoding options by their score, maybe even per frame or averaged sequences. Resolution is merely another setting here, that is used to compress a video and prepare it for a lower bitrate, as the video would have more artifacts and a worse score, when compressed with the same bitrate at its original resolution.

From the article:

For this episode, at 1000 kbps, is it better to encode with HD resolution, with some blockiness, or will SD look better?

It's just part of the analysis process that you have to upscale the encoding first, before the comparison with libvmaf. libvmaf doesn't do it for you, but ffmpeg can, and so this simplifying tool should be able to incorporate that easily!
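As a rough illustration of comparing encodes "per frame or averaged", here is a small Python sketch that pools per-frame scores into single numbers. The names mirror the values I would expect for the `pool` setting in the filter template (mean, harmonic mean, min), but the exact formulas, in particular the +1 offset in the harmonic mean, are assumptions of this sketch, and the scores are made up.

```python
from statistics import mean


def pool_scores(frame_scores):
    """Summarize per-frame scores into single numbers for comparison."""
    # Harmonic mean with a +1 offset so that a score of 0 cannot divide by zero.
    # The offset is an assumption made for this sketch.
    harmonic = len(frame_scores) / sum(1.0 / (s + 1.0) for s in frame_scores) - 1.0
    return {
        "mean": mean(frame_scores),
        "harmonic_mean": harmonic,
        "min": min(frame_scores),
    }


# Made-up per-frame scores for two hypothetical encodes.
encode_a = [90.0, 92.0, 91.0, 89.0]
encode_b = [99.0, 98.0, 97.0, 60.0]

for name, scores in (("A", encode_a), ("B", encode_b)):
    print(name, pool_scores(scores))
```

Encode B has the better frames most of the time, but one bad frame pulls its minimum and harmonic mean down, which is why looking at more than the plain average can matter.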

e-d-n-a commented 3 years ago

With the correct ffmpeg-build using a libvmaf v1.x and the standard pkl-model, I am now able to compare videos of different resolution to a reference (PSNR, SSIM and VMAF) by using this custom "FFMetrics.conf":

{
    "Metrics": {
        "PSNR": {
            "LavfiTemplate": "[0:v][1:v]scale2ref=w=iw:h=ih:flags=bicubic[scaled],setpts=PTS-STARTPTS[ref];[scaled]setpts=PTS-STARTPTS[main];[main][ref]psnr='eof_action=endall:stats_file=-'"
        },
        "SSIM": {
            "LavfiTemplate": "[0:v][1:v]scale2ref=w=iw:h=ih:flags=bicubic[scaled],setpts=PTS-STARTPTS[ref];[scaled]setpts=PTS-STARTPTS[main];[main][ref]ssim='eof_action=endall:stats_file=-'"
        },
        "VMAF": {
            "LavfiTemplate": "[0:v][1:v]scale2ref=w=iw:h=ih:flags=bicubic[scaled],setpts=PTS-STARTPTS[ref];[scaled]setpts=PTS-STARTPTS[main];[main][ref]libvmaf='eof_action=endall:psnr=1:ssim=1:ms_ssim=1:phone_model={{phone_model}}:pool={{pool}}:model_path={{model_path}}:log_fmt={{log_fmt}}:log_path={{log_path}}'",
            "Models": {
                "vmaf_4k_v0.6.1.pkl": {
                    "height-min": 1081
                },
                "vmaf_v0.6.1.pkl": {
                    "default": true
                }
            }
        }
    }
}

It uses scale2ref to make it simple and generic!
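Since the three templates only differ in their last filter, the config can also be generated instead of hand-edited. A minimal Python sketch, assuming only the "Metrics" / "LavfiTemplate" structure shown above; the helper `make_config` is hypothetical, the VMAF entry with its "Models" section is left out for brevity, and I have not checked which keys FFMetrics requires. Writing the file through `json.dumps` at least guarantees syntactically valid JSON.

```python
import json

# Shared part of every template: scale the test video ([0:v]) to the size of
# the reference ([1:v]) with scale2ref, then reset timestamps on both streams.
SCALE_PREFIX = (
    "[0:v][1:v]scale2ref=w=iw:h=ih:flags=bicubic[scaled],setpts=PTS-STARTPTS[ref];"
    "[scaled]setpts=PTS-STARTPTS[main];[main][ref]"
)


def make_config(metric_filters):
    """Build a config dict with one LavfiTemplate per metric."""
    return {
        "Metrics": {
            name: {"LavfiTemplate": SCALE_PREFIX + tail}
            for name, tail in metric_filters.items()
        }
    }


config = make_config({
    "PSNR": "psnr='eof_action=endall:stats_file=-'",
    "SSIM": "ssim='eof_action=endall:stats_file=-'",
})
text = json.dumps(config, indent=4)
print(text)
```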

[Image: combined_ss_cens]

fifonik commented 3 years ago

At this stage up/down-scaling (and deinterlacing) is out of scope of the project. Why? The main aim of this project is comparing quality. However, up/down-scaling introduces HUGE quality changes by itself. So the results are quite useless in most cases (I know some situations when this can be useful, but this is definitely not your example when you are comparing videos downscaled to different resolutions and then encoded).

What exactly are you trying to measure in your example? "What would be better: downscaling do smaller resolution with higher bitrate versus downscaling to higher resolution with smaller bitrate" ?

I may change my mind in future, but for now this is it. Sorry.

e-d-n-a commented 3 years ago

I think this article describes a lot around the relationship between bitrate, resolution and quality/distortion!

For me an interesting result would be something like this:

[Images: 0-sT1XzhC6HvAo3GX0, 0-JOxSte08VHgwYWBP, 0-jr59SeCl-6iQskzZ]

It shows you, for a given bitrate, at which point you have to switch to another resolution to achieve maximum quality/minimal distortion. And this is for fixed-QP encodings, so def. realistic. The major topic is about dynamic optimization to improve on that even further, but that is still not clear to me and not my goal.
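To make the "switch point" idea concrete, here is a small Python sketch. It assumes you already have one score per (bitrate, resolution) pair, measured after scaling each encode to the reference size. The function names are hypothetical and the numbers are made up purely for illustration; they are not measurements.

```python
def best_resolution_per_bitrate(measurements):
    """measurements: iterable of (bitrate_kbps, height, score).
    Return {bitrate: height with the highest score at that bitrate},
    ordered by ascending bitrate."""
    best = {}
    for bitrate, height, score in measurements:
        if bitrate not in best or score > best[bitrate][1]:
            best[bitrate] = (height, score)
    return {b: h for b, (h, _) in sorted(best.items())}


def switch_points(best_by_bitrate):
    """Return (bitrate, height) pairs where the best resolution changes."""
    points, previous = [], None
    for bitrate, height in best_by_bitrate.items():
        if height != previous:
            points.append((bitrate, height))
            previous = height
    return points


# Made-up scores, for illustration only.
data = [
    (500, 480, 70.0), (500, 720, 62.0), (500, 1080, 50.0),
    (1000, 480, 76.0), (1000, 720, 78.0), (1000, 1080, 71.0),
    (2000, 480, 79.0), (2000, 720, 86.0), (2000, 1080, 85.0),
    (4000, 480, 80.0), (4000, 720, 90.0), (4000, 1080, 94.0),
]
best = best_resolution_per_bitrate(data)
print(best)
print(switch_points(best))
```

With these invented numbers the best choice moves from 480p to 720p at 1000 kbps and to 1080p at 4000 kbps; real content would of course give different switch points.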

Although it's interesting to see how you can measure the performance of different settings for different shots of a sequence. That's what compression is all about now. Finding the right method and settings for the specific content, while minimizing distortion.

Well, we might still disagree on the value of comparing different resolution videos, but I thought the feature would allow the software to be more versatile without much effort needed. It can, for the most part, already be achieved with the right .conf-file.

Btw, I later found this Python package, that is a pretty similar wrapper for ffmpeg (compared to FFMetrics) to calculate quality metrics and it always does the scaling in the same way as I proposed!

See this part of the code (shows his use of scaling in the filter-graph, while calculating VMAF score with ffmpeg).

e-d-n-a commented 3 years ago

Also to your points:

However, up/down-scaling introduces HUGE quality changes by itself.

I think this is not true. You probably lose some information by down-scaling, yes. Although a high-res video could already have few or no high-frequency components/details. Compression and down-scaling are quite similar: they drop high-frequency components/details. While scaling does it uniformly, compression does it on a per-block basis, so the distortion afterwards will differ!

On the other hand, up-scaling for the most part preserves the information, while just adding redundant bits. I mean, you really think watching low-res on a TV or with a player/monitor-combo reduces the quality of the video? The video itself is already low-res, up-scaling doesn't add to that, it merely shows the differences in comparison with full resolution. So for analyzing the up-scaling doesn't change the results, it just makes the source feasible for comparison.

Also it is said here (see "Interpreting VMAF Score When Resolution Is Not 1080p"), what would be the difference when applying a model directly to lower resolutions. It's just like watching the same content/video at a greater distance from the screen! You just have to compare the signals at the same pixel-density to get the desired results. Signals are independent of the sampling as long as it is enough to hold all the information (frequency components).
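A toy illustration of that argument on a 1D "scanline" in Python. This is only a sketch under simplifying assumptions: down-scaling is done by averaging pairs and up-scaling by repeating samples (nearest neighbour), not bicubic, and the two signals are invented. The point is that the error measured after up-scaling comes from what the down-scale removed, and that depends on how much high-frequency detail the source had.

```python
def downscale_2x(signal):
    """Halve the sample count by averaging neighbouring pairs."""
    return [(signal[i] + signal[i + 1]) / 2 for i in range(0, len(signal) - 1, 2)]


def upscale_2x(signal):
    """Double the sample count by repeating each sample (nearest neighbour)."""
    return [s for s in signal for _ in range(2)]


def mse(a, b):
    """Mean squared error between two equally long signals."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


# Invented signals: one without fine detail, one alternating at full frequency.
smooth = [10, 10, 20, 20, 30, 30, 40, 40]
detailed = [0, 255, 0, 255, 0, 255, 0, 255]

for name, sig in (("smooth", smooth), ("detailed", detailed)):
    round_trip = upscale_2x(downscale_2x(sig))
    print(name, mse(sig, round_trip))
```

The smooth signal survives the round trip without error, while the detailed one is flattened to mid-grey and shows a large error.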

What exactly are you trying to measure in your example?

I want to check if different video formats have noticeable differences in quality that justify the size difference.

I guess, many video platforms chose constant bitrates for certain resolutions, that don't depend on the content. With the right metric (VMAF) you should be able to check, if the bitrate is too high or too low for the content at certain resolutions.

If it's too high, you could re-encode the video with the right settings without losing quality, but saving on space. If it's too low, you could re-encode a higher resolution to the desired lower resolution with better quality. If the bitrate is accurate, you can choose to store the resolution, that fits the specific content the best.

A blurred, low-res source doesn't need high-res/bitrate. You will see that the high-res encodes won't have higher scores (unless they get sharpened)! A high-detail, high-res source cannot be compressed to low-bitrate/resolution without losing information. If you compare the result to the original/high-res version tho, you get a quantification of the distortion.

You can spend time and effort trying to assess it visually, but this could make it automatable.

"What would be better: downscaling [to] smaller resolution with higher bitrate versus [upscaling] to higher resolution with smaller bitrate?"

First part, yes, as mentioned in an earlier post. Second part makes no sense though!

fifonik commented 3 years ago

OK. As scaling the distorted video to the reference video's resolution is easy to implement, I will do this in the next version.

fifonik commented 3 years ago

Implemented in version 0.9.0