fifonik / FFMetrics

Visualizes Video Quality Metrics (PSNR, SSIM & VMAF) calculated by ffmpeg.exe

Some odd results with lower than expected VMAF scores #108

Closed iPaulis closed 1 year ago

iPaulis commented 1 year ago

Hi, first of all, awesome tool, it is very useful.

I've been testing a bunch of files and calculating their vmaf scores, and I found some cases where the results were very far from correct, with scores way too low for very high quality encodings in x265 and av1 with crf17 for example.

I have been investigating why the scores were so low. This kind of problem has previously occurred to people calculating vmaf scores with inputs that are not perfectly in sync. It seems that after encoding a video the output can sometimes be a frame off, and at other times, especially when changing the container, the timebase of each container can differ, so the frame timestamps do not match even if the number of frames is the same. More info about that here.

With ffmpeg or ffprobe we can check the timebase info of the files in the following values:

  • tbn = the time base in AVStream that has come from the container
  • tbc = the time base in AVCodecContext for the codec used for a particular stream
  • tbr = the rate guessed from the video stream; it is the value users want to see when they look for the video frame rate
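
As a rough sketch of how these values could be read programmatically, the Python snippet below pulls fps/tbr/tbn/tbc out of a stream line as ffmpeg prints it in its banner. The sample line is made up for illustration and `parse_rates` is a hypothetical helper, not part of ffmpeg or FFMetrics. Note that ffmpeg prints rounded values (23.98 rather than 24000/1001), and that not every field is necessarily present in every build's output.

```python
import re

# Matches e.g. "23.98 fps", "1k tbn", "90k tbn", "23.98 tbc".
_FIELD = re.compile(r"(\d+(?:\.\d+)?)(k?)\s+(fps|tbr|tbn|tbc)\b")

def parse_rates(stream_line):
    """Return the fps/tbr/tbn/tbc values found in one ffmpeg stream line.

    Only fields that actually appear in the line are returned.
    A trailing "k" (as in "1k tbn") is expanded to a factor of 1000.
    """
    rates = {}
    for number, kilo, name in _FIELD.findall(stream_line):
        rates[name] = float(number) * (1000 if kilo else 1)
    return rates

# Illustrative line only; real output differs per file and ffmpeg version.
sample = ("Stream #0:0: Video: hevc (Main 10), yuv420p10le(tv), 1920x1080, "
          "23.98 fps, 23.98 tbr, 1k tbn, 23.98 tbc (default)")
rates = parse_rates(sample)
print(rates)
```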

When this kind of sync mismatch happens, vmaf produces a wrong score.
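
To make the timestamp mismatch concrete, here is a small Python sketch of one way it can arise: the same frame gets a slightly different timestamp depending on the container timebase it is stored in. The 1/1000 timebase is the usual Matroska default; the 1/24000 timebase and the 24000/1001 frame rate are assumptions chosen for illustration, and `pts_seconds` is a hypothetical helper.

```python
from fractions import Fraction

def pts_seconds(frame_index, fps, time_base):
    """Timestamp of a frame after rounding to whole ticks of time_base."""
    exact = Fraction(frame_index) / fps   # ideal presentation time in seconds
    ticks = round(exact / time_base)      # containers store integer ticks
    return ticks * time_base

fps = Fraction(24000, 1001)       # assumed "23.976" source
mkv_tb = Fraction(1, 1000)        # millisecond ticks
mp4_tb = Fraction(1, 24000)       # assumed MP4-style timescale

for n in range(4):
    a = pts_seconds(n, fps, mkv_tb)
    b = pts_seconds(n, fps, mp4_tb)
    print(n, float(a), float(b), "match" if a == b else "differ")
```

With these assumed values only frame 0 gets an identical timestamp in both timebases, even though both files contain the same frames in the same order.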

This problem was reported in the vmaf github (for example, https://github.com/Netflix/vmaf/issues/611 and https://github.com/Netflix/vmaf/issues/629), so the vmaf developers updated their documentation to address it. The official docs recommend adding -r 24 before each input to force the frame rate, and setpts=PTS-STARTPTS to sync the presentation timestamps of the 2 videos.

The final command would be something like this: ffmpeg.exe -hide_banner -nostdin -r $framerate -i $distorted -r $framerate -i $reference -lavfi '[0:v]setpts=PTS-STARTPTS[main];[1:v]setpts=PTS-STARTPTS[ref];[main][ref]libvmaf=eof_action=endall:n_threads=15:pool=Mean:model=version=vmaf_v0.6.1' -f null -
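
For anyone scripting this, a minimal Python sketch that assembles the same command as an argument list (avoiding shell quoting of the filter graph). `build_vmaf_cmd` and its parameters are hypothetical names; the flags and the filter graph are copied from the command above, and the list is meant to be passed to something like `subprocess.run`.

```python
def build_vmaf_cmd(distorted, reference, framerate,
                   n_threads=15, ffmpeg="ffmpeg.exe"):
    """Build the ffmpeg argument list with -r placed before each -i."""
    graph = (
        "[0:v]setpts=PTS-STARTPTS[main];"
        "[1:v]setpts=PTS-STARTPTS[ref];"
        "[main][ref]libvmaf=eof_action=endall"
        f":n_threads={n_threads}:pool=Mean:model=version=vmaf_v0.6.1"
    )
    rate = str(framerate)
    return [
        ffmpeg, "-hide_banner", "-nostdin",
        "-r", rate, "-i", distorted,   # -r is an input option here, so it precedes -i
        "-r", rate, "-i", reference,
        "-lavfi", graph,
        "-f", "null", "-",
    ]

cmd = build_vmaf_cmd("distorted.mkv", "reference.mp4", 24)
print(cmd)
```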

And indeed, it works, or at least it fixed the vmaf scores of my tests with no regressions, while FFmetrics gives incorrect results.

From the FFmetrics log I've seen the syntax of the vmaf call, which is something like: ffmpeg.exe -hide_banner -nostdin -i $distorted -i $reference -lavfi [0:v]settb=AVTB,setpts=PTS-STARTPTS[main];[1:v]settb=AVTB,setpts=PTS-STARTPTS[ref];[main][ref]libvmaf=eof_action=endall:n_threads=15:log_fmt=json:log_path='$logpath':pool=Mean:model=version=vmaf_v0.6.1 -f null -

That vmaf call gives me incorrect results in many of my tests, but just by adding -r 24 before each input, as indicated in the docs, the issues get fixed and the resulting scores are correct.

We have been discussing this issue in more detail in this thread if you want more info: alexheretic/ab-av1#108 and the fix in alexheretic/ab-av1#115. There are many tests and a lot of evidence posted there that you can check. We were experiencing this exact kind of issue and this is how we solved it, which is why I wanted to let you know and help you fix your own tool.

I would encourage you to make the ffmpeg vmaf call as recommended in the official vmaf documentation to get more reliable vmaf results.

Thank you.

fifonik commented 1 year ago

At this very moment I'm investigating the fps/tbn/tbc/tbr stuff and trying to make the ffmpeg calls more robust. This applies not only to metrics calculation, but also to bad frames extraction.

Thanks for your feedback, I will check the resources you mentioned.

P.S. I need to read more, but for now it looks like I will use -r <ref-tbr> (if tbr is not returned by ffmpeg then -r <ref-fps>) for both streams.
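
A minimal Python sketch of that fallback rule, assuming the reference file's rates are already available in a dict with optional "tbr" and "fps" keys. The dict layout and both function names are hypothetical and are not taken from FFMetrics.

```python
def pick_rate(ref_info):
    """Prefer the reference's tbr; fall back to its fps when tbr is missing."""
    rate = ref_info.get("tbr") or ref_info.get("fps")
    if rate is None:
        raise ValueError("reference has neither tbr nor fps")
    return rate

def input_args(ref_info, distorted, reference):
    """Apply the same -r value, taken from the reference, to both inputs."""
    rate = str(pick_rate(ref_info))
    return ["-r", rate, "-i", distorted, "-r", rate, "-i", reference]

print(input_args({"tbr": 23.98, "fps": 24.0}, "dist.mkv", "ref.mp4"))
print(input_args({"fps": 25.0}, "dist.mkv", "ref.mp4"))
```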

fifonik commented 1 year ago

Could you try version 1.4.5b that I just released?

In this version -r <framerate> is added before -i <src>. Also, you will be able to fine-tune the FFMpeg command for calculating metrics (if required) using FFMetrics.conf.

Thanks.

iPaulis commented 1 year ago

Yes, it works perfectly now.

Here is a screenshot of v1.3.1 and the VMAF graph: [images: Captura de pantalla 2023-01-16 042233, VMAF]

Here is a screenshot of v1.4.5b and the VMAF graph of the same files: [images: Captura de pantalla 2023-02-13 003250, VMAF_v1 4 5b]

The scores now match the ones I got for my test files.

Thank you.

fifonik commented 1 year ago

Thanks for checking.

I hope one day someone will be able to provide details on how to fix bad frames extraction :)

iPaulis commented 1 year ago

May I ask why you use settb=AVTB in the ffmpeg call? Is it better than the default (intb) in some cases, or does it fix something? I've also seen 1/AVTB recommended on Stack Exchange and in some forums, but again, I don't know why one particular settb value would be better than another.

I don't understand very well how this works.
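
For reference, a small Python sketch of what settb=AVTB amounts to numerically. It assumes AVTB is FFmpeg's default time base of 1/1000000 s (1/AV_TIME_BASE) and uses hypothetical tick values for a 1/1000 and a 1/24000 input time base. It only shows the rescaling to a common unit; it does not show why one settb value would be better than another.

```python
from fractions import Fraction

AVTB = Fraction(1, 1_000_000)  # assumed: FFmpeg's default time base (microseconds)

def rescale(pts, src_tb, dst_tb=AVTB):
    """Convert a timestamp from src_tb ticks to dst_tb ticks, rounding to an integer."""
    return round(pts * src_tb / dst_tb)

# Hypothetical second frame of a 24000/1001 fps video in two containers.
print(rescale(42, Fraction(1, 1000)))      # 42 ms ticks as microseconds
print(rescale(1001, Fraction(1, 24000)))   # 1001 ticks of 1/24000 s as microseconds
```

Both streams end up in the same unit, but the two values still differ because the original rounding happened in the source timebase.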

fifonik commented 1 year ago

To be honest, I do not remember. I found this information years ago and have used it ever since.