alexheretic / ab-av1

AV1 re-encoding using ffmpeg, svt-av1 & vmaf.
MIT License

VMAF values incorrect by 20 to 30 #63

Closed seanecoffey closed 1 year ago

seanecoffey commented 2 years ago

I am consistently getting VMAF results that are 20 to 30 less than if I ran VMAF or libvmaf directly on the same files. This is true for all of the commands, i.e., trying to use crf-search or auto-encode won't work correctly because it can't find the right vmaf (unless i set minimum <70).

Example results: ab-av1:

Z:\Files\Rips\Muxing>ab-av1 vmaf --reference "source_snip.mkv" --distorted "RF22_snip.mkv" --vmaf model="path=vmaf_v0.6.1.json"
(vmaf 29 fps, eta 0s)67.469444

libvmaf:

Z:\Files\Rips\Muxing>ffmpeg -i "RF22_snip.mkv" -i "source_snip.mkv" -lavfi libvmaf="n_threads=6" -f null -
[Parsed_libvmaf_0 @ 0000020d93a4bf80] VMAF score: 93.391178

Not sure if this might be a versioning thing or if I can resolve?

Running on a Windows VM at the moment with:

  • ab-av1 0.4.3
  • ffmpeg 2022-10-30-git-ed5a438f05-full_build
  • vmaf 2.3.1
  • svt-av1 v1.3.0
  • opus 0.2-3-gf5f571b (libopus 1.3)

alexheretic commented 2 years ago

Try with the latest version, as I fixed some previous vmaf issues where the videos were being converted to, in some cases, a lower-quality pixel format.

Another issue is that, on Windows, only the reference gets converted to this, not the distorted input. On Linux both are, which I found had the best results.

Not converting either, like your example, didn't always work well during my testing. The best approach would be to figure out how to use a named-pipe equivalent on Windows so it can work the same way. I don't really use Windows myself though. PRs welcome.

iPaulis commented 2 years ago

I have an issue that might be related to this one, but I'm not sure, as my VMAF scores do not always seem to be incorrect.

In my case, VMAF didn't give me inaccurate results when I tried with some of my files, but then I tried crf-search in a bunch of files with the -e libx265 option and I believe it is giving me inaccurate results with a difference around 20 points. However, the weird part is that I have not encountered this issue when I used crf-search with the default av1 encoder, instead of -e libx265. By default, all vmaf scores with the av1 encoder seemed to be fine and made sense.

This is the command I'm running: ab-av1 crf-search -e libx265 --min-vmaf $target --vmaf n_threads=16 --vmaf n_subsample=4 -i $referencedir --preset 6

(by the way, I'm not sure how ab-av1 is interpreting --preset 6 in this case for the x265 encoder, I don't know what is the equivalence, is it preset slow? or medium? maybe that part is ignored, in that case, what is the default preset for libx265? I run it like that because that is the preset I was using with the default av1 encoder, and now I was trying to compare the gains of svt-av1 vs x265 for the same target vmaf)

And I'm getting the following results for my files (13 files in total):

  • crf 32 VMAF 61.01 (6%)
  • crf 21 VMAF 70.28 (57%)
  • crf 10 VMAF 73.06 (289%)

  • crf 32 VMAF 65.49 (5%)
  • crf 21 VMAF 73.11 (65%)
  • crf 10 VMAF 75.40 (314%)

  • crf 32 VMAF 69.54 (8%)
  • crf 21 VMAF 77.54 (72%)
  • crf 10 VMAF 79.68 (303%)

And so on and so on, all of them unable to find a suitable crf for my target vmaf 92, which does not make much sense.

The reason why I suspect something is not right is that I already have a x265 encode of my own (with CRF23 in medium preset) of one of those files I was trying to search the crf for, with a result of 92.09827 according to ab-av1 vmaf, but now when I run crf-search with -e libx265 targeting vmaf 92, this is the specific result of that file:

  • crf 32 VMAF 64.63 (3%)
  • crf 21 VMAF 72.58 (67%)
  • crf 10 VMAF 75.15 (311%)

For that specific file I was expecting a crf-search result of around CRF23 (maybe more or less depending on the preset difference) to target vmaf 92. Where is the inaccuracy coming from? Is it a vmaf issue? Or could it be just a preset difference issue?

I could test with a different preset to confirm, if I learn how to effectively set that parameter in the command for -e libx265. I'm trying to aim for the medium preset in x265 to make a fair comparison with the encoded x265 file I already have.

Edit: I'm running the latest 0.4.4 version of ab-av1 compiled from github.

alexheretic commented 2 years ago

(by the way, I'm not sure how ab-av1 is interpreting --preset 6 in this case for the x265 encoder, I don't know what is the equivalence, is it preset slow? or medium? maybe that part is ignored, in that case, what is the default preset for libx265? I run it like that because that is the preset I was using with the default av1 encoder, and now I was trying to compare the gains of svt-av1 vs x265 for the same target vmaf)

I don't know either, this is simply passed to ffmpeg so maybe it has a mapping of numbers to preset names or it just ignores it? Presets in general have different meanings for different encoders. Probably better to use the preset name with libx265.

In my case, VMAF didn't give me inaccurate results when I tried with some of my files

Can you tell me the resolution and pixel format of your source file? (You can run ffprobe on it to see). Also what OS are you using?
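For anyone unsure how to pull just those two facts out of ffprobe, here is a minimal sketch. The function names `ffprobe_args`, `parse_probe` and `probe` are made up for illustration; the ffprobe options used (`-select_streams`, `-show_entries`, `-of json`) are standard ones, and ffprobe is assumed to be on the PATH.

```python
import json
import subprocess


def ffprobe_args(path: str) -> list[str]:
    """Build an ffprobe command reporting width, height and pix_fmt
    of the first video stream as JSON."""
    return [
        "ffprobe", "-v", "error",
        "-select_streams", "v:0",
        "-show_entries", "stream=width,height,pix_fmt",
        "-of", "json",
        path,
    ]


def parse_probe(output: str) -> tuple[int, int, str]:
    """Extract (width, height, pix_fmt) from ffprobe's JSON output."""
    stream = json.loads(output)["streams"][0]
    return stream["width"], stream["height"], stream["pix_fmt"]


def probe(path: str) -> tuple[int, int, str]:
    """Run ffprobe (assumed to be installed) and parse its output."""
    result = subprocess.run(
        ffprobe_args(path), check=True, capture_output=True, text=True
    )
    return parse_probe(result.stdout)
```

Calling `probe("source_snip.mkv")` would then return a tuple such as width, height and pixel format name for that file.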

alexheretic commented 2 years ago

Looks like 6 => slow

alexheretic commented 2 years ago

@seanecoffey In addition to using the most recent release, can you also add your source video's resolution+pixel format?

iPaulis commented 2 years ago

All of these files share the same resolution and properties. They all are 1920x1080 8bit HEVC YUV 4:2:0 with around 12Mb/s bitrate.

Here is the complete video mediainfo of that source file in case it helps:

Video

ID : 1
Format : HEVC
Format/Info : High Efficiency Video Coding
Format profile : Main@L4@Main
Codec ID : V_MPEGH/ISO/HEVC
Duration : 43 min 45 s
Bit rate : 11.6 Mb/s
Width : 1 920 pixels
Height : 1 080 pixels
Display aspect ratio : 16:9
Frame rate mode : Constant
Frame rate : 23.976 (24000/1001) FPS
Color space : YUV
Chroma subsampling : 4:2:0
Bit depth : 8 bits
Bits/(Pixel*Frame) : 0.233
Stream size : 3.55 GiB (91%)
Writing library : x265 3.2.1+1-b5c86a64bbbe:[Linux][GCC 7.4.0][64 bit] 8bit+10bit+12bit
Encoding settings : cpuid=1111039 / frame-threads=4 / wpp / no-pmode / no-pme / no-psnr / no-ssim / log-level=2 / input-csp=1 / input-res=1920x1080 / interlace=0 / total-frames=0 / level-idc=0 / high-tier=1 / uhd-bd=0 / ref=4 / no-allow-non-conformance / no-repeat-headers / annexb / no-aud / no-hrd / info / hash=0 / no-temporal-layers / open-gop / min-keyint=24 / keyint=240 / gop-lookahead=0 / bframes=4 / b-adapt=2 / b-pyramid / bframe-bias=0 / rc-lookahead=25 / lookahead-slices=4 / scenecut=40 / radl=0 / no-splice / no-intra-refresh / ctu=64 / min-cu-size=8 / rect / no-amp / max-tu-size=32 / tu-inter-depth=1 / tu-intra-depth=1 / limit-tu=0 / rdoq-level=2 / dynamic-rd=0.00 / no-ssim-rd / signhide / no-tskip / nr-intra=0 / nr-inter=0 / no-constrained-intra / strong-intra-smoothing / max-merge=3 / limit-refs=3 / limit-modes / me=3 / subme=3 / merange=57 / temporal-mvp / no-hme / weightp / no-weightb / no-analyze-src-pics / deblock=0:0 / no-sao / no-sao-non-deblock / rd=4 / selective-sao=0 / no-early-skip / no-rskip / no-fast-intra / no-tskip-fast / no-cu-lossless / no-b-intra / no-splitrd-skip / rdpenalty=0 / psy-rd=4.00 / psy-rdoq=10.00 / no-rd-refine / no-lossless / cbqpoffs=0 / crqpoffs=0 / rc=abr / bitrate=11600 / qcomp=0.60 / qpstep=1 / stats-write=0 / stats-read=2 / cplxblur=20.0 / qblur=0.5 / ipratio=1.10 / pbratio=1.00 / aq-mode=0 / aq-strength=0.00 / no-cutree / zone-count=0 / no-strict-cbr / qg-size=64 / rc-grain / qpmax=69 / qpmin=0 / const-vbv / sar=1 / overscan=0 / videoformat=5 / range=0 / colorprim=1 / transfer=1 / colormatrix=1 / chromaloc=0 / display-window=0 / cll=0,0 / min-luma=0 / max-luma=255 / log2-max-poc-lsb=8 / vui-timing-info / vui-hrd-info / slices=1 / no-opt-qp-pps / no-opt-ref-list-length-pps / no-multi-pass-opt-rps / scenecut-bias=0.05 / no-opt-cu-delta-qp / no-aq-motion / no-hdr / no-hdr-opt / no-dhdr10-opt / no-idr-recovery-sei / analysis-reuse-level=5 / scale-factor=0 / refine-intra=0 / refine-inter=0 / refine-mv=1 / refine-ctu-distortion=0 / no-limit-sao / ctu-info=0 / no-lowpass-dct / refine-analysis-type=0 / copy-pic=1 / max-ausize-factor=1.0 / no-dynamic-refine / no-single-sei / no-hevc-aq / no-svt / no-field / qp-adaptation-range=1.00
Default :
Forced : No
Color range : Limited
Color primaries : BT.709
Transfer characteristics : BT.709
Matrix coefficients : BT.709

iPaulis commented 2 years ago

I forgot to add I'm in Windows, latest version.

Also here is the info of ffprobe. ffprobe.txt

Then, assuming I was testing libx265 in preset 6, similar to slow, it should have been easier to get a higher vmaf score than my encoding in medium. I will retest with the preset name just in case.

alexheretic commented 2 years ago

1080p yuv420 should have no particular issues I was looking out for.

The reason why I suspect something is not right is that I already have a x265 encode of my own (with CRF23 in medium preset) of one of those files I was trying to search the crf for, with a result of 92.09827

Do you know how you encoded this x265 version previously? Did you use ffmpeg, args etc?

BankaiNoJutsu commented 2 years ago

Hey, I jump in, as I have a similar situation, also my libx265 encodes are consistently around 20 vmaf score lower than using av1. On various resolutions from 400p to 1080p.

Is it worth trying from a nix system to confirm it's a Windows issue due to the named pipe missing?

I'm not a developer myself, but would something like this help/be implementable? https://lib.rs/crates/named_pipe

alexheretic commented 2 years ago

Hey, I jump in, as I have a similar situation, also my libx265 encodes are consistently around 20 vmaf score lower than using av1.

Tbf av1 is a better codec than x265, so I would expect better results using it.

I posted about using encoders a little while ago and, at least for my example video, x265 crf 27 turned out to be equivalent to av1 crf 39 (using default presets) there.

Is it worth trying from a nix system to confirm it's a Windows issue due to the named pipe missing?

Yep Windows named pipe issue is the other thing. If, for a 1080p video, you notice direct ffmpeg usage is getting very different VMAF scores it could be because of this.

Relatedly, simple ffmpeg usage is only good for 1080p videos; if you're using much smaller ones you'll get inflated VMAF scores. ab-av1 vmaf upscales these to ~1080p before running vmaf, which leads to lower, more accurate scores using the default 1k model.
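To illustrate the upscaling idea when calling ffmpeg directly, here is a rough sketch. It is not ab-av1's actual logic: the helper names, the "scale height to 1080, keep aspect ratio, round width to an even number" rule and the bicubic scaler are all assumptions. The input order (distorted first, reference second) follows the ffmpeg command in the original post.

```python
def upscale_target(width: int, height: int, target_height: int = 1080) -> tuple[int, int]:
    """Pick a size to scale to before VMAF (assumed rule, see lead-in).

    Videos already at or above target_height are left alone; smaller ones
    are scaled to target_height with the width rounded to an even number.
    """
    if height >= target_height:
        return width, height
    scaled_width = round(width * target_height / height / 2) * 2
    return scaled_width, target_height


def vmaf_filter(width: int, height: int, n_threads: int = 6) -> str:
    """Build an ffmpeg -lavfi graph that scales both inputs to the same
    size and then runs libvmaf on them."""
    scale = f"scale={width}:{height}:flags=bicubic"
    return (
        f"[0:v]{scale}[dis];"
        f"[1:v]{scale}[ref];"
        f"[dis][ref]libvmaf=n_threads={n_threads}"
    )
```

The resulting string would be passed as the `-lavfi` argument, e.g. `ffmpeg -i distorted.mkv -i reference.mkv -lavfi "<graph>" -f null -`.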

Also if it's possible to provide me with a sample video to test that would be a big help.

BankaiNoJutsu commented 2 years ago

Also if it's possible to provide me with a sample video to test that would be a big help.

Hey, so I did a quick test with the same settings on a Windows and a Linux machine, and indeed the results are different.

ab-av1 crf-search -i sample_1920x1080.mkv -e libx265

On Windows I get:

On Linux I get:

Test file: https://filesamples.com/samples/video/mkv/sample_1920x1080.mkv

Also, https://filesamples.com/samples/video/mkv/sample_1920x1080.avi, fails on both OS'es, on Nix with: Error: ffmpeg yuv4mpegpipe exit code Some(69)

alexheretic commented 2 years ago

Thanks I can indeed reproduce that. I think this needs the named-pipe issue to be fixed for Windows so both reference & distorted can be converted to yuv before passing to vmaf.

Also, https://filesamples.com/samples/video/mkv/sample_1920x1080.avi, fails on both OS'es

I don't seem to be able to dl that one.

BankaiNoJutsu commented 2 years ago

Thanks I can indeed reproduce that. I think this needs the named-pipe issue to be fixed for Windows so both reference & distorted can be converted to yuv before passing to vmaf.

Indeed, I don't know if the link i added previously would help on this. https://lib.rs/crates/named_pipe

Also, https://filesamples.com/samples/video/mkv/sample_1920x1080.avi, fails on both OS'es

I don't seem to be able to dl that one.

Sorry, link was wrong, https://filesamples.com/samples/video/avi/sample_1920x1080.avi

iPaulis commented 2 years ago

Do you know how you encoded this x265 version previously? Did you use ffmpeg, args etc?

No, that is the source file I downloaded. Anyway, this problem seems to happen with any video file.

I confirm the vmaf results with the same test file in Windows, they are exactly the same.

I also confirm that the presets in libx265 correspond to the x265 manual, and they work using either numbers or names; the results are equal and take the same encoding time depending on the selected preset level. So a difference in presets is ruled out as the source of the issue; it must be a problem with the vmaf scores.

It makes me wonder if the av1 results are, although better, also not accurate.

.\sample_1920x1080.mkv

  • crf 32 VMAF 94.14 (94%)
  • crf 43 VMAF 86.73 (49%)
  • crf 35 VMAF 92.60 (79%)
  • crf 36 VMAF 91.95 (74%)

crf 35 VMAF 92.60 predicted full encode size 28.91 MiB (79%) taking 70 seconds

Could you test that sample file in linux with av1 and share your results to see if they match mine in Windows? These are the params I used: ab-av1 crf-search --min-vmaf 92 -i '.\sample_1920x1080.mkv' --preset 6

alexheretic commented 2 years ago

Could you test that sample file in linux with av1 and share your results to see if they match mine in Windows? These are the params I used: ab-av1 crf-search --min-vmaf 92 -i '.\sample_1920x1080.mkv' --preset 6

On Linux, with both reference & distorted being converted to yuv444p10le it's around the same.

VMAF 92.12 predicted full encode size 28.91 MiB (79%)

VMAF seems to be quite sensitive to pixel format issues, which is why currently we convert both to yuv444p10le. There is a technical barrier to the distorted conversion on Windows, as named pipes don't work the same way and I develop on Linux.
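As a rough sketch of the conversion step described above (an assumption about its general shape, not a copy of ab-av1's code), the following builds an ffmpeg command that converts an input to yuv444p10le in the yuv4mpegpipe format. The helper name `y4m_convert_args` is made up; `-strict -1` is included because ffmpeg's yuv4mpeg muxer only accepts the 10-bit formats in non-strict mode.

```python
def y4m_convert_args(src: str, dest: str, pix_fmt: str = "yuv444p10le") -> list[str]:
    """Build an ffmpeg command converting src to a y4m stream in pix_fmt.

    dest can be a regular file, "-" for stdout, or (on Unix) a named pipe.
    """
    return [
        "ffmpeg", "-y",
        "-i", src,
        "-pix_fmt", pix_fmt,
        "-strict", "-1",  # allow non-standard y4m pixel formats such as 10-bit 4:4:4
        "-f", "yuv4mpegpipe",
        dest,
    ]
```

Running the same conversion on both the reference and the distorted file means VMAF compares two streams in an identical pixel format, which is the point being made here.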

I've written a workaround that may help on Windows (#64). It avoids specifying a pixel format for the reference yuv conversion; this seems to help in some scenarios.

Can you guys re-test using this branch?

cargo install --git https://github.com/alexheretic/ab-av1 --branch windows-vmaf-default-pix-fmt

alexheretic commented 2 years ago

Also, https://filesamples.com/samples/video/mkv/sample_1920x1080.avi, fails on both OS'es

It does work for av1. This is a separate bug: when using --encoder with .avi inputs, the samples would use the same extension, which doesn't work for x265, or generally. Fix in #66

iPaulis commented 2 years ago

I tested the new branch, but unfortunately I got the same vmaf results, without changes.

BankaiNoJutsu commented 2 years ago

Thanks, I will try it out a bit later. In the meantime I stumbled upon this crate: https://github.com/kotauskas/interprocess

Would this allow easy named pipe implementation for Windows too?

alexheretic commented 2 years ago

Thanks, yeah, we can follow this up in #65. I think part of the confusion is that Windows does have named pipes, but they are a different thing to the unix named pipes which I need. Perhaps unnamed pipes could be used from the crate you linked? I'll look into it.
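For readers unfamiliar with the distinction: a unix named pipe (FIFO) is an ordinary filesystem path, so one process can write to it while another program opens it as if it were a regular file. Windows named pipes live in a separate `\\.\pipe\` namespace and are created through a different API. The sketch below only demonstrates the unix behaviour with plain bytes; the function and file names are hypothetical, and it will not run on Windows because `os.mkfifo` is Unix-only.

```python
import os
import tempfile
import threading


def fifo_roundtrip(payload: bytes) -> bytes:
    """Send payload through a unix named pipe and return what the reader saw."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "distorted.y4m")  # hypothetical pipe name
        os.mkfifo(path)  # Unix only: creates the FIFO as a filesystem entry

        def writer() -> None:
            # open() blocks until a reader opens the other end
            with open(path, "wb") as pipe:
                pipe.write(payload)

        thread = threading.Thread(target=writer)
        thread.start()
        with open(path, "rb") as pipe:
            data = pipe.read()  # returns once the writer closes its end
        thread.join()
        return data
```

In a real pipeline the writer would be an ffmpeg conversion process and the reader would be whatever computes VMAF, both simply given the pipe's path.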

alexheretic commented 1 year ago

I've implemented Windows named pipes for vmaf distorted yuv conversion in #67. This should be much more consistent with Linux. Though Windows vs Linux probably won't be exactly the same, as they'll probably have different encoder & vmaf builds etc.

Can you guys re-test with this branch and see if it helps?

iPaulis commented 1 year ago

I'm testing the new branch and it seems very promising, there is a notable difference in the vmaf results in Windows.

ab-av1 crf-search -e libx265 --min-vmaf 92 -i '.\sample_1920x1080.mkv' --preset 5

Previous results (main branch) of the test file:

  • crf 32 VMAF 62.62 (36%)
  • crf 21 VMAF 74.56 (139%)

New results (named-pipe branch) of the test file:

  • crf 32 VMAF 76.78 (36%)
  • crf 21 VMAF 95.69 (139%)
  • crf 23 VMAF 93.55 (113%)
  • crf 24 VMAF 92.25 (102%)
  • crf 25 VMAF 90.81 (91%)

It does not match the results @BankaiNoJutsu got on Linux, but it surely makes a lot more sense now for the x265 encoder. What results do you get now with the new branch in Linux for this test file? Are they close enough to Windows?

I also tested with the default av1 encoder, but in that case the results were exactly the same as before, which I guess it's correct, as the av1 results were already similar to linux anyway.

Besides, now testing crf-search against the source file I had encoded myself with x265 (vmaf92 and crf23) it is giving me a result close to my own encode, so I believe the issue is fixed.

alexheretic commented 1 year ago

Thanks @iPaulis. My testing is also showing much more consistent results now.

iPaulis commented 1 year ago

Sorry to reopen this issue, but I think something similar might be happening with the vmaf values of .m2ts files; they seem incorrect by 35-40 points when using crf-search with the libx265 encoder at least in windows, I don't know if there would be a difference in Linux.

I also tried using the default av1 encoder with the same .m2ts file and the vmaf results seem to be correct in this case. I initially tested this with a full bluray source in Windows (1920x1080 8bit VC1 YUV 4:2:0). These are the commands I ran and their results:

Those vmaf x265 results must definitely be inaccurate, right? Besides, as the results were weird, just in case, I changed the video container from .m2ts to .mkv with MKVToolNix, without touching the video quality, and reran the tests. Here are the results:

So, as you can see, there is a huge difference in the vmaf results for the x265 encoder just by changing the video container, which shouldn't be happening.

alexheretic commented 1 year ago

I think this is a separate issue about m2ts.

ffmpeg samples & final output use the same .m2ts extension, which probably isn't the best thing to do. I wonder if that's the cause. I think we can change the samples to use the same extension as the output (if they don't already) and default that to something like .mp4 in this case.

iPaulis commented 1 year ago

I also downloaded one of those online sample files, the same one we used to test this issue previously, now in an .m2ts container, and tried to reproduce the same vmaf inconsistency so anyone could test it on their system too. The problem is that in this sample file the source video format is worse (MPEG-2) and has lower quality, and that might be the reason why the vmaf results are so weird, even with av1, so the test is not exactly reproducible with this sample file. However, the results are also weird on their own:

These results may show a different kind of problem, so I would recommend testing this issue with another sample .m2ts file, like I did from a full bluray source, or we will need to find a suitable one that we can share to test and confirm it. I'll try to see if I find one.