pinterf / mvtools

mvtools plugin for avisynth

Frames out of order with Prefetch enabled #37

Open Boulder08 opened 4 years ago

Boulder08 commented 4 years ago

This is a rather strange issue and I don't know if MVTools, Avisynth+, AVS2YUV or x265 is the real culprit.. nevertheless, I was able to determine that MVTools can be used to trigger it so I'll submit the issue here.

Using this kind of a script, I get jerky movement in the beginning of the encoded clip where the Enterprise appears on the screen. This happens if Prefetch is used - if I comment it out, no problems. The defect is not consistent between encodes, for example these are two consecutive test encodes with the same script and encoder parameters:

encoded 198 frames in 37.80s (5.24 fps), 1742.24 kb/s, Avg QP:24.60
encoded 198 frames in 36.98s (5.35 fps), 1721.16 kb/s, Avg QP:24.59

dgsource("c:\x265\tng\tngjerk.dgi")
convertbits(16)
superanalyse = msuper(pel=2, sharp=2, rfilter=4, chroma=true)
supermdg = msuper(pel=2, levels=1, sharp=2, rfilter=4, chroma=true)
fv1 = manalyse(superanalyse, isb=false, delta=1, blksize=16, overlap=8, search=5, searchparam=16, pelsearch=8, truemotion=false)
bv1 = manalyse(superanalyse, isb=true, delta=1, blksize=16, overlap=8, search=5, searchparam=16, pelsearch=8, truemotion=false)
mdegrain1(supermdg, bv1, fv1, thsad=100, thsadc=100)
Prefetch(threads=24, frames=24)

The encoder command line is this:
c:\x265\avs2yuv64.exe -no-mt -depth 16 "c:\x265\tng\tngjerk.avs" - | c:\x265\x265.exe -F 2 --input - --y4m --input-depth 16 --dither --sar 1:1 --profile main10 --ctu 32 --preset slower --merange 58 --crf 19 --output "c:\x265\tng\tng.hevc"

The cache settings are these:
SetMemoryMax(20480)
SetCacheMode(1)

I just tested removing truemotion=false from the script and the defect is gone. However, in my earlier tests this was also not very consistent and seemed to depend also on the x265 parameters.

https://drive.google.com/file/d/1e3c3Ysp80URNSQRQRABweoD41Sao3PQo/view?usp=sharing (original)
https://drive.google.com/file/d/1OdNCGdB0gqtBEQWgmlbAW0ndDyg95ugG/view?usp=sharing (with Prefetch, jerky)
https://drive.google.com/file/d/1NpToBeb7yFBjTPKS0-XR4V6zTT0ytxQm/view?usp=sharing (without Prefetch, no issues)

Boulder08 commented 4 years ago

Using RequestLinear(clim=100) after DGSource seems to fix the issue. I've had a similar issue a long time ago.. https://forum.doom9.org/showthread.php?p=1698102#post1698102

pinterf commented 4 years ago

I have experienced the different output as well, reported earlier. First of all, we suppose that the source filter is 100% frame accurate.

The differences I mentioned partially came from the internal MT (using avstp), where timing conditions may affect the result of certain mv search operations. But even with avstp disabled (removed, or disabled by mt=false parameters) the phenomenon occurred.

Then I remembered a discussion about fft3dfilter_neo being non-deterministic in MT (FFTW3 library). But mvtools with these settings is not using that (FFTW3 is used only with specific dct settings).

There are articles saying that floating point arithmetic can be non-deterministic because operations may be processed in different orders in the processing pipeline. But it requires deeper knowledge to tell whether that applies here or not.
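
To make the ordering point concrete, here is a minimal Python sketch (not mvtools code, and not a claim about what mvtools does internally) showing that adding the same three floating point values in a different order gives a slightly different result. The helper name sum_in_order is made up for this illustration.

```python
import math

def sum_in_order(values):
    """Add the values strictly left to right, as a sequential loop would."""
    total = 0.0
    for v in values:
        total += v
    return total

values = [0.1, 0.2, 0.3]

# Same numbers, two different evaluation orders.
forward = sum_in_order(values)             # (0.1 + 0.2) + 0.3
backward = sum_in_order(reversed(values))  # (0.3 + 0.2) + 0.1

print(forward == backward)              # the two sums are not bit-identical
print(math.isclose(forward, backward))  # but they agree within rounding error
```

If threads finish partial sums in a varying order, this kind of rounding difference can change from run to run; whether that is what happens in this issue is an open question.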

I have already spent many days with debugging on this topic but could not establish the reason other than the issue really exists.

Boulder08 commented 4 years ago

I'm wondering why the truemotion parameter seems to have such a big effect. It should merely change some values like a preset. I will check this also with the debug parameter of DGSource to see if it is delivering the frames in the right order and something else is then changing it or is DGSource going all wonky for some reason.

I myself have not noticed any non-deterministic issues with my x265 tests, the filesizes have remained constant with the same script and encoder parameters.

Boulder08 commented 4 years ago

I've verified that DGSource is frame accurate. What I also noticed is that during the encode process, both DGSource and Avisynth+ itself (using Info()) report the exact same frame number in the case where it actually is some frame from the past. So the actual frame data gets mixed up somewhere and not the order, if we want to be pedantic about it. The hard part is that it's completely non-deterministic so probably very hard to debug even if it occurs every time.
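
A rough way to look for this kind of mix-up offline, sketched in Python under the assumption that both runs (without and with Prefetch) were dumped as raw, lossless per-frame byte buffers and that the filter chain itself is deterministic. The function name find_stale_frames and the toy data are hypothetical, for illustration only.

```python
import hashlib

def digest(frame_bytes):
    """Content hash of one raw frame."""
    return hashlib.sha256(frame_bytes).hexdigest()

def find_stale_frames(reference, observed):
    """Return (frame_number, source_frame_number) pairs for frames whose
    content differs from the reference at the same position.
    source_frame_number is the first reference frame with the same content,
    or None if the content matches no reference frame at all."""
    first_seen = {}
    for n, frame in enumerate(reference):
        first_seen.setdefault(digest(frame), n)

    stale = []
    for n, (ref, obs) in enumerate(zip(reference, observed)):
        d = digest(obs)
        if d == digest(ref):
            continue  # frame n carries the expected content
        stale.append((n, first_seen.get(d)))
    return stale

# Toy data: six distinct 8-byte "frames"; frame 4 is replaced by old frame 2.
reference = [bytes([n]) * 8 for n in range(6)]
observed = list(reference)
observed[4] = reference[2]

print(find_stale_frames(reference, observed))
```

A result such as (4, 2) would mean that frame number 4 is reported correctly but carries the data of frame 2, which is the symptom described above.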

ravewulf commented 4 years ago

I remember having issues like this in the past when using MVTools, DGSource, and MT together and had a hunch it might be related to the video decode happening on the GPU. Does the issue appear if you use LSMASH or FFMS2? (also, avoid mpeg transport stream containers m2ts/ts/etc )

Boulder08 commented 4 years ago

Looks like it works with FFVideoSource.

So, should this be addressed in the DGDecodeNV end or is it Avisynth acting up? There is a workaround, but this issue could possibly affect quite a few users.

pinterf commented 4 years ago

I have hacked a ShowCRC32 into Avisynth+ for myself, and it shows that the frame content is the same. It's the motion vectors that differ: in my test I only use MSuper and MAnalyse to generate forward vectors (three times), and with prefetch(14) the three vectors, which theoretically are the same, are sometimes not the same (I use MShow for visualizing them).
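
The idea behind such a per-frame checksum can be sketched in Python as follows. This is only an illustration, not the ShowCRC32 code that was hacked into Avisynth+; the frame layout (a list of plane byte buffers) and the helper names frame_crc32 and first_mismatch are assumptions.

```python
import zlib

def frame_crc32(planes):
    """CRC32 over all planes of one frame, each plane a bytes-like object."""
    crc = 0
    for plane in planes:
        crc = zlib.crc32(plane, crc)  # continue the running checksum
    return crc

def first_mismatch(clip_a, clip_b):
    """Number of the first frame whose checksum differs, or None if all match."""
    for n, (frame_a, frame_b) in enumerate(zip(clip_a, clip_b)):
        if frame_crc32(frame_a) != frame_crc32(frame_b):
            return n
    return None

# Toy clips: 5 frames, each with a 16-byte luma plane and two 4-byte chroma planes.
clip_a = [[bytes([n]) * 16, bytes([n + 1]) * 4, bytes([n + 2]) * 4] for n in range(5)]
clip_b = [list(frame) for frame in clip_a]
clip_b[3][0] = bytes([99]) * 16  # corrupt the luma plane of frame 3

print(first_mismatch(clip_a, clip_a))
print(first_mismatch(clip_a, clip_b))
```

Identical checksums on the source frames rule out the source filter and point at a later stage, which is the conclusion drawn here for the motion vectors.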

O.k. This is the magic script I was playing with in June and now again. Replace the video with yours and I'm sure you'll run into the problem. I am using the VirtualDub2 frame number slider and stepping back and forth. It is not always reproducible: play it, go back, go forward, and if the lower part is not pure grey, there are differences between the vectors.

SetMaxCPU("SSE2") # just use the minimum
#https://forum.doom9.org/showthread.php?p=1916393#post1916393
clip1=mpeg2source("The Weekend (feat Shena).d2v").Trim(0,1000)
clip1=clip1.Spline64Resize(1024,256) # this height is enough, better seen when stacked four of this
clip1=clip1.Crop(0,80,-0,-80) #(left,top,right,bottom)

#clip1 = clip1.ShowCRC32() # just to make sure that the original frames are the very same. (20201021 checked: they are the very same)

clip1

#overlaps=0 is O.K.
#search=3,4,5 blksize=16 overlaps=8 is bad
#blksize=32 overlaps=16 is bad, 4-2 as well, etc..
#blksize=32 overlaps=4 seems to be good as well (pure luck?)
blksize = 16
ov=8
super = MSuper (hpad=ov, vpad=ov, pel=1, levels=0, chroma=true, sharp=0, rfilter=1, mt=false)

#generate 3 identical forward vectors
search = 1
trymany=false
truemotion = false
meander = false
forward_vec1 = MAnalyse(super, blksize=blksize, search=search, isb=false, delta=1, overlap=ov, mt=false, chroma=false, trymany=trymany, truemotion = truemotion, meander = meander)
forward_vec2 = MAnalyse(super, blksize=blksize, search=search, isb=false, delta=1, overlap=ov, mt=false, chroma=false, trymany=trymany, truemotion = truemotion, meander = meander)
forward_vec3 = MAnalyse(super, blksize=blksize, search=search, isb=false, delta=1, overlap=ov, mt=false, chroma=false, trymany=trymany, truemotion = truemotion, meander = meander)

#visualize the motion vectors, then show the diff between them.
#theoretically the frames must be identical. 
showsad=true
version1=MShow(super, forward_vec1, showsad=showsad)
version2=MShow(super, forward_vec2, showsad=showsad)
version3=MShow(super, forward_vec3, showsad=showsad)

#StackVertical(old.ConvertBits(8), new.ConvertBits(8), Diff(old, new))#.ShowFrameNumber()
# Show motion vectors for 1 and 2, show difference between 1-2 and 1-3
StackVertical(version1, version2, Diff(version1, version2), Diff(version1, version3))#.ShowFrameNumber()

#PointResize(width*4,height*4) # view it larger

Function Diff(clip src1, clip src2)
{
  return Subtract(src1.ConvertBits(8),src2.ConvertBits(8)/*,stack=true*/).Levels(120, 1, 255-120, 0, 255, coring=false)#.ColorYUV(analyze=true)
}

Prefetch(14)

Boulder08 commented 4 years ago

I've switched the source line with my testclip and tried several times and in many points in the troublesome video, but the vectors for the clips keep perfectly in sync.

My sample script does cause a jump a few times when the Enterprise appears. Now I'm also able to replicate it without encoding, I couldn't do that when I tried earlier so it must not have been the exact same clip. I still feel that frame data and frame numbers are now getting mixed up somewhere. It does not occur in the exact same frames, but quite likely inside the same GOP.

pinterf commented 3 years ago

It took time. I think I have found the culprit. Never again such a bug. Pls. check 2.7.44

Boulder08 commented 3 years ago

Thank you very much for your effort, it must have been a pain (but a relief to actually find and fix it).

I'm having problems verifying the fix as I'm unable to reproduce the issue with the older version for some reason. Previously my testclip was a sure way to do that but now it's working perfectly with both versions.

pinterf commented 3 years ago

It's too random. A couple of frames in 1000 or nothing. Sometimes even Prefetch(100) worked; but in other cases I was able to encode two differently sized files in single threaded mode (MDegrain2). Anyway, with the new build all my trials were identical.

Boulder08 commented 3 years ago

Looks like I ran into this issue again :( Would you like to have a sample or should we just accept defeat (and use RequestLinear)?

pinterf commented 3 years ago

I'm sure it is a different thing you encountered now. To be sure, pls. provide the samples and instructions, in case I have time to look into it.

Boulder08 commented 3 years ago

This is the source file, when the Moon appears, it jumps quite severely in the encoded version. Just like in the earlier bug. https://drive.google.com/file/d/1PdDJbQTtF-tzvkLOXoroJlibJo1t3ZW4/view?usp=sharing

This is the custom function I used which causes the issue; it requires several plugins, I'm afraid. I was unable to simplify it and still reproduce the issue, but this one does it every time. It's probably unnecessarily complex and could do with some cleanup :) https://drive.google.com/file/d/1L7Qc5Ipq5TF8_qvZxycTAvPSnXGGlaE1/view?usp=sharing

This simple script is enough to trigger it on my system:
DGSource("C:\x265\test\who.dgi")
MDG(limit=0.1)
Prefetch(threads=24, frames=10)

You can use this command line to encode:
c:\x265\avs2yuv64.exe -no-mt -depth 16 "path\file.avs" - | c:\x265\x265.exe --input - --y4m --input-depth 16 --dither --sar 1:1 --profile main10 --ctu 32 --preset slower --merange 58 --crf 19 --output "path\file.hevc"

pinterf commented 3 years ago

The earlier bug was an inconsistency: sometimes it behaved differently in different encodings.

Two issues presently:

I don't have DGSource, nor am I familiar with the .dgi extension; you provided an .mkv. I can use ffms2, but it possibly won't be frame accurate.

There is no function named 'cas'.

Boulder08 commented 3 years ago

CAS can be downloaded here: https://github.com/Asd-g/AviSynth-CAS/releases/tag/1.0.1

I didn't test yet, but ffms2 most likely works. I don't know if there is any method to debug this without DGSource :( However, if you have an nVidia GPU (quite old ones will do), I'm sure Donald Graft will grant you a free license as you've contributed so much to the community.

pinterf commented 3 years ago

Since I am not able to see any abnormal result, could you please create your two encoding results for me: one with no Prefetch, which does not exhibit the jump, and the heavily multithreaded version which jumps? And I'd need the exact frame number (or numbers) where you are seeing the jump, just so we talk about the very same positions.

Boulder08 commented 3 years ago

Here's the clip which has jumps: https://drive.google.com/file/d/1diOvOqAA2FTxT1JOW3FtYI6i7tEHbC8Q/view?usp=sharing
At least frames 116, 124, 131 and 140 are ones with the problem.

This one was run without Prefetch and has no jumps: https://drive.google.com/file/d/1oJgMy5d7IGy5OTA2EH_N0dLotDZ35sjd/view?usp=sharing

Using ffms2, no problems with Prefetch.

pinterf commented 3 years ago

Oops, yes, those are really serious jumps.

pinterf commented 3 years ago

I could not reproduce it using either ffms2 or d2vsource. I'm encoding with x264. d2vsource: https://github.com/Asd-g/MPEG2DecPlus/releases/

Import("mdg.avsi")
d2vsource("who.d2v")
#DGSource("who.dgi")
MDG(limit=0.1)
Prefetch(threads=24, frames=10)

Boulder08 commented 3 years ago

Yeah, those jumps are quite bad.. I did some more testing with that script and function of mine.

MDG(limit=0.1, ls=true) will not cause jumps (the default being ls=false).
Setting SetMaxCPU to "none", "mmx" or "sse" will not cause jumps even with ls=false. From "sse2" on, they start occurring.

pinterf commented 3 years ago

ls=true is using lsfmod instead of cas in mdg.avsi, in this line:
sharp = ls ? clp.lsfmod(defaults="slow", strength=160) : clp.cas(sharpness=sstrength)
cas has an opt parameter, you could play with it:

- opt
    Sets which cpu optimizations to use.
    -1: Auto-detect.
    0: Use C++ code.
    1: Use SSE2 code.
    2: Use AVX2 code.
    3: Use AVX512 code.
    Default: -1.

Boulder08 commented 3 years ago

Tried from 0-2, but they all cause jumps without the SetMaxCPU line.