pinterf / mvtools

mvtools plugin for avisynth
155 stars 17 forks source link

MDegrainX introduces a chroma shift in 420 sources. #59

Open erazortt opened 1 year ago

erazortt commented 1 year ago

See the 3 screenshots link, you will need to zoom in and be able switch back and forth between the images to actually see that shift of about half a pixel. The shift is only in B. C matches A (apart from being degrained). I have checked with many files and all versions going back to 2.7.29 and even some before it. There is always a shift in B. The screenshot names match the following 3 scripts.

A: DGSource("file").UToY()

B:

DGSource("file")
clip=last
srch_super = clip.MSuper(pel=2)
bvec1  = srch_super.MAnalyse(isb=true,  delta=1, blksize=16, overlap=8, search = 4)
fvec1  = srch_super.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search = 4)
thSAD1 = int(9*8*8)
o_super = clip.MSuper(pel=2, levels=1)
clip.MDegrain1(o_super, bvec1, fvec1, thSAD=thSAD1)
UToY()

C:

DGSource("file")
z_ConvertFormat(pixel_type="YUV444P8",resample_filter="spline36",resample_filter_uv="spline36",use_props=3)
clip=last
srch_super = clip.MSuper(pel=2)
bvec1  = srch_super.MAnalyse(isb=true,  delta=1, blksize=16, overlap=8, search = 4)
fvec1  = srch_super.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search = 4)
thSAD1 = int(9*8*8)
o_super = clip.MSuper(pel=2, levels=1)
clip.MDegrain1(o_super, bvec1, fvec1, thSAD=thSAD1)
z_ConvertFormat(pixel_type="YUV420P8",resample_filter="spline36",resample_filter_uv="spline36",use_props=4)
UToY()
DTL2020 commented 1 year ago

In B and C looks you not need to make 2 MSuper if you not modify 'clip' (like pre-filtering for MAnalyse). Your first MSuper already create all levels required for MAnalyse and MDegrain. So you can save memory and get better performance if use only 1 MSuper() call. For pel=2 each MSuper create refined 4 planes even for levels=1. It takes memory transfer, CPU cycles and thrashes CPU caches.

About shift - it good to report pixel format (type) for B (returned by DGSource()). As I see in C you convert something to 4:4:4 8bit before mvtools. So you can place Info() after DGSource() and add pixel format for B.

B may be simple DGSource("file") super = MSuper(pel=2) bvec1 = super.MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search = 4) fvec1 = super.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search = 4) thSAD1 = int(988) MDegrain1(super, bvec1, fvec1, thSAD=thSAD1) UToY()

And C too.

erazortt commented 1 year ago

Thanks I will look into the MSupers. Concerning the chroma shift, the pixel format is YV12(=YUV420P8) in this example. However the behaviour is all the same for all bit depths. I have also tried YUV420P16. The conversion in B is just convert it to 444 for the degrainig and then convert back to 420.

Dogway commented 1 year ago

I don't see a shift in a controlled test:

ColorBarsHD(width=1280, height=720, pixel_type="YV24")
ConverttoYUV420()
a=ExtractU()

clip=last
srch_super = clip.MSuper(pel=2)
bvec1  = srch_super.MAnalyse(isb=true,  delta=1, blksize=16, overlap=8, search = 4)
fvec1  = srch_super.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search = 4)
thSAD1 = int(9*8*8)
o_super = clip.MSuper(pel=2, levels=1)
clip.MDegrain1(o_super, bvec1, fvec1, thSAD=thSAD1)
ExtractU()
ex_makediff(a, metric="none", aug=true, dif=true, show=3) or Subtract(a) or Compare(a) or Expr(last,a,"x y - abs 50 *")

It might happen as a product of edge denoising by design but probably not as a bug (?)

erazortt commented 1 year ago

You must use real footage. I think there might be a connection to motion vectors.

pinterf commented 1 year ago

The processing assumes that the chroma location is "center". If your original clip has "left" chroma location, then try converting the input 420 clip to "center" before processing, and back to "left" after that. I wonder you are seeing the same difference.

erazortt commented 1 year ago

No, the chroma location itself cannot be the issue, because the problem is not consistent. Some patches are being shited other are not shifted. I uploaded a zip containing the pervious images, but also zoomed in versions, a 2 sec snipped of the clip and an avs script. The images are taken at frame 24. link

erazortt commented 1 year ago

I also made an animated image switching between B and C to show the issue: image There it can be seen that not all of the pixels are shifted. Especially the upper left part is shifted while the rest is not.

Dogway commented 1 year ago

No, the chroma location itself cannot be the issue, because the problem is not consistent. Some patches are being shited other are not shifted.

Maybe because the shifted edges are low local contrast areas, the static parts are either motion static, or failed match blocking or high contrast...

Maybe you can try increasing saturation (contrast for the chroma channels) to see if the effect diminishes to discard this theory.

DTL2020 commented 1 year ago

No, the chroma location itself cannot be the issue, because the problem is not consistent. Some patches are being shited other are not shifted.

Do disabling of 'truemotion' preset at MAnalyse change or fix it ? As I see from some documentation it is enabled by default.

erazortt commented 1 year ago

Well, chroma is always low saturated. But that is actually irrelevant because the motion vectors are calculated on the image, not for specific planes. MAnalze can take chroma into consideration but its output is only one set of vectors for all three planes. You could even disable the chroma completely for MAnalyze, and the resulting vectors could still be applied to MDegrain and the outcome is the same. So the morion vectors themselves cannot explain a shift, between luma and chroma, because it’s the same vectors which get applied. But perhaps however it’s the way MDgrain applies the motion vectors to subsampled chromas which is somehow imprecise.

Am 01.03.2023 um 23:21 schrieb Dogway @.***>:

 No, the chroma location itself cannot be the issue, because the problem is not consistent. Some patches are being shited other are not shifted.

Maybe because the shifted edges are low local contrast areas, the static parts are either motion static, or failed match blocking or high contrast...

Maybe you can try increasing saturation (contrast for the chroma channels) to see if the effect diminishes to discard this theory.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

erazortt commented 1 year ago

So MAnalyse returns only one set of vectors for all planes. There are no separate vectors for each plane. So independently how it came to determining the motion vectors, this can never explain a shit between planes, since it’s the same vectors that are supposed to be applied on all planes.

Am 02.03.2023 um 00:36 schrieb DTL2020 @.***>:

 No, the chroma location itself cannot be the issue, because the problem is not consistent. Some patches are being shited other are not shifted.

Do disabling of 'truemotion' preset at MAnalyse change or fix it ? As I see from some documentation it is enabled by default.

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

Dogway commented 1 year ago

I understand now the premise. So the motion vectors for the chroma planes are not being shifted according to chroma location right? Maybe DTL2020 or pinterf have an idea where to look for this in source.

To test this since you can't disable luma motion vectors, try to set scaleCSAD to 2, so it takes more weight from chroma, if it looks better then it's a MV misalignment.

erazortt commented 1 year ago

There is no such thing as "motion vectors for the chroma planes". To make it more understandable, you can make MAnalyse disregard the chroma planes completly, and produce the motion vectors just based on the luma by setting chroma=false for MAnalyse. The resulting motion vectors can still be applied by MDegrain to all planes which the produces the same shift of the chroma planes relative to the luma plane.

So these scripts produce basically the same issue:


b=org ssb = b.MSuper(pel=2) bvb = ssb.Greyscale().MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search=4) fvb = ssb.Greyscale().MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search=4) thSAD1 = int(576) b=b.MDegrain1(ssb, bvb, fvb, thSAD=thSAD1).Subtitle("B")


and


b=org ssb = b.MSuper(pel=2) bvb = ssb.MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search=4, chroma=false) fvb = ssb.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search=4, chroma=false) thSAD1 = int(576) b=b.MDegrain1(ssb, bvb, fvb, thSAD=thSAD1).Subtitle("B")


you can even remove the chroma component completely for the motion vector generation:


b=org ssb = b.MSuper(pel=2) bvb = ssb.Greyscale().MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search=4, chroma=false) fvb = ssb.Greyscale().MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search=4, chroma=false) thSAD1 = int(576) b=b.MDegrain1(ssb, bvb, fvb, thSAD=thSAD1).Subtitle("B")


All three versions will procude the same chroma shift after MDrgrain.

pinterf commented 1 year ago

UV degrain processes all motion vectors, and it's simply using a /2 (for 4:2:0) to get the source X,Y position on the chroma plane. This means that if you encounter left shift you must notice an upwards shift as well. https://github.com/pinterf/mvtools/blob/mvtools-pfmod/Sources/MVDegrain3.cpp#L1548

btw, what happens without overlap?

DTL2020 commented 1 year ago

"So MAnalyse returns only one set of vectors for all planes. "

Yes. But when you process in 4:2:0 the UV planes are twice smaller and the quantization error in placement is 2 times larger for blocks of UV planes. Position precision of MVs in 4:2:0 with pel=2 is 0.5 sample in Y plane and only 1 sample in UV plane (after decompressing to 4:4:4 RGB for display). So when GetBlock()/GetPointer() return sub-shifted block (with motion-compensated coordinate for MDegrain) it have different precisions for Y and UV in 4:2:0 and may cause this effect. MVs data is really integer (not float) - so with pel=2 it is +1bit for Y plane and 0.5 sample precision. For UV half sized planes it is 2 samples with pel=1 and +1bit is only full-sample precision. Try to see if setting pel=4 will help any. It will allow to have 1/2 sample precision at UV planes in 4:2:0.

So may be the sub-sample shifting with your pel=2 used can not track UV shift as precise as in 4:4:4 mode. It may be some version of the source of the issue.

The idea about truemotion - it cause large fields of coherent MVs so the quantization error is also equal for this area and you see it as global patch shift. If truemotion is false the MVs are more random and error is more random and less visible. Just an idea.

DTL2020 commented 1 year ago

If this UV for 4:2:0 quantization error is important and need to be reduced - may be add 1 more option to MSuper - to twice enlarge UV planes for pel=2 (equal to pel=4) and GetPointer() can request UV block shifted with better precision. Or in post-2.7.45 builds there exist 'runtime-sub-shifting' feature for MDegrain (actually feature of MVPlane and can be used in all mvtools)

https://github.com/DTL2020/mvtools/blob/7fdb122d05d7378386e315a558f5786f84499783/Sources/MVPlane.cpp#L588

and it can produce requested pel2 or pel4 granularity sub shift of 1x source plane with equal speed for any pel because uses single convolution process with different shifting kernels only. Though still very few of block sizes and bitdepth is implemented in SIMD for better performance. All other are covered with C-reference with lower performance. It also may be used to workaround the issue with 4:2:0 UV processing precision.

It also mean pel=8 for UV is required for 4:2:0 pel=4 processing with less quantization errors. pel=8 subshift can be also performed with same shifting engine with same time and only sub-8 shifting kernels need to be calculated. May be as simple averaging (bilinear interpolating) between pel=4 kernel points. Same as pel=4 intermediate kernel points designed from 1 and 0.5 shifts. MSuper currently not support pel=8 subplanes creation and it will take 64x more RAM if being implemented in pre-calculated from.

erazortt commented 1 year ago

Yes, pel=4 helps. It not exectly as good as 444 with pel=2 but its getting there.

However I do not want to process the luma also at that precission. So having pel=4 for chroma while pel=2 for chroma would be perfect.

DTL2020 commented 1 year ago

So we need new special mode of processing - for example multi-pel MSuper for colour clips (if input is not 4:4:4) and keeping track of different MVs to real sizes of planes mapping at requesting of motion compensated blocks for blending (without truncation of precision of MVs with pel=2 when requesting UV blocks). It may also applied to all members of mvtools with motion compensation like MCompensate. Also good to have higher-quality pel=8 mode for 4:2:0 UV with pel=4 MAnalyse.

In theory this quantization noise for MVs issue may also affect the performance on MAnalyse ? So when MAnalyse request UV blocks for SAD calculations in searching process (and MRecalculate too) the same UVs-refining need to be applied and may make better MVs ?

erazortt commented 1 year ago

~Wait I am not sure about that theory. Please take a look a the new version D:~

~d=org ssd = d.MSuper(pel=2) bvd = ssc.MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search=4) fvd = ssc.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search=4) thSAD1 = int(988) d=d.z_ConvertFormat(chromaloc_op="left=>center",pixel_type="YUV444P8",resample_filter_uv="spline36") d=d.MDegrain1(ssc, bvd, fvd, thSAD=thSAD1).Subtitle("D") d=d.z_ConvertFormat(chromaloc_op="center=>left",pixel_type="YUV420P8",resample_filter_uv="spline36")~

~This produces the motion vectors like the version B on 420 but then applies them on a 444 clip. And the result is also perfect like version C, and does not have any chroma shift at all. Version B with pel=4 is better then version B with p=2 but its not as good as C or D! So I am now firmly sure that the issue is not the vectors, since B and D have the same vectors. But its the degraining itself. So pel=4 helps improving the symptoms but its not the cause! (I updated the zip file to include version D.)~

DTL2020 commented 1 year ago

For correct MDegrain you need the same colour formats for 'current' and 'other' frames (also same sub-pel refined planes). It is typically done by MSuper() with calculating of pel-refined planes for motion compensating with sub-sample precision. So you can feed 4:2:0 clip to MAnalyse for performance and 4:4:4 super clip for MDegrain to perform 4:4:4 motion compensating and blending with full-size full-precision planes of Y and UV. Also for a bit better performance you can set levels=1 for superclip for degrain.

Sort of super420 = ConvertTo420(last).MSuper(pel=2) super444 = ConvertTo444(last).MSuper(pel=2, levels=1) mvs420=MAnalyse(super420,...) MDegrain1(super444, mvs420, ...).Subtitle("D")

erazortt commented 1 year ago

~In my version D, even msuper is 420. The only thing being 444 is the clip d itself: d=d.MDegrain1(ssc, bvd, fvd, thSAD=thSAD1).Subtitle("D") And it still produces the same result as C.~

DTL2020 commented 1 year ago

Hmm - with 444 'src' clip and 420 super it should not work at all ? The src https://github.com/pinterf/mvtools/blob/d8bdff7e02c15a28dcc6e9ef2ebeaa9d16cc1f56/Sources/MDegrainN.cpp#L890 is 'current' frame in denoising blending and refs refined to sub-sample planes are taken from super clip.

I think it should throw stop-error https://github.com/pinterf/mvtools/blob/d8bdff7e02c15a28dcc6e9ef2ebeaa9d16cc1f56/Sources/MDegrainN.cpp#L618

"MDegrainN: source and super clip video format is different!"

Though it is about MDegrainN, may be MDegrain1 is not ? MDegrainX is the same check: https://github.com/pinterf/mvtools/blob/d8bdff7e02c15a28dcc6e9ef2ebeaa9d16cc1f56/Sources/MVDegrain3.cpp#L345

"MDegrain%d: source and super clip video format is different!"

Version D: d=org ssd = d.MSuper(pel=2) bvd = ssc.MAnalyse(isb=true, delta=1, blksize=16, overlap=8, search=4) fvd = ssc.MAnalyse(isb=false, delta=1, blksize=16, overlap=8, search=4) thSAD1 = int(988) d=d.z_ConvertFormat(chromaloc_op="left=>center",pixel_type="YUV444P8",resample_filter_uv="spline36") d=d.MDegrain1(ssc, bvd, fvd, thSAD=thSAD1).Subtitle("D")

provides 'ssc' to MDegrain1. Where is ssc defined ?

erazortt commented 1 year ago

Oh yes, you're right! It dows not work! super must be the same pixel type

DTL2020 commented 1 year ago

So do the workaround with feeding super of 444 to MDegrain solve the issue ?

erazortt commented 1 year ago

Yes that does solve the issue, but I would say that this is more of a circumvention than a solution.

Am 02.03.2023 um 15:54 schrieb DTL2020 @.***>:

 So do the workaround with feeding super of 444 to MDegrain solve the issue ?

— Reply to this email directly, view it on GitHub, or unsubscribe. You are receiving this because you authored the thread.

DTL2020 commented 1 year ago

For better solution some redesign of the internals of mvtools required. Also adding a new option to switch between current and new processing modes may be (to keep old scripts output unchanged and also to not lost performance). The script designers currently may add the option of 444 feed to MDegrain as higher-quality processing mode if for someone the old/vanilla standard mode is not enough. I will try to make tests if it helps to make MPEG-compression after degraining any better (also the used chromaresample filter for both going to 444 and back to 420 may put some difference to MPEG-compression later).

Currently it looks we have no 'external' implementation of the kernel used for resize in mvtools (Wiener). It have also some interesting properties for low overshoot and ringing at non-specially prepared sources while giving sharp enough results.

erazortt commented 1 year ago

Ok so for me the issue is resolved. I understood the problem and I understood the way around the problem. Please correct me if I'm wrong, but for me it seems that a redseign/reimplementation to solve the issue internally would not be faster than your suggested way around the problem using 444. Hence, I don't think it would be worth it to redesign anything. So I would now close the issue.

PS: I mean if you think it can be done faster, I don't want to hold you back of course :)

DTL2020 commented 1 year ago

Well - after more thinking it looks the old mvtools have a small design bug in motion compensation for subsampled formats.

In https://github.com/pinterf/mvtools/blob/d8bdff7e02c15a28dcc6e9ef2ebeaa9d16cc1f56/Sources/MVDegrain3.cpp#L1548

the integer division by 2 with shift cause correct decreasing size of x,y of MV but the 1 LSB of blx and bly coordinates is lost. And this lost bit is not used for fetching more correct sub-plane from refined UV planes set. Edit: "While MSuper for pel=2 also provides enough number of refined planes". - not valid.

So the bugfix looks is easy enough:

  1. Save LSB of blx, bly before integer division with something like bit AND: blx_LSB = blx & 0x1; bly_LSB = bly &1;
  2. Pass this data as an additional params to GetPointer() MVPlane function so it can be used in selecting more precisely the sub-plane required. The current number of subplanes from MSuper is enough so no additional recalculation required - only a small change in logic of selecting subplane using additional prescion bits at https://github.com/pinterf/mvtools/blob/d8bdff7e02c15a28dcc6e9ef2ebeaa9d16cc1f56/Sources/MVPlane.h#L152

where logic of selecting idx of the subplane to fetch must include new blx_LSB and bly_LSB data.

So the issue not need to be closed until better internal solution will be created.

erazortt commented 1 year ago

Oh that would be of course the best solution!

DTL2020 commented 1 year ago

Here is test build - https://drive.google.com/file/d/1DAX7grPursUSvB6R0gzk6uxVSR-4pNYI/view?usp=sharing

Work only for pel=2 and pel=1 and YV12 input. To use new chroma shift mode - add UseSubShift=1 to MDegrainN params (also only in MDegrainN). Also the MSuper may be used with pelrefine=fasle in this mode (to be a bit faster, though only with hardware search in MAnalyse, MAnalyse with UseSubShift=1 typically run much slower onCPU).

Currently used only runtime subshifting so may be slower in compare with MSuper refining but to use MSuper refining to +1 step much more redesign required.

Test if this solve the issue and its quality.

New function added to return sub-shifted chroma block: https://github.com/DTL2020/mvtools/blob/ba90ced487a42ba06ca14dee2522f424c115259e/Sources/MVPlane.cpp#L598

With additional logic to keep LSB of the division to fetch block from lower size chroma plane. Quick test and not all control paths are debugged so with YV24 it may crash.

erazortt commented 1 year ago

I've just tried it, however I'm not getting it to work:

d=org
ssd = d.MSuper(pel=2,pelrefine=false)
vd  = ssd.MAnalyse(isb=true, multi=true, delta=1)

This code fails with the error: "MAnalyse: super clip do not have refined plaes for pel>1 and not compatibe set of optins usd"

and with pel=1, pelrefine=false or pel=2, pelrefine=true it fails also at MAnalyse but by killing the whole process, so I don't see an error message.

I've tried all 4 dll's (W7 vs DX12 and SSE2 vs AVX2). org it the same as in the zip file.

DTL2020 commented 1 year ago

It is sadly sometime reported by users of my builds. I make only simple tests for my own encodings typically 1920x1080 frame size and 8bit YV12.

Can you share a small sample of your file and exact processing script for download so I can try to reproduce the crash ?

It is also one of the reasons to ask pinterf to port most useful and important features to his builds. Because after a long development in my builds may exist some hidden bugs causing crashes at some configurations and I have few time and experience to find them. Though pinterf looks like heavily busy with AVS core and also have a few time to look into many new features of the mvtools in pull-request.

"MAnalyse(isb=true, multi=true, delta=1)"

multi is typically used for MAnalyse with MDegrainN and you not need set 'isb' param. Though it should not be a reason to crash. Typical use cases either MAnalyse(isb=true, delta=1) - single delta frame mv_clip or tr=my_tr MAnalyse(multi=true, delta=tr) - special multi mv_clip for MDegrainN with all delta frames with +-tr (forward and backward).

And about pelrefine=false for MSuper - it is either for MSuper(pelrefine=false, pel=2)# (or pel=4. pel=1 do not use currently refine at all, but need to make +1 level of refined UV planes for subsampled chroma formats in the future even for pel=1). MAnalyse(optSearchOption=5) #hardware DX12-ME search do not use refined subplanes from MSuper or MAnalyse(UseSubShift=1) # use internal runtime subshifting to save some RAM (typically much slower if use onCPU search modes, may be used for optSearchOption=6 for onCPU SAD (or other dismetric) calculation) MDegrainN(UseSubShift=1)

all other configurations must use pelrefine=true (current default) for MSuper. With super clip generated with pelrefine=true any other filter can use UseSubShift=1 (where supported) if required runtime sub sample shifting.

I personally interested in fixing of the chroma blending issue because it can make quality of MPEG encodings better (lower spatial chroma MVs noise and lower encoder output bitrate). So it will be fixed somehow in the future. Down to the pel=4 granularity because it is the natural and the only output MVs precision of DX12-ME hardware search engines today. So require blending precision in UV panes in YV12 formats down to 1/8 sampling grid stepping.

DTL2020 commented 1 year ago

The issue with not best chroma subsampling partially degrages MVs search in MAnalyse too: a=org.Subtitle("A-org")

b=org ssb = b.MSuper(pel=2) ssb2 = b.ConvertToYUV444().MSuper(pel=2) mvsb2=ssb2.MAnalyse(multi=true, delta=tr, blksize=8, overlap=4, search=4) mvsb=ssb.MAnalyse(multi=true, delta=tr, blksize=8, overlap=4, search=4) thSAD1 = 576 bn=b.MDegrainN(ssb, mvsb, tr, thSAD=thSAD1, UseSubShift=1).Subtitle("BN-mvs420") bn2=b.MDegrainN(ssb, mvsb2, tr, thSAD=thSAD1, UseSubShift=1).Subtitle("BN-mvs444") Interleave(a.UToY(),bn.UToY(),bn2.UToY()) Using for review 400% magnification in VirtualDub.

There is still some shift at some blocks (areas) between bn and bn2 clips. The number of shifted blocks are much less in compare with main issue but still exist. Where bn is based on 420 MVs search and bn is based on 444 MVs search. Unfortunately current build still not fixes main chroma shift with blending in 420 - will make a new debug. May be it is required to to go single block debug at shifted area to see how its MV dx,dy is processed for subsampled block fetching.

DTL2020 commented 1 year ago

It looks I understand it - try that pre-release https://github.com/DTL2020/mvtools/releases/tag/r.2.7.45-fix1 . It is build from only fixed pinterf 2.7.45 sources so should be stable also. It not fixes completely (with current MSuper and its pel precision) but residual error should be now symmetrically distributed instead of single-sided twice more amplitude.

It looks general mvtools programming error with subsampled chroma formats: Unbiased integer division error having -0.5 mean error at the UV planes blocks requesting when size of UV planes < size of Y plane (and MVs base scale).

The fix in the pull-request - https://github.com/pinterf/mvtools/pull/60 . If testing is good - recommended for applying. It also fixes MAnalyse (may be MRecalculate too) and MDegrainX/N. All other mvtools better to be revised for the same issue. I.e. all GetPointer(x>>logxratioUV, y>>logyratioUV) calls to UV planes need to be fixed.

DTL2020 commented 1 year ago

Updated fix02 - https://github.com/DTL2020/mvtools/releases/tag/r.2.7.45-fix02 Found some error in MAnalyse for 4:2:2 and added chroma shift fix to MCompensate to use in QTGMC and other MC-denoise scripts based on MCompensate. As I understand this commit is auto-added to pull-request.

Also if merging this pull-request it is highly recommended to also port 'SuperCurrent' second optional input super-clip to MAnalyse - https://github.com/DTL2020/mvtools/blob/ba90ced487a42ba06ca14dee2522f424c115259e/Sources/MVAnalyse.cpp#L165 It is very simple addition and allow to use multi-generation MVs refinement with keeping unchanged source super clip and non-distorted reference. Its usage is simple in GetFrame - https://github.com/DTL2020/mvtools/blob/ba90ced487a42ba06ca14dee2522f424c115259e/Sources/MVAnalyse.cpp#L761

erazortt commented 1 year ago

fix02 works, and I can see that pixels are moved the other way, but its kind of hard to judge if the pixels are now nearer to where they should be. In fact it appears that that is now an overcorrection and just half the way would be perfect.

DTL2020 commented 1 year ago

"In fact it appears that that is now an overcorrection and just half the way would be perfect."

Yes - current fix only make error distribution symmetrical around best point. So maximum position error is now twice smaller. But to reach best point the more redesign of mvtools is required - the 2times more precision of super clip for UV planes (or 2more times precision runtime subshifting). So it only partial quick fix and not best possible solution. If it shows the progress to right way - some next version may finally have dual-pel super clip for subsampled formats (or with user-provided param for UV-plus_one_pel_level - easy enough for pel=1 and pel=2 and require addition of pel=8 processing for pel=4) and again redesigned GetPointer() function for finally accept full-precision blx. bly and Logx, Logy and making better precision fetching of block.

DTL2020 commented 1 year ago

Also possible workaround to increase spatial precision over current pel-limitations: Make enlarged version of input and process it with current available pel-precision. For example 2x enlarged source processed with pel=2 and after downsize back have processing precision equal to pel=4. Though performance in fps may be lower in compare with pel=4 processing in original size. But it can increase precision for currently not-implemented modes. For example pel=4 with YUV 4:2:0 still not have pel=8 for UV but it can be made with 2x upsized processing with pel=4.