pinterf / TIVTC

TIVTC and TDeint
57 stars 9 forks source link

Add frame property "TDecimateCycleMetrics" #47

Closed flossy83 closed 7 months ago

flossy83 commented 8 months ago

I wanted to access TDecimate's cycle metrics in my .avsi script to see if there was more than 1 duplicate in the cycle as I wanted to tell if the duplicate resulted from an instantaneous field cadence break or whether the camera is just being held very still.

Test script for the pull request:

# make some framediff metrics using audio histogram
ColorBars().ConvertToYV12().Histogram(mode="stereo").KillAudio().Trim(0,900)

# overlay TDecimate's metrics
TDecimate(display=true, cycle=5)  # try changing cycle as well to confirm PROPAPPENDMODE_APPEND behaviour

# overlay retrieved frame properties
ScriptClip("PrintProps(last, current_frame)", after_frame=true, local=false)

function PrintProps(clip c, int current_frame){

    metrics = propGetAsArray(c, "TDecimateCycleMetrics")
    string = "\n\n\n"

    for (i=0, ArraySize(metrics)-1){
        string = string + "         --> " + String(metrics[i], "%0.2f") + "\n" }

    c.Text(string, lsp=0, align=7, size=20, bold=true)
}

Yellow numbers are the props retrieved in Avisynth and should match the numbers in white

debug avs_snapshot_00 00 000

flossy83 commented 8 months ago

There is a bug with my pull request that causes it to crash when using it in conjunction with multithreading and ScriptClip. To reproduce it:

ColorBars().ConvertToYV12().Histogram("stereo").KillAudio()
global clip1 = TDecimate(last, vidthresh=1.0, display=false).Prefetch(1)  # Prefetch(0) or display=true prevents the crash
global clip2 = TDecimate(last, vidthresh=2.0, display=false).Prefetch(1)  # Prefetch(0) or display=true prevents the crash
ScriptClip(clip1, """clip2""", after_frame=true, local=false)

If I remove the

if (has_at_least_v8) {
    AVSMap* props = env->getFramePropsRW(dst);
    for (int i = curr.cycleS; i < curr.cycleE; ++i)
        env->propSetFloat(props, PROP_TDecimateCycleMetrics, curr.diffMetricsN[i], i ? AVSPropAppendMode::PROPAPPENDMODE_APPEND : AVSPropAppendMode::PROPAPPENDMODE_REPLACE);
  }

Then it doesn't crash, so I must be doing something wrong in the way I added the frame props - any help appreciated.

pinterf commented 8 months ago

For the first check, it is done in the correct way, though I'd use propSetFloatArray after copying the data into an array of double. Anyway, please leave this code as-is, because I'd like to reproduce the crash, it's not normal.

flossy83 commented 8 months ago

Oops it seems I neglected to make the properties writable...after adding these lines it's no longer crashing for me

if (has_at_least_v9)
      env->MakePropertyWritable(&dst);
else
      env->MakeWritable(&dst);

Sorry!

pinterf commented 8 months ago

The previous change to TDecimate (Issue #43) used a very large cycle as an example. For instance, TDecimate(mode=0,cycleR=7779,cycle=38895,display=true,sdlim=-4) How do you evaluate the performance impact of adding so much data to a frame property? Either in present form (one-by-one) or changing it to call a single env->propSetFloatArray with a pre-filled double array?

flossy83 commented 8 months ago

The previous change to TDecimate (Issue #43) used a very large cycle as an example. For instance, TDecimate(mode=0,cycleR=7779,cycle=38895,display=true,sdlim=-4) How do you evaluate the performance impact of adding so much data to a frame property? Either in present form (one-by-one) or changing it to call a single env->propSetFloatArray with a pre-filled double array?

Sorry for the late reply, and yes I see the issue - a very large cycle value would almost certainly break something.

From what I can tell the second iteration of the for loop which populates the the array switches to PROPAPPENDMODE_APPEND which automatically converts the existing float prop to a floatarray in order to append the next value, then at the the next frame i=0 (false) so PROPAPPENDMODE_REPLACE replaces the entire array with a single float, and the whole thing repeats. Hopefully that shouldn't result in memory leak if PROPAPPENDMODE_REPLACE properly removes the previous array from memory - does it tho?

Off the top of my head... easiest solution is to just hard limit it to a maximum size, say, slightly below the maximum size of a prop (I presume there's some maximum size .. I looked here but couldn't find it).

Anyway, it's going to have to be limited to SOMETHING, so maybe...

// for (int i = curr.cycleS; i < curr.cycleE; ++i)
   for (int i = curr.cycleS; i < (curr.cycleE - curr.cycleS > limit ? curr.cycleS + limit : curr.cycleE); ++i)

What do you think?

flossy83 commented 8 months ago

Seems to work... for example with cycle=10 and limit=5:

debug avs_snapshot_00 00 000

I'm not sure what to set limit to though.

flossy83 commented 8 months ago

I did some benchmarks and it seemed like a limit value of 50-100 was optimal but after a few hours I realised my benchmarks were all wrong because the TDecimate(display=true) and Text() overlays were eating all the CPU at those crazy high cycle counts.

Also I realised my code is needlessly convoluted as the metrics can be put directly into the prop with just a single line env->propSetFloatArray(props, PROP_TDecimateCycleMetrics, curr.diffMetricsN, curr.length); (I had mistakenly thought the diffmetricsN array was indexed by frame number but it's just an array with length=cycle)

But if you still want to put a limit on the props then it's going to need to be more convoluted again. With cycle=48000 and overlays disabled there doesn't seem to be much performance difference, but it's difficult to tell as 48000 frames have to get processed in advance and that messes with ffmpeg's measurements. But the final "speed=" figure from ffmpeg at end of processing with cycle=48000 was 212x with no props and 151x with props (all text overlays disabled). I'll do some more testing...

flossy83 commented 8 months ago

Huh that's weird, if I do

ColorBars().ConvertToYV12().Histogram(mode="stereo").KillAudio()   # 1 hour clip
TDecimate(display=false, cycleR=8000, cycle=48000)

I only get a 17 second clip.

Other high values also create weirditude like a 0 second clip or crash, so I really don't think TDecimate it's supposed be used like that. I'm not sure why the user needs to set the cycle to such crazy high values, as long as the ratio between cycle and cycleR is the same it will decimate the same number of frames at lower values, so 8000 & 48000 is the same as 1 & 6. It sounds like they just want to be able to view all the framediffs on screen at once or something?

flossy83 commented 8 months ago

or changing it to call a single env->propSetFloatArray with a pre-filled double array?

Also I realised my code is needlessly convoluted as the metrics can be put directly into the prop with just a single line env->propSetFloatArray(props, PROP_TDecimateCycleMetrics, curr.diffMetricsN, curr.length)

Aaaaaand I just realised that's what you were getting at all along. And we didn't even have to prefill it. Talk about going on a wild goose chase - doh!

flossy83 commented 8 months ago

I needed to also know the frame number corresponding to each metric. I found it could be done in Avisynth script, but it relies on current_frame which may not necessarily be equal to what TDecimate sees as the current frame, for example when seeking the two will go out of sync temporarily, or if the source filter is not "frame accurate".

Have added frame numbers to properties in the most efficient way I could figure out. I'm noob to C++ so hopefully you could look over my changes and let me know it's not leaking memory.

Regarding performance, the plugin itself does many for loops where the number of loop iterations is equivalent to the cycle value, so running crazy cycle values like 48000 is going to set off a huge number of those for loops anyway.

flossy83 commented 8 months ago

Oh boy this is really turning into a nightmare. My last changes work as intended when doing PropShow() but crashes with propGetAsArray().

I'm able to fix it with the following changes but honestly I don't have a clue why it fixes the issue.

Coming from noob languages like AHK & python I found it really difficult just to create an array and reference it as a pointer properly. I can see Tritical is using (double *)malloc(sizeof(double)); to initialise an array whereas I'm doing new int64_t[n] . I'm only initialising it once when it's equal to nullptr so hopefully no memory leaks, I'm aware there is a delete thingy but I don't think it's needed if I only have initialised it once.

pinterf commented 8 months ago

To prevent accidental memory leaks usually std::vector<double> x; and std::vector<int64_t> y; (in these specific cases) are the recommended way instead of issuing new manually. The vectors (arrays) can be resized before your actual use, and will properly deallocated automatically. The actual array pointer e.g. double * can be passed to functions require a pointer, and even can be used for direct access. The pointer can be obtained by x.data() (I'm gonna have more time next week)

flossy83 commented 8 months ago

Ok cool. I was just reading up on that here but still trying to get my head around it.

Well, the array size will remain the same through the clip so maybe it's a bit overkill to use a vector?

But first I really need to learn how Avisynth plugins work so that I can find out where the "entry point" actually is to find the most optimal location to put a new array. But I think putting in the Cycle class makes sense cause the metrics are a property of the Cycle and that's where Tritical is putting all the other metrics too which keeps it consistent.

So to summarise my proposed changes as it's probably gotten obfuscated in all my messy incremental replies...

  1. Add the frame diff metrics. This one is easy cause Tritical already has an array curr.DiffMetricsN which can be dropped straight into the frame property with a single line: env->propSetFloatArray(props, PROP_TDecimateCycleMetrics, curr.diffMetricsN, curr.length); We didn't need to create or modify anything else, it just goes straight in, boom done.

  2. Add the frame numbers corresponding to the metrics above. This one is a bit more tricky. Ideally I would have combined them both into a single 2d array but that doesn't seem possible as Avisynth interpreter doesn't support 2d array frame props. So it has to go into it's own separate int array and be added as a separate frame property , which is a bit inelegant but ugh it's good enough as long as it's fast and reliable and doesn't crash those are the main things. We have to populate it manually in a for loop in GetFrame though, so this for loop will have to execute once per frame, adding a small amount of overhead for typical use case scenarios where cycle isn't set to some bonkers value like 48,000.

flossy83 commented 8 months ago

Alright so it might be "best practices" to just follow the way Tritical is creating and resetting his own arrays, i.e using malloc() and free() in the constructor/destructor of Cycle. And setting the array in the same for loop that Tritical uses to set metrics, if that helps performance a little bit. Here are the edits for that...

pinterf commented 8 months ago

O.K. keep the conventions, that part does not use vectors. mallocing in ctor and free in the desctructor.

flossy83 commented 8 months ago

It's a bit confusing cause the updates I just posted are showing code difference vs my own fork, so it's unclear what the updates are vs your official repo.

Do you want me to do a new Pull Request so it will show you only the differences between yours and mine?

flossy83 commented 8 months ago

Oh wait no I just realised in the "Files Changed" tab it shows only the diffs between yours and mine.

flossy83 commented 8 months ago

For completeness I understand now the entry point is TDecimate::TDecimate .

I was also confused about how the constructor was getting called cause there was no new Cycle but it seems even just defining a class variable such as doing just Class curr will call the constructor. And I can see he is using the memory address &current to refer to the instance that was passed to calcMetricCycle(). I watched some youtube videos on C++ the other day to learn some of this stuff, and the A.I Question tool is super helpful. I learned also the difference between a::b , a.b and a->b which was something I was always confused about.

What I'll do now is download my repository, compile it in VS and then try to break it with multithreading and scriptclips and such.

flossy83 commented 8 months ago

Well I reverted my fork to sync with yours as I gave up trying to copy Tritical's memory management pathways.

He's doing some complex stuff which I don't undertand, eg. shifting references from the current frame to the previous frame which ends up setting all the values in my array back to 0, if I try to copy exactly the same memory management pathways of his diffMetricsN array (such as malloc/free in the Cycle constructor/destructor).

I think it would actually be dangerous for me to assume that if I duplicate his code pathway for another array that it would all work out. I think it's safer if I just manage my own array as a separate entity that doesn't interact with any of Tritical's memory management stuff.

I'm not sure where to free my malloc'd memory for my array. It's not really necessary to free it since there is only 1 array being created once, but I'd still like to know where I could free it when the plugin shuts down. I couldn't find any documentation on that, eg. is there an "OnExit" type function that Avisynth calls?

pinterf commented 8 months ago

Deallocate in ~TDecimate. Thing like standard c++ classes are RAAI. Tritical's original code was heavily refactored, e.g. here, vector is used, and it will be freed automatically. In original code it was malloc'd in TDecimate::TDecimate, and was deallocated with free (or delete[] when was allocated with new) in ~TDecimate. Part of these changes were done here: dbe4b5a3.

flossy83 commented 8 months ago

Ok cool, I wasn't sure when and where ~TDecimate was going to be called as I didn't know if the plugin creates/destroys multiple TDecimate objects while its running. By the sounds of it it only creates a single TDecimate object on startup and destroys it once on shutdown, unlike the Cycle objects which seem to destruct at the end of every cycle of frames if I'm not mistaken.

flossy83 commented 8 months ago

I needed to get the metrics of the previous and next cycles as well in case the duplicate happens to land on the first/last frame of the cycle so I can compare it to both its neighbouring frames as part of determining how likely it is to be a real duplicate or just the camera is still.

The plugin already keeps Cycle objects prev curr next so the performance impact is negligible/nonexistent - tested with "c:\program files\ffmpeg\bin\ffmpeg.exe" -i "C:\debug.avs" -f null NUL with just ColorBars().ConvertToYV12().Histogram(mode="stereo").KillAudio().TDecimate()

1.0.28 Official orig

Frame properties build (Visual Studio 2019, Release, x64, v142 toolset) new

Debug script

# make some framediff metrics using audio histogram
ColorBars().ConvertToYV12().Histogram(mode="stereo").KillAudio() 

# overlay TDecimate's metrics
TDecimate(display=true, cycle=5) 

# overlay retrieved frame properties
ScriptClip("PrintProps(last, current_frame)", after_frame=true, local=false)

function PrintProps(clip c, int current_frame){

    metrics = propGetAsArray(c, "TDecimateCycleMetrics")
    fnums = propGetAsArray(c, "TDecimateCycleFrameNums")

    metricsP = propGetAsArray(c, "TDecimateCycleMetricsPrev")
    fnumsP = propGetAsArray(c, "TDecimateCycleFrameNumsPrev")

    metricsN = propGetAsArray(c, "TDecimateCycleMetricsNext")
    fnumsN = propGetAsArray(c, "TDecimateCycleFrameNumsNext")

    string = "\n\n\n"

    for (i=0, ArraySize(metrics)-1){        
        arrow = "                -> "
        curr = String(fnums[i]) + ": " + String(metrics[i], "%0.2f")
        prev = String(fnumsP[i]) + ": " + String(metricsP[i], "%0.2f")
        next = String(fnumsN[i]) + ": " + String(metricsN[i], "%0.2f")
        string = string + arrow + curr + ", " + prev + ", " + next + "\n"
    }

    string = string 
    \ + FillStr(StrLen(arrow)," ") + "Current"
    \ + FillStr(StrLen(curr)-5," ") + "Previous"
    \ + FillStr(StrLen(prev)-6," ") + "Next"

    c.Text(string, lsp=4, align=7, size=16, bold=true)
}

Resource check after 5 min playback

perf

Check first/last frame when prev/next frame doesn't exist (-20 as they should be)

begin end

edit: redid the ffmpeg perf test as it seems to get slightly faster after running it several times in a row - possibly windows RAM cache kicking in

pinterf commented 7 months ago

(Please allow me some additional time for the overview. I’m quite busy with office work, and I typically don’t turn on my PC during weekends)

flossy83 commented 7 months ago

Added the blend status (0/1 = true/false) of the current cycle as that can be useful info in certain situations.

pinterf commented 7 months ago

Well, the stormy skies have lifted from above my head, I'm back, thanks for the patience.

Could you put some words in the documentation as well? Perhaps you can put it somewhere here https://github.com/pinterf/TIVTC/blob/master/Doc_TIVTC/TDecimate%20-%20READ%20ME.txt#L844 With the necessary explanations, and maybe with an Avisynth usage example.

I'm gonna handle the file and the whole pack versioning (in the code - TDecimate and in the documentation) as well before the release. along with the change log.

flossy83 commented 7 months ago

Done, thanks

flossy83 commented 7 months ago

I wanted to do this as a separate pull request but couldn't figure out how....

TDecimate has this edge case option noblend which defaults to true which means if there are 2 dupes in the cycle and neither is next to a scenechange, it will NOT drop one of the dupes and replace the other with a 50-50 blend of its neighbours.

To my eyes a 50-50 blend isn't as good as it sounds - it doesn't hide the dupe at all really and the jerk is still very pronounced. To hide the jerk with blends needs around 8 blended frames in my tests, so the feature is pretty much useless and I just want to disable it globally.

Unfortunately you can't set noblend=true when hybrid=1 , only when hybrid=0, so this change allows it to work when hybrid=1 as well.

I've implemented it to retain backwards compatibility: when hybrid=1, noblend defaults to false, thus keeping the previous behaviour. Only if the user manually sets noblend to true, and hybrid=1, only then will the new behaviour will come into effect (the new behaviour being NOT doing a 50-50 blend when there are 2 dupes in the cycle and neither is next to a scenechange).

pinterf commented 7 months ago

The noblend behavior needs to be commented in the documentation as well.

flossy83 commented 7 months ago

Thanks I have done that here but can't get it to show up in this pull request thread, probably because it's closed

pinterf commented 7 months ago

Thank you, those changes were integrated, see comment on closing issue #48. 1.0.29 released for now.