Closed chainikdn closed 4 years ago
Thanks. Can you provide log?
T 24448 @ 18732: Start processing output frame 72 at 101553125 ~ 101758125 frameTime 205000 for source 36 Output queue size 1 Front 73 Back 73
T 19100 @ 18732: Get source frame: frameNb 37 Input queue size 3 Front 35 Back 37
T 9828 @ 18732: DBG: GetSourceFrame 38 <-- before srcLock(_sourceFramesMutex)
T 9828 @ 18732: DBG: .... GetSourceFrame ... waiting for .... 38 <-- before _sourceFrameAvailCv.wait(srcLock);
T 10816 @ 18734: DBG 1 <-- entering Receive with _reloadAvsSource=true
T 10816 @ 18734: Frame handler begin flush
T 10816 @ 18735: Frame handler wait for barriers
T 9828 @ 18735: Drain for frame: 38
T 26572 @ 18735: Output sample worker thread 26572 arrive at barrier
T 26500 @ 18735: Input sample worker thread 26500 arrive at barrier
T 25772 @ 18735: DBG: GetSourceFrame 39
T 25772 @ 18735: Drain for frame: 39
<..>
T 24448 @ 18736: Output sample worker thread 24448 arrive at barrier
T 20228 @ 18737: DBG: GetSourceFrame 41
T 20228 @ 18737: Drain for frame: 41
<...>
T 17912 @ 18767: DBG: GetSourceFrame 50
T 17912 @ 18767: Drain for frame: 50
T 23776 @ 18919: Output sample worker thread 23776 arrive at barrier
T 10816 @ 18919: Frame handler done synchronization
T 25108 @ 18919: DBG: GetSourceFrame 51 <--- after EndFlush, _isFlushing=false
T 25108 @ 18919: DBG: .... GetSourceFrame ... waiting for .... 51
T 10816 @ 19343: DBG <-- before _avsScriptClip = invokeResult.AsClip();
avisynth_filter_64.ax!AvsFilter::CAviSynthFilter::ReloadAviSynthScript(const _AMMediaType & mediaType) Line 834 C++
I don't find any recent commit that has _avsScriptClip = invokeResult.AsClip();
at line 834. But I get the problem. The issue is most likely in AviSynth. Every time we start a new script, if there is a prefetcher used in the script, avs env will try to destroy it first, which causes the hang. Previous we were using Avs 3.6+, which allows multiple prefetcher in the env, so we don't have to destroy the current one before restarting the script. But now we are on 3.5, we have to destroy. And all of these are not related to worker thread at all. I had headache with this since beginning. We could solve this by completely destroying avs env, but it will introduce performance penalty.
I could send you a version on top of Avs 3.6, could you test it for me?
But I get the problem. The issue is most likely in AviSynth. Every time we start a new script, if there is a prefetcher used in the script, avs env will try to destroy it first, which causes the hang.
mmm... nope. the problem is AVS clip is still requesting frames via GetSourceFrame() at the point where your condition variable _sourceFrameAvailCv will never trigger. you probaly just need to move _frameHandler.EndFlush() after the ReloadAviSynthScript() call
you probaly just need to move _frameHandler.EndFlush() after the ReloadAviSynthScript() call
You mean like https://github.com/CrendKing/avisynth_filter/commit/32d0262eec36daedaede26ae35c54c77f96e9295#diff-577e0566b5887b0ba83e859a83f0140eL109? I used to do it for draining in GetSourceFrame(), but removed once I no longer get frame from buffer. I've never seen this problem any more, but if it fix your problem I can add it back.
probably... the whole issue is _isFlushing is reverted to false before avs clip destruction something like
_frameHandler.BeginFlush();
ReloadAviSynthScript(m_pInput->CurrentMediaType());
_frameHandler.EndFlush();
fixes this
Try https://github.com/CrendKing/avisynth_filter/commit/0ce598a9348854b272783255aaaa1e2a45e4bd38
BTW, I changed how log file is produced. It's no longer controlled by a sneaky preprocessor, but by a registry value "LogFile". If the value is not empty, it contains the file path. This way even a release build can generate log on demand.
Or even https://github.com/CrendKing/avisynth_filter/commit/f468764a4f11f819eb4d17f9b91fba7552c4f6dd. There is double flush in previous commit. Since most of the time BeginFlush and EndFlush comes together, why not consolidate them?
why "if (!_avsEnvAtLeastV8)" ? it still hangs with 3.6, obviously
plus now it crashes at video player closing :D because you call ReloadAviSynthScript() when _avsEnv is already null
After some thinking, I'm considering abandoning the "multiple worker threads" idea due to the serialized nature of how the frames work. Let me document my reasoning for future reference.
The general workflow of the transform filter is this:
outFrame[n].startTime === outFrame[n - 1].stopTime === outFrame[n - 1].startTime + outFrame[n - 1].duration
. There should be no gap in between. The duration of each frame could vary for VFR (variable frame rate) videos.When a renderer filter receives a sample, it schedules rendering based on the time stamp. If the sample arrives late, or has no time stamp, the filter renders the sample immediately. Otherwise, the filter waits until the sample's start time before it renders the sample."
most renderer can't deal with out-of-order delivery, regardless of how much ahead of time the sample is delivered. For example, EVR always renders samples as they arrive. EVR Sync drops out-of-order samples. And madVR just freezes.
Because of #2, all input threads must be synchronized to calculate the correct frame times. Because of #3, the delivery procedure must be synchronized across all output threads. And too much synchronization would mostly defeat the benefit of scalability of multiple worker threads design.
On the other hand, I'm still thinking having one input worker thread + one output worker thread + COutputQueue is beneficial, thanks to the flexibility to adapt to quality changes mentioned in #1.
EDIT: Look like madVR is not compatible with COutputQueue. It just freezes.
I'd say that input threads are not necessary at all, single threaded input in the main thread is ok. But the output frames calculations (i.e. "waiting for the Avisynth") and copying/interleaving must be done in a separate thread. Is it always enough to have one "output" threads, or some cases / system configurations will need several ones - I dunno, need more testing... You told you clearly see huge perf. improvement - when, where, how?
BTW, not related at all - could you add a non-ssse3 path for interleaving? a few people still have Athlon/Phenom CPUs... I'm obviously not talking about 4k playback, but at least it must not crash with "illegal instruction", so plain C implementation would be ok.
will need several ones
If say one input sample generates 7-8 output samples, and the transformation is resource intensive, having multiple output workers could scale, but again, they all have to synchronize on delivery. I could make the number of output threads configurable, and default to 1.
when, where, how?
If I play a 4K video, set SVP block size to 16 or even 8, having one thread (0.6.1) is guaranteed to be choppy. Having worker thread could deliver smooth playback. But this is extreme corner case. I don't think we should optimize the whole logic for extremity.
Athlon/Phenom CPUs
Seriously? How do they even use SVP efficiently? Can't you ask them to upgrade hardware? There are still around 1.5% Windows XP users out there. Should we also support Windows XP?
How do they even use SVP efficiently?
why not, phenom x6 used to be a good CPU, totally enough for 1080p...
Should we also support Windows XP?
this is different case. and no, we shouldn't :)
If I play a 4K video, set SVP block size to 16 or even 8, having one thread (0.6.1) is guaranteed to be choppy.
there may be another reason for this... the "master" branch doesn't wait in GetFrame() in case the requested frame is not here yet, it returns the last frame instead
Please try https://github.com/CrendKing/avisynth_filter/commit/c4f05028804e1c22b764bb96fa4cfac0e044e978.
there may be another reason for this... the "master" branch doesn't wait in GetFrame() in case the requested frame is not here yet, it returns the last frame instead
It is one of the reason why I started the worker thread design, but I know the difference of performance choppy and the "last frame" choppy.
could you add a non-ssse3 path for interleaving?
Please create separate ticket for tracking purpose.
but I know the difference of performance choppy and the "last frame" choppy
I just can't imagine how that 8-vs-16-blocks choppy could depend on threading in AVSF. It's weird...
Multi threading scales well for IO intensive tasks. It depends on many factors, including hardware, renderer and source. Say if the renderer already takes long time to receive samples, putting extra milliseconds on computing the flow vectors on the same thread would intensify the choppy symptom. On the other hand, if the computation is happening on a different thread, a blocking Deliver() thread would not impose much penalty since CPU is fully utilized.
If user's setup has very little latency on the IO side, then the task becomes CPU intensive. Multi threading usually would not help since it should already fully saturate CPU (assuming AVS/SVP themselves has enough threads to saturate).
putting extra milliseconds on computing the flow vectors on the same thread would intensify the choppy symptom
Wrong, Avisynth runs its own threads (many of them, SVP usually sets 25 threads), it doesn't connected with AVSF in any way except (!) for requesting input frames. And all those calculations are done in advance via Prefetcher.
Like I said, it depends on many factors. I have my own test scripts, which varies by having and having no prefetcher, varying thread count for AVS and SVP. If you are talking about the curated default script of SVP, then you could be right.
So, when you're talking about "huge improvements" you actually ran SVP's script without prefetcher? o_O
With prefetcher, but far fewer than 25 threads. I do test without prefetcher though.
So, to summarize what I personally want/need regarding AVSF performance. I just want to be sure that AVSF doesn't cap output performance at some "low" value like 100 fps for 4K/10-bit. There're a lot of possible bottlenecks in other places though, but lets make sure it's not AVSF's fault.
cap output performance at some "low" value like 100 fps for 4K/10-bit
Let's avoid vague statement like this. If you want particular performance watermark, then we need to develop repeatable and measurable benchmark. We need a fixed set of videos, fixed set of scripts, video player and upstream/downstream filters. Measure delivery FPS from AVSF and record it for each version. This way if there is degradation we can pinpoint the culprit. How's that sound?
__debugbreak triggers on every script reloading
we need to develop repeatable and measurable benchmark
a simple, non-SVP script, probably just resizing to a large value like 8k*8k, plus null renderer (??), plus a few perf. counters inside AVSF
tested MPC-HC + EVR-CP with maximum relaxed resizer settings - (GTX 2070, Ryzen 7) simple script, varying frame size and frame rate values:
SetFilterMTMode("DEFAULT_MT_MODE",2)
SetFilterMTMode("AvsFilterSource",3)
AvsFilterSource()
PointResize(4096,4096)
ChangeFps(240)
Prefetch(10)
so... single threaded "master" gives stable ~210 fps for 4k4k 8b frame. which is totally enough for now and in the next 10 years. hmmm. regarding "worker": it shows ~275 fps but with a huge "Avg sync offset" (between 100 and 900 ms), plus it hangs sometimes
much heavier script:
ChangeFps(240)
LanczosResize(4096,4096)
gives ~160 fps with both versions and 100% CPU load
overall, it looks like single-threaded AVSF is NOT a bottleneck at least until ~200 fps and it makes me feel a bit confusing... :D
__debugbreak triggers on every script reloading
we need to develop repeatable and measurable benchmark
a simple, non-SVP script, probably just resizing to a large value like 8k*8k, plus null renderer (??), plus a few perf. counters inside AVSF
tested MPC-HC + EVR-CP with maximum relaxed resizer settings - (GTX 2070, Ryzen 7) simple script, varying frame size and frame rate values:
SetFilterMTMode("DEFAULT_MT_MODE",2) SetFilterMTMode("AvsFilterSource",3) AvsFilterSource() PointResize(4096,4096) ChangeFps(240) Prefetch(10)
so... single threaded "master" gives stable ~210 fps for 4k4k 8b frame. which is totally enough for now and in the next 10 years. hmmm. regarding "worker": it shows ~275 fps but with a huge "Avg sync offset" (between 100 and 900 ms), plus it hangs sometimes
much heavier script:
ChangeFps(240) LanczosResize(4096,4096)
gives ~160 fps with both versions and 100% CPU load
overall, it looks like single-threaded AVSF is NOT a bottleneck at least until ~200 fps and it makes me feel a bit confusing... :D
SVP 4K has choppy playblack experience. AMD Fluid motion fine, I don't think AMF copies the frame around many times.
much heavier script:
SetFilterMTMode("DEFAULT_MT_MODE",2) SetFilterMTMode("AvsFilterSource",3) AvsFilterSource() ChangeFps(240) LanczosResize(4096,4096) Prefetch(10)
gives ~160 fps with both versions and 100% CPU load
Hello. Sorry to get back to you a bit late. You actually provided a good example to test with. Here's my result.
Again, my CPU is Ryzen 7 3700X.
With the exact "heavy script",
All tests have no choppy audio, no out-of-sync audio, no hanging. Seeking works without any problem. Process frame number and output frame number are within reasonable difference. One thing to note that the debug version of the filter performs very poorly in this case.
Ironically I found that with the single thread version, the actual FPS is 248.3. The FPS is correctly 200 in worker thread version.
If you concern about concurrency performance, first the filter will default the thread count to 1 if no registry value overrides it, and second you can always explicitly set the thread count to 1.
I've tested the current worker thread version for about a month now, and haven't seen any problem. So I'm going to merge it to master, unless you find any bug or other concerns.
AMD Fluid motion fine, I don't think AMF copies the frame around many times.
AMF is different because everything happens within graphic card. AviSynth always requires CPU to process frames, so we have to copy frame buffer in and out of graphic memory.
So I'm going to merge it to master, unless you find any bug or other concerns.
actually you may notice I deployed 0.7.0 via SVP update two days ago ;)
Sorry I didn't until you just mentioned. I assume you built from the worker branch? Any bug report or feedback from your users? If everything works fine, I'm going to merge today.
https://www.svp-team.com/forum/viewtopic.php?pid=76862
I'm getting the error in the following screenshot when I close potplayer after playing a video with svp off. It will even give me this error if e.g. i turn off DXVA while a video is playing. Potplayer + madvr, updated to AVSF using update tool in SVP manager. Manually updating to Avisynth+ 3.6.1 did not help https://i.imgur.com/C8455LL.png EDIT: Found workaround, if anyone is having the same issue, simply use AVS+ 3.6 and downgrade the filter to v0.60
I assume it is the same as https://github.com/CrendKing/avisynth_filter/issues/9? And for 0.6.1, not worker branch right? I can't reproduce the problem, so it'd be nice if they can try the worker branch and set "LogFile" in registry to generate a log. In 0.6, log file needs to be toggled in source file.
ah, yeah, looks the same, sorry :D
there's possible another crash, with MediaPortal (32-bit only) + madVR + HDR video but I can't reproduce it either
anyway, 0.7.0 is better than 0.6.0 so I'd merge it right now
=== btw there was a video that breaks (freezes) 0.6.0 due to unusual frame time stamps, I suppose. It works in 0.7.0 but probably there's another room for improvement here, if you want you can check this out: https://yadi.sk/i/2hxUFhuq3hRBBQ
I assume it is the same as #9? And for 0.6.1, not worker branch right? I can't reproduce the problem, so it'd be nice if they can try the worker branch and set "LogFile" in registry to generate a log. In 0.6, log file needs to be toggled in source file.
Here is log generated by adding "LogFile" in regedit
https://drive.google.com/file/d/19lueaItRIHeC_423yTl2FhNbPkxIFBCZ/view?usp=sharing
there was a video that breaks (freezes) 0.6.0
I tried 0.6.0, 0.6.1 and 0.7.0 with SVP doing 2x FRC. None of them freezes. Do I need to somehow seek in this short video to trigger it, or it always happens at beginning? If you can trigger it all the time, log file would be helpful.
This could be a bug still exist in 0.7.
0.7.0 is better than 0.6.0 so I'd merge it right now
No need to rush. I'll try to solve https://github.com/CrendKing/avisynth_filter/issues/9 (possibly a non-issue) then merge.
None of them freezes
sorry, forget to mention in freezes in mpc-be (!) not in mpc-hc. and w/o svp at all.
for some reason 0.7.0 doesn't register/load in Win7 x64 the dependency walker report is there (see the last message): https://www.svp-team.com/forum/viewtopic.php?pid=76877#p76877
All the missing dependencies come from IE related stuff (from shell32.dll), which does not exist in my Windows 10. It is transitive dependency, so I can't do anything about it. He should update with latest VC runtimes. Projects like https://github.com/abbodi1406/vcredist allows you to do all versions AIO, but I don't take responsibility if anything goes wrong. It is not official from Microsoft after all.
Also try Windows Update see if there's any patch.
The thing is 0.6.1 (built by me) works and 0.7.0 built in the very same environment doesn't. And there're TWO people reporting this, and SVP always installs the latest VC runtimes. Strange...
Can you build 0.7.0 yourself and ask them to try? If it works, then we can compare the dependency of our builds, see difference.
It could be our Windows version, VS2019 version, or runtime version.
BTW, you sure you have AviSynth.dll in the same directory as the filter right?
BTW 2, in the next version, I can make a change so that it dynamically load AviSynth.dll instead of static link. It has two pros:
I am experiencing the same problem with v0.7.0 builds on Win7 x64 with both 32bit & 64bit versions and installing newest runtimes from the AIO pack mentioned above made no difference. The only consistent difference i see in my Dependency Walker Images is the red KERNEL32.DLL (SetThreadDescription) that would seem to correspond to the line: "Error: At least one module has an unresolved import due to a missing export function in an implicitly dependent module." Other than that v0.6.1 builds are showing the same warnings despite having no problems registering the filters. I hope this helps with the troubleshooting. PS. I am also unable to load the v0.7.0 filters directly from PotPlayer.
yep, SetThreadDescription it is
OK. I disable that call in release build. Try avisynth_filter.zip
OK. I disable that call in release build. Try avisynth_filter.zip
It works. In addition to registering correctly, loading the filter directly works as well.
in freezes in mpc-be (!) not in mpc-hc. and w/o svp at all.
@chainikdn My main player is MPC-BE and it works fine. And when you say w/o svp, does it involve svpflow?.dll at all or something simple like
AvsFilterSource()
return Prefetch(8)
is sufficient to trigger? I tried both and everything works.
Can you do it and give me the log file when it crashes? Thanks.
While the version above works fine, the newly released v0.7.1 is giving me this error: Once closed, PotPlayer loads normally, aside from AviSynth Filter not being loaded...
@Nuihc88 Where is your AviSynth.dll located? You might find one in your PotPlayer directory. If not, move the one with the filter there and try again.
If problem persists, Can you zip your AviSynth.dll and upload here?
Like the readme says, you need to place the AviSynth.dll in the PotPlayer directory now.
I already have. Tried a few different versions at different directories already, 'player folder', 'filter's folder' & 'system folder', but this is the one i normally use: AviSynth.zip - I'll try with the one bundled with SVP next... EDIT: Problem persists with v3.5 r3043 bundled with SVP, so i don't think it's about the version... EDIT2: If i remove AviSynth.dll files from 'player folder' and 'system folder' a different error is shown: ...So i don't think it's about the path(s) used either.
I don't know why CreateScriptEnvironment2 has different exported names between x64 and x86 builds. AviSynthFilter.zip should work regardless. Once you confirm, I'll release 0.7.2.
The build above appears to be working correctly.
Was double-checking and noticed that the older AviSynth+ version bundled with SVP is now giving this error: I guess there could still be any number of similar issues, all with essentially the same cause, but only affecting specific configurations...
The minimum supported AviSynth+ version is 3.5.x. Which versions work and which not? Please include both 32/64 bits and version number.
EDIT: Try AviSynthFilter.zip
Quickly change SVP settings several times, for example move the "quality/perf" slider ---> video player will hang.
It hangs at
_avsScriptClip = invokeResult.AsClip();
... which means it hangs at destruction of previous clip ... which means it's locked at _sourceFrameAvailCv
_avsScriptClip is still alive after FrameHandler::EndFlush() and GetSourceFrame() still can be called.