AviSynth / AviSynthPlus

AviSynth with improvements
http://avs-plus.net
958 stars 72 forks source link

Thread safe way to reload script with prefetcher? #180

Open CrendKing opened 4 years ago

CrendKing commented 4 years ago

Suppose I have a script like this:

MySourceFilter()
GenerateSomeFrames(thread=4)
Prefetch(4)

I use the C++ API IScriptEnvironment::Invoke("Import", ...)to load the script and call GetFrame(). Everything is fine at first. Then suppose I change the script a bit (e.g. add a Subtitle() in the middle) and I want to reload the script without restarting the whole process.

If I simply repeat the Invoke(), sometimes I get crash with stack like this (not every time):

ucrtbased.dll!free_dbg_nolock(void * const block, const int block_use) Line 908
ucrtbased.dll!_free_dbg(void * block, int block_use) Line 1030
AviSynth.dll!operator delete(void * block) Line 38
AviSynth.dll!operator delete(void * block, unsigned __int64 __formal) Line 32
AviSynth.dll!std::_Deallocate<16,0>(void * _Ptr, unsigned __int64 _Bytes) Line 195
AviSynth.dll!std::_Default_allocator_traits<std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>>>::deallocate(std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>> & __formal, std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *> * const _Ptr, const unsigned __int64 _Count) Line 667
AviSynth.dll!std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>::_Freenode0<std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>>>(std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>> & _Al, std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *> * _Ptr) Line 312
AviSynth.dll!std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>::_Freenode<std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>>>(std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>> & _Al, std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *> * _Ptr) Line 318
AviSynth.dll!std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>::_Free_non_head<std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>>>(std::allocator<std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *>> & _Al, std::_List_node<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,void *> * _Head) Line 330
AviSynth.dll!std::list<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>>::_Tidy() Line 1442
AviSynth.dll!std::list<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>>::~list<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>>() Line 1049
AviSynth.dll!std::_Hash<std::_Umap_traits<char *,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>,std::_Uhash_compare<char *,std::hash<char *>,std::equal_to<char *>>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>,0>>::~_Hash<std::_Umap_traits<char *,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>,std::_Uhash_compare<char *,std::hash<char *>,std::equal_to<char *>>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>,0>>()
AviSynth.dll!std::unordered_map<char *,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>,std::hash<char *>,std::equal_to<char *>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>>::~unordered_map<char *,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>,std::hash<char *>,std::equal_to<char *>,std::allocator<std::pair<char * const,std::_List_iterator<std::_List_val<std::_List_simple_types<char *>>>>>>()
AviSynth.dll!ObjectPool<LruCache<unsigned __int64,PVideoFrame>::LruEntry>::~ObjectPool<LruCache<unsigned __int64,PVideoFrame>::LruEntry>() Line 127
AviSynth.dll!LruCache<unsigned __int64,PVideoFrame>::~LruCache<unsigned __int64,PVideoFrame>()
AviSynth.dll!LruCache<unsigned __int64,PVideoFrame>::`scalar deleting destructor'(unsigned int)
AviSynth.dll!std::_Destroy_in_place<LruCache<unsigned __int64,PVideoFrame>>(LruCache<unsigned __int64,PVideoFrame> & _Obj) Line 243
AviSynth.dll!std::_Ref_count_obj2<LruCache<unsigned __int64,PVideoFrame>>::_Destroy() Line 1504
AviSynth.dll!std::_Ref_count_base::_Decref() Line 651
AviSynth.dll!std::_Ptr_base<LruCache<unsigned __int64,PVideoFrame>>::_Decref() Line 883
AviSynth.dll!std::shared_ptr<LruCache<unsigned __int64,PVideoFrame>>::~shared_ptr<LruCache<unsigned __int64,PVideoFrame>>() Line 1132
AviSynth.dll!PrefetcherPimpl::~PrefetcherPimpl() Line 89
AviSynth.dll!PrefetcherPimpl::`scalar deleting destructor'(unsigned int)
AviSynth.dll!Prefetcher::Destroy() Line 143
AviSynth.dll!Prefetcher::~Prefetcher() Line 150
AviSynth.dll!Prefetcher::`scalar deleting destructor'(unsigned int)
AviSynth.dll!IClip::Release() Line 614
AviSynth.dll!PClip::Set(IClip * x) Line 632
AviSynth.dll!PClip::OPERATOR_ASSIGN1(const PClip & x) Line 648
avisynth_filter.ax!PClip::operator=(const PClip & x) Line 1131

My theory is this: when reassigning the PClip variable, the refcount goes 0, so it decides to release the PClip first. Because the script has Prefetch(), the PClip has Prefetcher in it. The release procedure simply destroys the object. However, because the 4 prefetcher threads are still actively caching frames, this causes a race condition.

I've tried to change the MT mode of MySourceFilter() to all other modes. Some modes result less crashes than others, but all crash eventually if repeat enough. PClip::SetCacheHints() also seems doing nothing. Most importantly, I don't find any API to shutdown the prefetcher thread pool. I think IScriptEnvironment::DeleteScriptEnvironment() shutdown all thread pools, but then the internal pointers of the PClip will all become invalid.

Can we have some API that can safely release a PClip regardless if there is prefetcher or not?

pinterf commented 3 years ago

Not forgotten but this issue was/is too compex for me at the moment. Both to understand the real reasons and figure out what's behind it, need more time. edit: can you provide me steps / tools to reproduce it?

CrendKing commented 3 years ago

Basically I was asking for more granular control of the internal memory and thread management through API. For example,

  1. give us API to just shutdown prefetcher threads
  2. API to just flush cached video frames
  3. API to tell AviSynth the source filter will cache frames so it should not
  4. API to tell GetFrame() that "I can't return a frame at the moment"

I believe currently (1) and (2) happen inside DeleteScriptEnvironment(), not granular and not callable. (3) and (4) do not exist.

Example use case: instead of having a workflow as standard as

Create avs environment
Import avs script
Read from video file, go through each frame, do some work
Delete avs environment

some client wants to

Create avs environment
Import avs script
Call API (3)
loop_start:
  Read from DirectShow, cache samples
  Convert DS samples to AVS frames, provide to AviSynth
  Go through each frame, do some work
  If (DS wants to flush or shutdown) {
    Call API (4)
  }
  Call API (1) to stop AviSynth
  Call API (2) to flush cache
  // here avs should be in a state as clean as when the loop starts,
  // no lingering thread, no stale cache, no outdated TLS, etc,
  // but retain everything related to the environment, plugins and script
  If (shutdown) {
    Delete avs environment
  } else {
    goto loop_start
  }

Currently, my code in that client has workaround for each of the needed API. It works, but definitely not ideal.

I know this is a big feature request, so I don't expect it to happen, at least not soon. If you decide to won't fix, it will not hurt me 😃

pinterf commented 3 years ago

I see. What is the penalty of "restarting the whole process?"

CrendKing commented 3 years ago

When you say "process", you mean the process as in "process and thread" or as in "process of recreating avs environment"? If former, we can't, because the client is just a dll. It wouldn't be appropriate to shutdown a video player just because user wants to seek. If latter,

  1. without workaround, it may crash or freeze because the lack of the 4 APIs. The original comment shows one possible outcome due to lack of API (1), but all 4 are needed.
  2. the use case for the client is interactive real-time application (video player). We want to minimize the downtime to the minimum. If user needs to wait for 200 ms every time they seek instead of 20 ms, it's bad user experience. Recreating the whole environment, reimporting script, re-setup the whole image format, etc takes time, especially toll for non-beefy computers.

EDIT: Added a comment in the pseudo code. Whether (1) and (2) are separate granular APIs or just one giant API is not important. I just need API to reset avs. However, granular API has advantage that if they are not thread safe or has dependency caller can make judgment of when to call which first and last.

pylorak commented 3 years ago

Apparently, the intention of APIs 1. and 2. (stop threads and flush caches) would be to return avs to a pristine state as if the script and the necessary plugins just got loaded, but without the overhead of actually reloading them. The problem is this won't be enough, because plugins can keep internal state that avs doesn't know about, and there is no API currently to reset plugin-internal state, so all existing plugins would need a close inspection and retrofitting (when needed) to make this work consistently.

But maybe more importantly, ATM I do not see when such reset would be needed at all. If the use-case is fast seeking, then there is no need to flush caches and plugin-internal state at all, only a method is needed to enable or disable prefetching. Then the host app could disable the prefetcher while seeking and enable it again when playback starts. (This is purely a performance optimization though, and there should be no crash in either case of course.)

IMHO the only possible use-case for a whole-pipeline-reset would be to change the video material without reloading the script and the plugins (because in this case the same frame numbers would deliver different frame contents after changing the source, which obviously breaks previous coding assumptions). However, when the user loads a new video, he doesn't expect it to happen in 20ms and is okay if video loading takes 2 seconds, so I'm not convinced that this is worth the additional complexity.

As for requested API 4. ("I can't return a frame at the moment"), this seems to be a feature that only makes sense for multi-threaded host applications, and can be worked around in the host by putting avs on a separate thread that is async to the rest of the app, doing the blocking there and so on. I know this is more complex for the host than just directly supporting the API in avs, but it is generally true for almost all libraries (with only a few exceptions) that the libraries don't provide threading support in the host (only inside the library itself), and it is the responsibility of the application programmer to implement such things. I'm not convinced why AVS should be or make an exception for this.

But then again, I'm not involved in the development any more, I'm just adding my two cents. Also, this whole discussion about the requested APIs is irrelevant to Invoke() crashing. I suggest you open a separate feature request for the APIs, and reserve the current issue only for solving the crash.

CrendKing commented 3 years ago

If the use-case is fast seeking, then there is no need to flush caches and plugin-internal state at all

Please educate me how to deal with this common case. AviSynth does not care about time offset. It only cares about frame number. When user seek a video, we have to pretend as we are opening a new video and start from frame number 0. Otherwise if user seeks backward by say 5 frames, the used-to-be frame no.0 is now frame no.4, completely messing up the sequencing. However, without flushing caches, after seek the new frame no.0 is different to the old frame no.0. We end up getting "ghost" frames. You can easily see this from ffdshow.

Another common case. Upstream or downstream DirectShow filters could issue format change at any time. The dimension and color format could change. When this happens, all old cached frames must be invalidated. The script must also be reloaded.

Oh, and BTW, AviSynth always assumes there are finite number of frames. However, it is common nowadays to watch network streams which have indefinite number of frames. We (and ffdshow) have to lie about the number. Even though it works, I would be happier if AviSynth can do something about it.

separate thread that is async to the rest of the app, doing the blocking there and so on

I'm already doing it, and it is not enough. When the process shuts down, you need to return something from that blocking state, or there is dead lock. When you return null, it crashes. The workaround I'm using is to create an empty video frame at beginning and hold onto it, and return it in this situation. Again, it works but certainly can improve if AviSynth can handle it gracefully. For example, allow returning a special value, throw specific exception, etc.

pylorak commented 3 years ago

Please educate me how to deal with this common case. AviSynth does not care about time offset. It only cares about frame number. When user seek a video, we have to pretend as we are opening a new video and start from frame number 0. Otherwise if user seeks backward by say 5 frames, the used-to-be frame no.0 is now frame no.4, completely messing up the sequencing. However, without flushing caches, after seek the new frame no.0 is different to the old frame no.0. We end up getting "ghost" frames. You can easily see this from ffdshow.

Aaaah, I think your problem is somewhere else. Your source is not frame-accurate (this was always a known problem with DirectShow). That's a scenario that was never intended to be supported in AVS, and is the reason why the ffms2 source plugin needs to index the video before delivering frames, for example.

Another common case. Upstream or downstream DirectShow filters could issue format change at any time. The dimension and color format could change. When this happens, all old cached frames must be invalidated. The script must also be reloaded.

Here, again, flushing the caches would not be enough, the script must reload completely, because many plugins fixate to a specific frame size and format at construction time.

The general issue with all these APIs relating to flushing and pipeline reset is that you're trying to do something that AVS was never intended to do, which is basically using it as a real-time media player instead of an editing tool. Same case for streaming. This doesn't mean at all that your request is invalid or that your wishes won't be implemented, it just means current devs need to be clear about where AVS wants to go, if it wants to go there, and if so, what is the best way to reach it. What I'm saying is, maybe your feature request should not be about implementing specific APIs, but about better support for non-frame-accurate sources for example. Then the community can discuss what is the best way to do it and what compromises can be accepted for backwards compatibility. In a separate ticket.

separate thread that is async to the rest of the app, doing the blocking there and so on

I'm already doing it, and it is not enough. When the process shuts down, you need to return something from that blocking state, or there is dead lock. When you return null, it crashes. The workaround I'm using is to create an empty video frame at beginning and hold onto it, and return it in this situation. Again, it works but certainly can improve if AviSynth can handle it gracefully. For example, allow returning a special value, throw specific exception, etc.

What's the problem with returning an empty frame? That's what I'd do and what I'd recommend. It solves the waiting problem without introducing an error, is very easy to do, and the fact that the frame contents are wrong doesn't matter because you are shutting down the pipeline anyway. What would you gain by adding a separate API for this?

CrendKing commented 3 years ago

the script must reload completely

I agree with this, and I wrote myself. However, for seeking script should not be loaded but we have to just to flush cache atm.

it just means current devs need to be clear about where AVS wants to go

Again, completely agree. The APIs are just examples to help you understand the need.

What's the problem with returning an empty frame?

It crashes will read access violation to null. Example stack trace on Avs+ r3325:

    AviSynth.dll!VideoFrame::GetFrameBuffer() Line 531  C++
    AviSynth.dll!Cache::GetFrame(int n, IScriptEnvironment * env_) Line 222 C++
    AviSynth.dll!CacheGuard::GetFrame(int n, IScriptEnvironment * env) Line 505 C++
    avisynth_filter_64.ax!AvsFilter::FrameHandler::ProcessOutputSamples() Line 358  C++
    avisynth_filter_64.ax!std::invoke<void (__cdecl AvsFilter::FrameHandler::*)(void),AvsFilter::FrameHandler *>(void(AvsFilter::FrameHandler::*)() && _Obj, AvsFilter::FrameHandler * && _Arg1) Line 1614  C++
    avisynth_filter_64.ax!std::thread::_Invoke<std::tuple<void (__cdecl AvsFilter::FrameHandler::*)(void),AvsFilter::FrameHandler *>,0,1>(void * _RawVals) Line 44  C++
    ucrtbased.dll!thread_start<unsigned int (__cdecl*)(void *),1>(void * const parameter) Line 97   C++
    kernel32.dll!BaseThreadInitThunk()  Unknown
    ntdll.dll!RtlUserThreadStart()  Unknown

If null was intended to be able to return from GetFrame(), then this is a bug. Otherwise, this is feature request.

pinterf commented 3 years ago

I'm willing to check the crash, but again, I need exact steps and description to reproduce.

CrendKing commented 3 years ago

I never did this, but if you put a return nullptr; at https://github.com/AviSynth/AviSynthPlus/blob/e08d7c84179eb9a50b58736c78aeb1a4f41467ec/avs_core/filters/AviSource/avi_source.cpp#L1296, and use AviSource in a standard avs script, what is the expected result?

If for whatever reason AviSource is special and this test is not applicable, I can provide steps to reproduce with my filter.

pinterf commented 3 years ago

I suppose you are not allowed to return a nullptr from GetFrame.

CrendKing commented 3 years ago

Just tried the AviSource change and got the exactly same stack trace. Thanks for confirming.