The current executor uses a single queue for both not-ready and ready tasks, so every task posted to the queue may wake all threads.

It would be better to split it into two queues, one for not-ready tasks and one for ready tasks, each protected by its own lock.

The thread that adds a task to the executor also checks readiness for all tasks in the not-ready queue. Once a task is ready, it is moved to the ready queue.
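
A minimal sketch of the two-queue idea, to make the proposal concrete. This is not the existing executor.c code: the `Task` and `Executor` types, the `deps_left` readiness field, and all function names here are hypothetical, and plain pthreads are used instead of FFmpeg's thread wrappers.

```c
#include <pthread.h>
#include <stddef.h>

/* Hypothetical task: considered ready once deps_left reaches 0.
 * Real code would need deps_left to be atomic or otherwise synchronized. */
typedef struct Task {
    struct Task *next;
    int deps_left;
    int id;
} Task;

typedef struct Executor {
    pthread_mutex_t pending_lock; /* protects the not-ready list */
    Task *pending;
    pthread_mutex_t ready_lock;   /* protects the ready list */
    pthread_cond_t ready_cond;    /* workers sleep here */
    Task *ready;
} Executor;

void executor_init(Executor *e)
{
    pthread_mutex_init(&e->pending_lock, NULL);
    pthread_mutex_init(&e->ready_lock, NULL);
    pthread_cond_init(&e->ready_cond, NULL);
    e->pending = e->ready = NULL;
}

/* Push one task to the ready list and wake a single worker. */
static void push_ready(Executor *e, Task *t)
{
    pthread_mutex_lock(&e->ready_lock);
    t->next = e->ready;
    e->ready = t;
    pthread_cond_signal(&e->ready_cond);
    pthread_mutex_unlock(&e->ready_lock);
}

/* Called by the posting thread: add t to the not-ready list, then
 * move every task that has become ready over to the ready list. */
void executor_add(Executor *e, Task *t)
{
    Task *promoted = NULL, **pp;

    pthread_mutex_lock(&e->pending_lock);
    t->next = e->pending;
    e->pending = t;
    for (pp = &e->pending; *pp; ) {
        Task *cur = *pp;
        if (cur->deps_left == 0) {
            *pp = cur->next;      /* unlink from the not-ready list */
            cur->next = promoted;
            promoted = cur;
        } else {
            pp = &cur->next;
        }
    }
    pthread_mutex_unlock(&e->pending_lock);

    /* The two locks are never held at the same time. */
    while (promoted) {
        Task *next = promoted->next;
        push_ready(e, promoted);
        promoted = next;
    }
}

/* Called by a worker: block until a ready task is available. */
Task *executor_take(Executor *e)
{
    Task *t;

    pthread_mutex_lock(&e->ready_lock);
    while (!e->ready)
        pthread_cond_wait(&e->ready_cond, &e->ready_lock);
    t = e->ready;
    e->ready = t->next;
    pthread_mutex_unlock(&e->ready_lock);
    return t;
}
```

With this layout, workers only contend on `ready_lock` and are signalled only when a task is actually runnable, while posting a task that is not yet ready touches `pending_lock` alone. The sketch leaves out shutdown handling and uses simple LIFO lists, so it says nothing about scheduling order.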
The goal is to reduce the mutex lock and unlock overhead that shows up in the `perf top` output (highlighted in bold):

| Overhead | Shared object | Type | Symbol |
|---|---|---|---|
| 11.96% | ffmpeg_g | [.] | `put_vvc_luma_hv_10` |
| 5.88% | ffmpeg_g | [.] | `alf_get_coeff_and_clip_10` |
| 5.25% | ffmpeg_g | [.] | `ff_vvc_inv_dct2_64` |
| 4.30% | [kernel] | [k] | `lock_text_start` |
| 4.22% | ffmpeg_g | [.] | `ff_vvc_alf_filter_luma_w16_16bpc_avx2` |
| 3.46% | ffmpeg_g | [.] | `put_vvc_luma_bi_hv_10` |
| 3.45% | ffmpeg_g | [.] | `alf_filter_luma_vb_10` |
| 3.13% | ffmpeg_g | [.] | `vvc_loop_filter_luma_10` |
| 2.81% | ffmpeg_g | [.] | `lmcs_filter_luma_10` |
| 2.46% | ffmpeg_g | [.] | `put_vvc_luma_uni_hv_10` |
| 2.27% | ffmpeg_g | [.] | `put_vvc_chroma_hv_10` |
| 2.21% | libc-2.31.so | [.] | `0x000000000018b733` |
| 2.05% | libc-2.31.so | [.] | `0x000000000018bb41` |
| 1.95% | ffmpeg_g | [.] | `put_vvc_chroma_uni_hv_10` |
| 1.84% | ffmpeg_g | [.] | `put_vvc_chroma_bi_hv_10` |
| 1.81% | ffmpeg_g | [.] | `vvc_deblock_bs` |
| 1.41% | ffmpeg_g | [.] | `ff_vvc_predict_inter` |
| **1.25%** | **libpthread-2.31.so** | **[.]** | **`pthread_mutex_lock`** |
| **1.24%** | **libpthread-2.31.so** | **[.]** | **`__pthread_mutex_unlock`** |
| 1.22% | ffmpeg_g | [.] | `ff_vvc_residual_coding` |
| 1.08% | ffmpeg_g | [.] | `alf_filter_cc_10` |
| 1.03% | ffmpeg_g | [.] | `apply_prof_uni_10` |
| 0.99% | ffmpeg_g | [.] | `ff_vvc_alf_filter` |
| 0.98% | ffmpeg_g | [.] | `ff_vvc_inv_dct2_32` |
| 0.94% | ffmpeg_g | [.] | `vvc_deblock_bs_luma_vertical` |
| 0.92% | ffmpeg_g | [.] | `add_residual_10` |
See https://github.com/ffvvc/FFmpeg/wiki/Introduction-to-the-code-structures#executorc for reference.