roc-streaming / roc-toolkit

Real-time audio streaming over the network.
https://roc-streaming.org
Mozilla Public License 2.0

Prevent task starvation in ctl::TaskQueue #380

Closed gavv closed 3 years ago

gavv commented 4 years ago

ctl::TaskQueue is a thread-safe task queue that allows scheduling tasks for immediate or delayed execution.

When it selects a task for execution, it always first checks the queue for immediate execution, and if it's empty, it then checks the queue for delayed execution.

While this is okay in most cases, there is a potential problem here. If tasks for immediate execution are added frequently enough during some period, tasks for delayed execution won't be handled during that period at all, even if their deadlines have already expired.

A smoother behavior would be a fairer selection of tasks, for example: process one task from the immediate queue, then one task from the delayed queue, and so on.

The new behavior should be covered with a unit test. Existing unit tests should be updated accordingly.

yjheo15 commented 4 years ago

Hello, I'm interested in this issue. May I know if you have a more specific guideline for the 'smoother behavior' you want? Are there specific limitations?

gavv commented 4 years ago

@yjheo15 Hi, you're welcome.

The simplest solution that comes to mind is to process one task from the first queue, then one task from the second, then one from the first again, etc.

Another approach is to do batching: process tasks from the first queue, but no more than N; then go to the second queue, and so on. This sometimes has a positive impact on throughput, but probably not in our case. It needs a closer look at the implementation, or benchmarking.

When doing batching, we can also rely on the task deadline instead of the number of tasks. For example, fetch the next delayed task's deadline and process immediate tasks until that deadline is close.

Maybe there are other options too. Which approach to choose is up to you. The simplest solution will be okay too, since it's already better than the current one. If you decide to try a more complicated solution like batching, we'll need to understand whether we really need it, e.g. by writing a benchmark demonstrating how it increases performance.

Are you going to work on this issue?

yjheo15 commented 4 years ago

If no one else is willing to, I'll give it a shot! I'll first start with the easy implementation of alternating between the two queues, then move on to batching or other solutions. May I know how to find which part of the files I would have to work on? This is my first time participating in open source, so I'm slightly lost after forking.

gavv commented 4 years ago

@yjheo15 Great!

May I know how to find which part of the files I would have to work on?

I guess you've already found it, but the current behavior:

When it selects a task for execution, it always first checks the queue for immediate execution, and if it's empty, it then checks the queue for delayed execution.

is coded here:

https://github.com/roc-streaming/roc-toolkit/blob/d619f2533e9bab12af68b02ce9b2c781856b80b9/src/modules/roc_ctl/task_queue.cpp#L161-L166

gavv commented 4 years ago

@yjheo15 Hi, do you still have plans on this?

stevenzhu94 commented 3 years ago

fetch next delayed task deadline and process immediate tasks until the deadline is close.

@gavv how close should the threshold be? If the threshold is made too big, the opposite issue of the sleeping tasks starving the ready tasks could happen.

@yjheo15 If you're no longer interested in working on this issue I'd love to give it a try. It would be my first contribution to an open source project as well

yjheo15 commented 3 years ago

@gavv Hello, sorry for the late reply. I tried to set up but lost track of it. @stevenzhu94 Yes! I don't think I'll be able to work on this right now so feel free to take it up!

gavv commented 3 years ago

@yjheo15 thanks for the reply, I'm unassigning the task then.

gavv commented 3 years ago

@stevenzhu94 you're welcome.

@gavv how close should the threshold be? If the threshold is made too big, the opposite issue of the sleeping tasks starving the ready tasks could happen.

Hm, this would require us to introduce a configuration option with the expected task execution time. If the deadline is closer than this time, we shouldn't try immediate tasks.

While this may work, I'd like to avoid introducing such an option if possible, since it would be hard to maintain (different tasks may have different execution times on different platforms). Also, it's still not clear to me whether batching would have any significant advantages here.

So I think I'd start with the simplest approach: just probe the first queue, then the second, then the first again, and so on.

stevenzhu94 commented 3 years ago

@gavv sounds like a plan. Please assign me to this.