apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
6.33k stars 1.2k forks source link

Improve documentation (and ASCII art) about streaming execution, and thread pools #13423

Closed alamb closed 3 days ago

alamb commented 1 week ago

Which issue does this PR close?

Rationale for this change

In order to understand the need for a second runtime, I needed to write up the background material so it made sense.

Also, this is starting to come up with dft:

What changes are included in this PR?

This PR adds background documentation on how DataFusion runs plans and why a separate Runtime may be needed to keep the network busy

Note I am also working on an example to show how to actually use a second runtime which I will link to these docs when it is ready

Also, I found myself on a ✈️ without WIFI so I also made a bunch of ASCII art while I was at it)

Are these changes tested?

By CI

Are there any user-facing changes?

Nice documentation hopefully!

2010YOUY01 commented 6 days ago

The explanation for the problem is super clear

One thing I don't understand is: For Tokio's convention, all tasks created by spawn are expected to be IO-bounded, and those CPU-bound task should be created using spawn_blocking(). I think the implementation also uses two separate thread pool, which is very similar to the approach in this doc. Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly 🤔 If we follow this two runtime approach, would existing spawn_blocking() in the code (e.g. https://github.com/apache/datafusion/blob/6d8313ebc865f9bff007bfc04652f58b016cbc1b/datafusion/physical-plan/src/spill.rs#L53) cause unexpected behavior (3 thread pools co-exist)

tustvold commented 6 days ago

I took a quick look at this and content looks good, couldn't check diagrams as on phone.

Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly

We could, however, to preserve the thread per core architecture we would need to cap the threads of the blocking pool to the core count. Then as blocking tasks can't yield waiting for input, every CPU bound morsel would have to be spawned separately. This would give us a "morsel-driven" scheduler, however, tokio has a relatively high per task overhead and so even discounting the sheer amount of boilerplate this would require, the performance would be regrettable. If going down this path you might as well switch to an actual morsel driven scheduler (although this is at this stage likely intractable).

Ultimately spawn_blocking is designed for blocking IO, it is not designed for CPU bound tasks.

alamb commented 5 days ago

Why can't we all use spawn_blocking() for all CPU-bounded task, and instead we have to use two runtimes explicitly 🤔

Thank you @2010YOUY01 for the question and @tustvold for the answer -- this question also has come up in the past so I added a summary in ad161794d76731d4472bec3c77c0e751affd1f46

alamb commented 3 days ago

Thank you everyone for the comments -- I am going to merge this one in and make some adjustments as a follow on PR to avoid keeping it open for too long. Long live comments!