apache / datafusion

Apache DataFusion SQL Query Engine
https://datafusion.apache.org/
Apache License 2.0
5.54k stars 1.03k forks source link

Make TaskContext wrap SessionState #10631

Open lewiszlw opened 1 month ago

lewiszlw commented 1 month ago

Is your feature request related to a problem or challenge?

I have a user defined table function FlattenTableFunc whice needs reading data from other table provider. To serialize execution plan FlattenExec, I can't put SessionState in it directly. So I have to reconstruct SessionState in FlattenExec::execute method.

Describe the solution you'd like

I see TaskContext is just part of SessionState. Could we just make TaskContext wrap SessionState?

Describe alternatives you've considered

No response

Additional context

No response

crepererum commented 1 month ago

I would suggest a rather larger refactoring? We have:

So it seems that someone tried to model a hierarchy runtime->session->task but TBH the types are pretty inconsistent and I don't know what we would need a "task" for. I kinda see why someone would differentiate between a "runtime" (long-lived) and a "session" (request/query-scoped).

I also don't understand why there is a "state" and a "context".

alamb commented 1 month ago

I see TaskContext is just part of SessionState. Could we just make TaskContext wrap SessionState?

I can't remember why TaskContext doesn't wrap SessionState -- maybe @tustvold does 🤔

At some point SessionState was in datafusion (the core crate) so it couldn't be referenced by subcrates.

tustvold commented 1 month ago

IIRC SessionConfig is the static configuration used to create a SessionContext, which is an interior mutable wrapper around SessionState.

The idea was a query is planned against an immutable SessionState to avoiding inconsistency during planning.

RuntimeEnv is then for things shared across multiple sessions.

TaskContext, I believe was added for Ballista where the execution takes place on different nodes to planning and therefore not associated with a session.

The separate ConfigOptions IIRC is a historical artifact and could be merged / replace SessionConfig.