The core Seafowl logic found in DefaultSeafowlContext struct implementation and the accompanying SeafowlContext trait is long overdue for a refactor.
On one hand, there's no need to have a separate SeafowlContext trait which can be confusing to newcomers, as all logic should really go into DefaultSeafowlContext (which should then be renamed to SeafowlContext).
The only reason this was kept was because there are recursive calls inside SeafowlContext::create_physical_plan in some cases, and since this is an async function this only works for traits, whereas for structs it leads to compilation issues: https://rust-lang.github.io/async-book/07_workarounds/04_recursion.html
However, this can easily be resolved in any of three ways:
Instead of calling self.create_physical_plan call self.inner.state().create_physical_plan, thus breaking the recursion. This is safe to do, since currently all custom planning in SeafowlContext::create_physical_plan results in a dummy (empty) plan and those are never used as inputs anywhere else (i.e. we only handle root plan nodes there, and the input planning can be delegated to DataFusion).
If we do end up needing to do actual recursion there at some point then it's enough to decorate the method with #[async_recursion] as is done in DataFusion.
Finally if we don't want to rely on a third party crate to facilitate this recursion there's always the async move + BoxFuture route.
On the other hand the entire logic is kept in a single file with 2.5K+ lines (with tests), while it really should be broken up into logically distinct groups of impls (i.e. logical planning, physical planning, delta operations etc.) across several files in a separate module.
The core Seafowl logic found in
DefaultSeafowlContext
struct implementation and the accompanyingSeafowlContext
trait is long overdue for a refactor.SeafowlContext
trait which can be confusing to newcomers, as all logic should really go intoDefaultSeafowlContext
(which should then be renamed toSeafowlContext
). The only reason this was kept was because there are recursive calls insideSeafowlContext::create_physical_plan
in some cases, and since this is an async function this only works for traits, whereas for structs it leads to compilation issues: https://rust-lang.github.io/async-book/07_workarounds/04_recursion.html However, this can easily be resolved in any of three ways:self.create_physical_plan
callself.inner.state().create_physical_plan
, thus breaking the recursion. This is safe to do, since currently all custom planning inSeafowlContext::create_physical_plan
results in a dummy (empty) plan and those are never used as inputs anywhere else (i.e. we only handle root plan nodes there, and the input planning can be delegated to DataFusion).#[async_recursion]
as is done in DataFusion.impl
s (i.e. logical planning, physical planning, delta operations etc.) across several files in a separate module.