aritmos / crabql

Crustacean certified SQL query creation in Rust
MIT License
0 stars 0 forks source link

Rework `CoreExpression::conditions` #16

Closed aritmos closed 4 weeks ago

aritmos commented 1 month ago

The current implementation of CoreExpression::conditions is based on nested iterators, this is not a great choice for two reasons

  1. There is a lot of overhead in the nested iterators
  2. It only allows for the propagation of the conditionals

The current work in 08ab4df has brought to light a simpler and better scenario where conditionals propagate back a richer type. This is done by propagating the checker in a DFS manner instead of flattening an expression's structure. In return the return states of the checker can be propagated back. This allows for the capture of

  1. Errors, such as type mismatches
  2. Error stack traces: adding useful information as to why a type mismatch occured in a complex scenario
  3. Additional information: e.g. a list of all the columns used in the expression, useful for the flat-subquery feature that is to be added in the future.
aritmos commented 1 month ago

Proposal:

Code Gist

Common core types

#[derive(Clone, Copy, PartialEq, Eq)]
pub enum ExprType {...}
pub enum Dialect {...}
pub enum Condition {...}

Checker

pub struct ExprInfo {
    // e.g.
    // - used column ids to know when to require a subquery
    cols: Vec<String>,
}
pub enum ExprErr {
    UnSealed(usize),
    Sealed(usize),
}

pub trait Checker {
    type Cond;
    type Ctx;
    fn validate(&mut self, cond: Self::Cond, ctx: Self::Ctx);
    fn flush(&mut self) -> Result<ExprInfo, ExprErr>;
}

Expressions

pub trait Checkable {
    type Cond;
    type Ctx: Clone;

    // returns references to children objects that need to be checked
    // (children can be of different types and might require other contexts)
    #[allow(clippy::type_complexity)] // no way around it
    fn children(
        &self,
        ctx: Self::Ctx,
    ) -> Vec<(
        &dyn Checkable<Cond = Self::Cond, Ctx = Self::Ctx>,
        Self::Ctx,
    )>;

    // returns the conditions needed to verify itself assuming children have already been checked
    fn conditions(&self, ctx: Self::Ctx) -> Vec<Self::Cond>;

    // checker is not an associated type as we want multiple checkers for the same expr
    fn check(&self, checker: &mut dyn Checker<Cond = Self::Cond, Ctx = Self::Ctx>, ctx: Self::Ctx) {
        // check children
        for (child, ctx) in self.children(ctx.clone()) {
            // if coerce is inside ctx, then each ctx needs to be possibly mutated per child
            child.check(checker, ctx);
        }

        // check self
        for cond in self.conditions(ctx.clone()) {
            checker.validate(cond, ctx.clone());
        }
    }
}

// fix checkable associated types throughout all expressions
pub trait Expression: Checkable<Cond = ..., Ctx = ...> {
    fn display(&self, dialect: Dialect) -> String;
}

Example Impl

// Context type
use std::{cell::RefCell, rc::Rc};
#[derive(Clone)]
pub struct Ctx {
    // shared information (immutable during `check`)
    // - e.g. current table selection
    tables: Rc<RefCell<() /* <- replace with some type */>>,
    // individual information (overridable during `check`)
    // - e.g. expression coercion
    pub coerce: ExprType,
}
impl Ctx {
    // Returns a clone of self with the given coercion
    pub fn set_coerce(&self, coerce: ExprType) -> Self {
        let mut ctx = self.clone();
        ctx.coerce = coerce;
        ctx
    }
}

// Faciliate CoreExpression coercion verification
macro_rules! debug_assert_coerce {
    ($self:expr, $ctx:expr) => {
        debug_assert!({ $ctx.coerce == ExprType::Any || $ctx.coerce == $self.eval_type() })
    };
}

pub struct GtExpr {
    lhs: Box<dyn Numeric>,
    rhs: Box<dyn Numeric>,
}
impl Checkable for GtExpr {
    type Cond = Condition;

    type Ctx = Ctx;

    fn children(
        &self,
        ctx: Self::Ctx,
    ) -> Vec<(
        &dyn Checkable<Cond = Self::Cond, Ctx = Self::Ctx>,
        Self::Ctx,
    )> {
        vec![
            (self.lhs.as_ref(), ctx.set_coerce(ExprType::Num)),
            (self.rhs.as_ref(), ctx.set_coerce(ExprType::Num)),
        ]
    }

    fn conditions(&self, ctx: Self::Ctx) -> Vec<Self::Cond> {
        debug_assert_coerce!(self, ctx);
        // no self conditions
        Vec::new()
    }
}
aritmos commented 4 weeks ago

Resolved by 5e9a7af