Provide user-friendly and fail-safe API by default (while keeping current API as escape hatch for advanced usage)

I just stumbled upon this comment, https://github.com/s-arash/ascent/issues/30#issuecomment-1986937702, according to which the leaky abstraction of the current API is intentional:

> I have purposefully not hidden the internals of Ascent's relations and indices so they can be tinkered with by those who know what they are doing.

I would thus instead like to propose introducing an additional higher-level and fail-safe abstraction around the existing API that would serve the needs of 80% of users, while still providing access to the current lower-level and more error-prone API as an escape hatch for the remaining 20% of users who really need full control over the program and its internal state:

## Steps

- Rename the existing ascent!/AscentProgram to ascent_runtime!/AscentProgramRuntime.
- Introduce a new high-level abstraction under the freed-up names ascent!/AscentProgram.
- Introduce AscentProgramFacts.
- Make ascent_run! build on the new ascent!, rather than on ascent_runtime! as it does currently.

## API Example

Name bike-shedding aside, the new fail-safe wrapper API generated from an ascent! { … } program with relations edge and path …

Program's full code snippet

```rust
use ascent::ascent;

ascent! {
    relation edge(i32, i32);
    relation path(i32, i32);

    path(x, y) <-- edge(x, y);
    path(x, z) <-- edge(x, y), path(y, z);
}
```

… would look and work like this:

use std::collections::HashSet;

#[derive(Default)]
pub struct AscentProgramFacts {
    pub edge: HashSet<(i32, i32)>,
    pub path: HashSet<(i32, i32)>,
}

pub struct AscentProgram {
    runtime: AscentProgramRuntime,
}

impl AscentProgram {
    /// Creates a program from a set of initial facts (i.e. EDB).
    pub fn from_facts(facts: AscentProgramFacts) -> Self {
        let mut runtime = AscentProgramRuntime::default();
        runtime.edge.extend(facts.edge);
        runtime.path.extend(facts.path);
        Self { runtime }
    }

    /// Runs the program to completion and returns its final facts, consuming `self`.
    pub fn run(mut self) -> AscentProgramFacts {
        self.runtime.run();
        AscentProgramFacts {
            edge: HashSet::from_iter(self.runtime.edge),
            path: HashSet::from_iter(self.runtime.path),
        }
    }
}

The above would then be used like this:

Either by initializing the initial facts step by step:

fn main() {
    let mut initial_facts = AscentProgramFacts::default();
    initial_facts.edge = HashSet::from_iter([(1,2), (2,3)]);
    let program = AscentProgram::from_facts(initial_facts);
    let facts = program.run();
    println!("{:?}", facts.path);
}

Or by passing fully initialized initial facts all at once:

fn main() {
    let facts = AscentProgram::from_facts(AscentProgramFacts {
        edge: HashSet::from_iter([(1,2), (2,3)]),
        ..Default::default()
    }).run();
    println!("{:?}", facts.path);
}

(btw, the literal initialization of relations above is something that the current API doesn't allow due to the private fields in AscentProgram)

One could even go a step further and provide a builder API that makes things shorter still. It would also improve ergonomics by accepting any T: IntoIterator rather than only actual sets; the sets are then collected internally, which preserves proper set semantics:

fn main() {
    let facts = AscentProgramBuilder::default()
        .edge([(1,2), (2,3)])
        .build().run();
    println!("{:?}", facts.path);
}

(Keep in mind that if this simplified 80% API doesn't cut it and you need the full control of AscentProgramRuntime, then you can still just use ascent_runtime! instead of ascent!.)
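
To make the builder idea a bit more concrete, here is a rough, self-contained sketch of what the generated code might boil down to. Everything in it is an assumption for illustration: `AscentProgramBuilder` and its `edge` method are hypothetical names, and `AscentProgramRuntime` is a hand-written stand-in (a naive fixpoint loop) rather than what the macro would actually generate.

```rust
use std::collections::HashSet;

type Pair = (i32, i32);

/// Hypothetical stand-in for the generated low-level runtime.
#[derive(Default)]
struct AscentProgramRuntime {
    edge: Vec<Pair>,
    path: Vec<Pair>,
}

impl AscentProgramRuntime {
    fn run(&mut self) {
        // path(x, y) <-- edge(x, y);
        let mut path: HashSet<Pair> = self.path.iter().chain(&self.edge).copied().collect();
        loop {
            // path(x, z) <-- edge(x, y), path(y, z);
            let mut derived = Vec::new();
            for &(x, y) in &self.edge {
                for &(y2, z) in &path {
                    if y == y2 && !path.contains(&(x, z)) {
                        derived.push((x, z));
                    }
                }
            }
            if derived.is_empty() {
                break;
            }
            path.extend(derived);
        }
        self.path = path.into_iter().collect();
    }
}

#[derive(Default)]
struct AscentProgramFacts {
    edge: HashSet<Pair>,
    path: HashSet<Pair>,
}

struct AscentProgram {
    runtime: AscentProgramRuntime,
}

impl AscentProgram {
    fn from_facts(facts: AscentProgramFacts) -> Self {
        let mut runtime = AscentProgramRuntime::default();
        runtime.edge.extend(facts.edge);
        runtime.path.extend(facts.path);
        Self { runtime }
    }

    /// Consumes the program and returns its final facts.
    fn run(mut self) -> AscentProgramFacts {
        self.runtime.run();
        AscentProgramFacts {
            edge: self.runtime.edge.into_iter().collect(),
            path: self.runtime.path.into_iter().collect(),
        }
    }
}

/// Hypothetical builder: collects facts into sets, then builds the program.
#[derive(Default)]
struct AscentProgramBuilder {
    facts: AscentProgramFacts,
}

impl AscentProgramBuilder {
    /// Accepts any iterable of tuples; duplicates collapse in the set.
    fn edge(mut self, facts: impl IntoIterator<Item = Pair>) -> Self {
        self.facts.edge.extend(facts);
        self
    }

    fn build(self) -> AscentProgram {
        AscentProgram::from_facts(self.facts)
    }
}

fn main() {
    let facts = AscentProgramBuilder::default()
        .edge([(1, 2), (2, 3), (1, 2)]) // the duplicate (1, 2) is dropped
        .build()
        .run();
    let mut paths: Vec<Pair> = facts.path.into_iter().collect();
    paths.sort();
    println!("{:?}", paths); // [(1, 2), (1, 3), (2, 3)]
}
```

The point of the sketch is only that the `IntoIterator` input is funneled through a `HashSet` before it ever reaches the runtime, so the caller cannot introduce duplicates by accident.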

## Usability/Safety benefits of the high-level API

The "high-level" API is fail-safe and straightforward, and effectively impossible to use incorrectly, as there is …

- … only one way to create a program: via ::from_facts(…), passing in a set of initial facts.
- … only one way to run a program: by calling .run(), consuming the program, returning a set of final facts.
- … no way to accidentally corrupt a program's state (e.g. by removing/mutating existing facts) (resolving #30).
- … no way to accidentally run a program twice (resolving #30).

There are no other methods, nor do there need to be any.
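
The "cannot run twice" guarantee would come for free from Rust's ownership rules, since `.run()` takes `self` by value. A minimal sketch, with deliberately simplified and hypothetical stand-in types (`Program`, `Facts`) and the actual evaluation left out:

```rust
use std::collections::HashSet;

/// Hypothetical, simplified stand-ins for the proposed generated types.
struct Facts {
    edge: HashSet<(i32, i32)>,
}

struct Program {
    facts: Facts,
}

impl Program {
    fn from_facts(facts: Facts) -> Self {
        Program { facts }
    }

    /// Takes `self` by value, so the program is gone once it has run.
    /// (Real evaluation is omitted; the facts are handed back unchanged.)
    fn run(self) -> Facts {
        self.facts
    }
}

fn main() {
    let edge: HashSet<(i32, i32)> = vec![(1, 2), (2, 3)].into_iter().collect();
    let program = Program::from_facts(Facts { edge });
    let facts = program.run();
    println!("{}", facts.edge.len());

    // Uncommenting the next line is rejected at compile time,
    // because `program` was moved by the first call (error E0382):
    // program.run();
}
```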

Comparison against the existing API

Now, looking at what's currently necessary to initialize and run an equivalent program …

```rust
fn main() {
    let mut prog = AscentProgram::default();
    prog.edge = vec![(1, 2), (2, 3)];
    prog.run();
    println!("path: {:?}", prog.path);
}
```

… one might wonder "fine, but what did we gain by all this? What's wrong about the last snippet? It's short, it's clean."

But it's also misleading. Why? Because unlike the "new" API it allows for all kinds of incorrect usages and has a massive API surface with no clear guidance as to which parts of it are actually intended for user access.

## Usability/Safety issues of the low-level API

The current "low-level" API can easily be used incorrectly by …

- … passing in relations with duplicates, violating Datalog's set semantics.
- … messing with the program's internal state, possibly corrupting it.
- … removing or mutating facts between repeated runs of the program.

The new API does not suffer from any of these.
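
To illustrate just the first point in isolation: a plain `Vec` happily holds the same fact twice, whereas collecting into a `HashSet` collapses duplicates. The sketch below uses only the standard library and a hypothetical helper, `to_fact_set`; it makes no claim about how ascent itself treats duplicate input.

```rust
use std::collections::HashSet;

/// Hypothetical helper: collapses duplicate facts by collecting into a set.
fn to_fact_set(raw: &[(i32, i32)]) -> HashSet<(i32, i32)> {
    raw.iter().copied().collect()
}

fn main() {
    // Nothing stops a caller from putting the same fact into a Vec twice.
    let raw = vec![(1, 2), (2, 3), (1, 2)];
    let set = to_fact_set(&raw);
    println!("vec: {} facts, set: {} facts", raw.len(), set.len());
    // prints "vec: 3 facts, set: 2 facts"
}
```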

The proposed API would be extensible as well:

Possible differential future

If at some point ascent were to get proper support for re-entrant, differential programs, then AscentProgram could be extended like so:

impl AscentProgram {
    fn facts(&self) -> AscentProgramFacts {
        AscentProgramFacts {
            // Clone rather than move: `self` is only borrowed here.
            edge: self.runtime.edge.iter().cloned().collect(),
            path: self.runtime.path.iter().cloned().collect(),
        }
    }

    /// Extends the existing facts with the contents of `facts`, avoiding duplicates.
    fn extend_facts(&mut self, facts: AscentProgramFacts) {
        self.extend_edge_facts(facts.edge);
        self.extend_path_facts(facts.path);
    }

    /// Extends the existing `edge` facts with the contents of `facts`, avoiding duplicates.
    fn extend_edge_facts(&mut self, mut facts: HashSet<(i32, i32)>) {
        for edge in &self.runtime.edge {
            facts.remove(edge);
        }
        self.runtime.edge.extend(facts);
    }

    /// Extends the existing `path` facts with the contents of `facts`, avoiding duplicates.
    fn extend_path_facts(&mut self, mut facts: HashSet<(i32, i32)>) {
        for path in &self.runtime.path {
            facts.remove(path);
        }
        self.runtime.path.extend(facts);
    }

    /// Runs the program to completion, retaining its state.
    pub fn run_mut(&mut self) {
        self.runtime.run();
    }
}

Let's say that for some reason your program's edge facts were coming in one by one, and you wanted to query the paths at each epoch:

fn main() {
    let mut prog = AscentProgramBuilder::default()
        .build();

    let edges = vec![(1,2), (3,4), (2,4), (2,3)];
    for edge in edges {
        prog.extend_edge_facts(HashSet::from_iter([edge]));
        prog.run_mut();

        let facts = prog.facts();
        println!("{:?}", facts.path);
    }
}

Again, the API protects the user from accidents by only allowing actions that keep the program valid, while also allowing for re-entrancy.
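
For anyone who wants to play with the epoch-by-epoch idea, here is a self-contained sketch that can be run as is. It is built on assumptions: `AscentProgramRuntime` is a hand-written stand-in that simply re-runs a naive fixpoint loop (no actual differential evaluation), and `paths` and `paths_per_epoch` are hypothetical helpers added for the demo.

```rust
use std::collections::HashSet;

type Pair = (i32, i32);

/// Hypothetical stand-in for the generated low-level runtime.
#[derive(Default)]
struct AscentProgramRuntime {
    edge: Vec<Pair>,
    path: Vec<Pair>,
}

impl AscentProgramRuntime {
    fn run(&mut self) {
        // Start from the paths of earlier runs plus path(x, y) <-- edge(x, y).
        let mut path: HashSet<Pair> = self.path.iter().chain(&self.edge).copied().collect();
        loop {
            // path(x, z) <-- edge(x, y), path(y, z);
            let mut derived = Vec::new();
            for &(x, y) in &self.edge {
                for &(y2, z) in &path {
                    if y == y2 && !path.contains(&(x, z)) {
                        derived.push((x, z));
                    }
                }
            }
            if derived.is_empty() {
                break;
            }
            path.extend(derived);
        }
        self.path = path.into_iter().collect();
    }
}

#[derive(Default)]
struct AscentProgram {
    runtime: AscentProgramRuntime,
}

impl AscentProgram {
    /// Adds only those `edge` facts the program does not already hold.
    fn extend_edge_facts(&mut self, mut facts: HashSet<Pair>) {
        for edge in &self.runtime.edge {
            facts.remove(edge);
        }
        self.runtime.edge.extend(facts);
    }

    /// Runs the program to completion, retaining its state.
    fn run_mut(&mut self) {
        self.runtime.run();
    }

    /// Hypothetical helper: a sorted snapshot of the current `path` facts.
    fn paths(&self) -> Vec<Pair> {
        let mut paths = self.runtime.path.clone();
        paths.sort();
        paths
    }
}

/// Feeds the edges in one per epoch and records the paths after each run.
fn paths_per_epoch(edges: &[Pair]) -> Vec<Vec<Pair>> {
    let mut prog = AscentProgram::default();
    let mut epochs = Vec::new();
    for &edge in edges {
        prog.extend_edge_facts(std::iter::once(edge).collect());
        prog.run_mut();
        epochs.push(prog.paths());
    }
    epochs
}

fn main() {
    for paths in paths_per_epoch(&[(1, 2), (3, 4), (2, 4), (2, 3)]) {
        println!("{:?}", paths);
    }
}
```

With these four edges the path relation grows from one to two, four, and finally six facts across the epochs, and no fact ever disappears between runs.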
