Open kelped opened 7 months ago
I'm in a similar situation to you, having figured out what leapjoins do but not how to use them. For your rules the best I could come up with uses two joins:
let mut iteration = Iteration::new();
let pages: Relation<(u32, &str)> = vec![
(1, "super.com"),
(2, "page.com"),
(3, "subpage.com"),
(4, "minpage.com"),
]
.into();
let edges: Relation<(u32, u32)> = vec![(1, 2), (2, 3), (3, 1), (1, 4)].into();
let pages_source_var: Variable<(u32, &str)> = iteration.variable("pages_source_var");
pages_source_var.insert(pages.clone());
let intermediate_page_edge_var: Variable<(&str, u32)> =
iteration.variable("intermediate_page_edge_var");
let depth_two_var: Variable<(&str, &str)> = iteration.variable("depth_two");
while iteration.changed() {
intermediate_page_edge_var.from_leapjoin(
&pages_source_var,
edges.extend_with(|&(edge_start, _)| edge_start),
|&(_, initial_page_name), &edge_end| (initial_page_name, edge_end),
);
depth_two_var.from_leapjoin(
&intermediate_page_edge_var,
pages.extend_with(|&(_, edge_end)| edge_end),
|&(start_name, _), &end_name| (start_name, end_name),
);
}
println!("{:?}", depth_two_var.complete().elements);
I've thought a bit more about it and here's the explanation I could come up with. Looking at the API for leapjoin
:
fn from_leapjoin(
&self: &Variable<Tuple>,
source: &Variable<SourceTuple>,
leapers: impl Leapers<SourceTuple, Val>,
logic: impl FnMut(&SourceTuple, &Val) -> Tuple
)
&self
is where the results get stored, it represents the left side of the :-
in a rule. I'm going to ignore leapers
for a second, and simplify the rest: from_leapjoin
takes tuples from source: Variable<SourceTuple>
and calls logic: Fn(SourceTuple, ???) -> Tuple
to get tuples to add to self: Variable<Tuple>
.
Without leapers
you could use the function for simple rules, where the left hand side depends only on one variable, e.g. reachable(x, y) :- edge(x, y, cost)
. logic
can just extract x
and y
. But in most cases you'll need to augment that variable's information with other information from the database -- that's where the Val
parameter of logic
comes in.
logic
needs to calculate each entire left-hand-side of the rule using one variable (the source), and Val
. Let's say in your example we pick edges
as the source variable for calculating depthTwo
; then Val
will have to contain both strings W1
and W2
. I'm going to take a look at the leapers that produce Val
.
For every tuple from the source variable, each leaper proposes a set (or restricts an already-proposed set) of Val
s. The intersection of all of these sets is eventually given to logic
, alongside the original source tuple. If we had a leaper proposing as Val
every possible pair of pages (T1, W1)
and (T2, W2)
, then we could use other leapers to restrict that set down to the ones with a given edge between them. Firstly a filter_with
(I think) that would filter out ((T1, W1), (T2, W2))
if the source tuple input isn't an edge starting from T1
, and secondly one that filters those if the source tuple isn't an edge ending in T2
.
depth_two_var.from_leapjoin(
&edges_var,
(
todo!("something proposing all pairs of pages"),
edges.filter_with(|&(start, _)| (start, todo!("not sure what goes here"))),
edges_rev.filter_with(|&(_, end)| (end, todo!("not sure what goes here")))
),
|_src, ((_, start_name), (_, end_name))| (start_name, end_name)
);
The biggest difficulty might be proposing ((T1, W1), (T2, W2))
. The leaper constructors are roughly:
fn function_name(
&self: &Relation<(Key, Val)>,
key_func: Fn(&SourceTuple) -> Key
) -> impl Leaper<SourceTuple, Val>
You can see there's only one input relation, self
, but in order to propose every possible pair of (T1, W1)
and (T2, W2)
we need to access two relations: pages
, and pages
again. (The fact that it's the same relation twice doesn't affect the problem; we need a Cartesian product.) So it turns out that making a leaper proposing all the pairs we need can't be done without doing a second join, a cross join, beforehand, to create the pages × pages
relation.
More succinctly, any data that constructs the tuple on the left-hand-side of the rule has to either come from your chosen source variable, or needs to be from a set enumerated by at least one leaper. If we choose edges
as the source, we need to enumerate every pair of pages.
Anyway, doing a cross join felt inefficient compared to the two leapjoins in my other comment.
Hey, I'm just playing around with datafrog and trying to translate some basic datalog queries to the equivalent Rust code. I am wondering what something like the following snippet looks like in datafrog?
I'm mostly interested in
leapjoin
and have tried something like:But I can't quite wrap my head around how the leapers work even after reading the comment given in the source code of the repo. Any pointers would be appreciated!