Closed SchrodingerZhu closed 6 months ago
Will it be able to figure out the best way to assign reuse token so that the changed fields can be minimized?
Ideally, that would be the case. For now, the Koka compiler just greedily picks a reuse token that has the same size and preferably the same tag as the current allocation. See: https://github.com/koka-lang/koka/blob/master/src/Backend/C/ParcReuse.hs#L280
Will it be able to figure out the best way to assign reuse token so that the changed fields can be minimized?
I think that is a very difficult thing to optimize. For example, we also want to maximize the probability that reuse can succeed in the first place. Take:
match xs
Cons(x, _) ->
val ys = if x > 0 then Cons(x, Nil) else Nil
(x, ys)
Here we have two allocations: One for the Cons
and one for the pair. Both could reuse the Cons
cell of xs
. However, the Cons
allocation will only happen sometimes, while the pair allocation will always happen. As such, we should prefer to reuse at the pair allocation -- but the current Koka compiler prefers the first allocation it encounters, which is the Cons
allocation. Unfortunately, to do all of this optimally seems to be NP-complete, as https://www2.cs.arizona.edu/~debray/Publications/ctgc.pdf show. I think it would be very nice to have a reasonable heuristic for this in Koka, but I am also not sure how much of a performance benefit it would bring.
I am considering whether we can do some super-optimization trick with egg. I cannot say it will be useful in practice, but it can be interesting to express Perceus and Drop-Guided Reuse in rewrite rules:
use egg::{Rewrite, SymbolLang, rewrite as rw, Language, CostFunction, Id, LpCostFunction, EGraph};
fn rewrite_rules() -> Vec<Rewrite<SymbolLang, ()>> {
return vec![
// (free ?variable ?layout ?continuation)
// (drop ?variable ?layout ?continuation)
// (dup ?variable ?continuation)
// (construct ?constructor ?layout)
// (construct-at ?variable ?constructor ?layout)
rw!("drop-drop-exchange"; "(drop ?a ?la (drop ?b ?lb ?c))" => "(drop ?b ?lb (drop ?a ?la ?c))"),
rw!("drop-dup-exchange"; "(drop ?a ?la (dup ?b ?c))" => "(dup ?b (drop ?a ?la ?c))"),
rw!("drop-free-exchange"; "(drop ?a ?la (free ?b ?lb ?c))" => "(free ?b ?lb (drop ?a ?la ?c))"),
rw!("drop-dup-cancel"; "(drop ?x ?lx (dup ?x ?y))" => "?y"),
rw!("drop-reuse-rewrite"; "(drop ?x ?lx ?y)" => "(let ?x (drop-for-reuse ?x) (free ?x ?lx ?y))"),
rw!("drop-let-binder-permeate"; "(drop ?a ?la (let ?b ?c ?d))" => "(let ?b (drop ?a ?la ?c) ?d)"),
rw!("drop-let-body-permeate"; "(drop ?a ?la (let ?b ?c ?d))" => "(let ?b ?c (drop ?a ?la ?d))"),
rw!("drop-condition-permeate"; "(drop ?a ?la (if ?c ?d ?e))" => "(if (drop ?a ?la ?c) ?d ?e)"),
rw!("drop-branch-permeate"; "(drop ?a ?la (if ?c ?d ?e))" => "(if ?c (drop ?a ?la ?d) (drop ?a ?la ?e))"),
rw!("dup-dup-exchange"; "(dup ?a (dup ?b ?c))" => "(dup ?b (dup ?a ?c))"),
rw!("dup-drop-exchange"; "(dup ?a (drop ?b ?lb ?c))" => "(drop ?b ?lb (dup ?a ?c))"),
rw!("dup-free-exchange"; "(dup ?a (free ?b ?lb ?c))" => "(free ?b ?lb (dup ?a ?c))"),
rw!("dup-drop-cancel"; "(dup ?x (drop ?x ?lx ?y))" => "?y"),
rw!("dup-let-binder-permeate"; "(dup ?a (let ?b ?c ?d))" => "(let ?b (dup ?a ?c) ?d)"),
rw!("dup-let-body-permeate"; "(dup ?a (let ?b ?c ?d))" => "(let ?b ?c (dup ?a ?d))"),
rw!("dup-condition-permeate"; "(dup ?a (if ?c ?d ?e))" => "(if (dup ?a ?c) ?d ?e)"),
rw!("dup-branch-permeate"; "(dup ?a (if ?c ?d ?e))" => "(if ?c (dup ?a ?d) (dup ?a ?e))"),
rw!("free-free-exchange"; "(free ?a ?la (free ?b ?lb ?c))" => "(free ?b ?lb (free ?a ?la ?c))"),
rw!("free-dup-exchange"; "(free ?a ?la (dup ?b ?c))" => "(dup ?b (free ?a ?la ?c))"),
rw!("free-drop-exchange"; "(free ?a ?la (drop ?b ?lb ?c))" => "(drop ?b ?lb (free ?a ?la ?c))"),
rw!("free-let-binder-permeate"; "(free ?a ?la (let ?b ?c ?d))" => "(let ?b (free ?a ?la ?c) ?d)"),
rw!("free-let-body-permeate"; "(free ?a ?la (let ?b ?c ?d))" => "(let ?b ?c (free ?a ?la ?d))"),
rw!("free-condition-permeate"; "(free ?a ?la (if ?c ?d ?e))" => "(if (free ?a ?la ?c) ?d ?e)"),
rw!("free-branch-permeate"; "(free ?a ?la (if ?c ?d ?e))" => "(if ?c (free ?a ?la ?d) (free ?a ?la ?e))"),
rw!("free-construct-rewrite"; "(free ?var ?layout (construct ?constructor ?layout))" => "(construct-at ?var ?constructor ?layout)")
];
}
pub const CONSTRUCT_WEIGHT : f64 = 64.0;
pub const FREE_WEIGHT : f64 = 56.0;
pub const CONSTRUCT_AT_WEIGHT : f64 = 24.0;
pub const DROP_WEIGHT : f64 = 32.0;
pub const DROP_REUSE_WEIGHT : f64 = 8.0;
pub const DUP_WEIGHT : f64 = 4.0;
#[derive(Debug)]
pub struct Heuristic;
impl LpCostFunction<SymbolLang, ()> for Heuristic {
fn node_cost(&mut self, egraph: &EGraph<SymbolLang, ()>, eclass: Id, enode: &SymbolLang) -> f64
{
match enode.op.as_str() {
"free" => FREE_WEIGHT,
"drop" => DROP_WEIGHT,
"drop-for-reuse" => DROP_REUSE_WEIGHT,
"dup" => DUP_WEIGHT,
"construct" => CONSTRUCT_WEIGHT,
"construct-at" => CONSTRUCT_AT_WEIGHT,
_ => 2.0
}
}
}
#[cfg(test)]
mod test {
use egg::{AstSize, Extractor, LpExtractor, Runner};
#[test]
fn cfg() {
let start = r#"
(let x (get-argument)
(if (is-cons-cell x)
(dup x.t
(dup x.h
(drop x.h (layout 8 8) (drop x.t (layout 16 8) (drop x (layout 16 8) (construct "Cons" (layout 16 8)))))))
(drop x (layout 16 8) (struct (construct "Nil" (layout 16 8))))
)
)"#.parse().unwrap();
let runner = Runner::default().with_expr(&start).run(&super::rewrite_rules());
let lp_best = LpExtractor::new(&runner.egraph, super::Heuristic)
.timeout(20.0)
.solve(runner.roots[0]);
println!("{}", lp_best.pretty(64));
}
}
This gives a program like:
(let
x
get-argument
(if
(is-cons-cell x)
(let x (drop-for-reuse x) (construct-at x Cons (layout 16 8)))
(drop x (layout 16 8) (struct (construct Nil (layout 16 8))))))
Wait, in the dup-drop-exchange
rule there is no constraint on a
and b
right? Is that rule correct?
However, I like this approach in general. I think the general idea of "pushing down" drops and "pushing up" dups is quite helpful in thinking about optimizing this. In particular, the free-construct-rewrite
rule is exactly how we model reuse analysis formally (see page 16 of https://www.microsoft.com/en-us/research/uploads/prod/2021/11/flreuse-tr.pdf).
That may cause some problem, I am only using the default symbol rewrite rule here without considering more complicated cases.
Yeah, I get the idea of free-construct-rewrite
exactly from the drop-guided reuse example.
Just to add up: To reproduce something in the original paper, one can add more rules:
// here we use layout to distinguish type; one can should use more complicated type info instead
rw!("drop-reuse-rewrite"; "(drop ?x ?lx ?y)" => "(free (drop-for-reuse ?x ?lx) ?lx ?y)"), // change the rewrite rule for drop-reuse
rw!("drop-free-merge"; "(drop ?a ?la (free ?b ?lb ?c))" => "(free (drop ?a ?la ?b) ?lb ?c)"),
rw!("drop-2-cell-expansion"; "(drop ?a (layout 16 8) ?c)" => "(if (is-unique ?a) (drop (head ?a) (layout 8 8) (drop (tail ?a) (layout 8 8) (free ?a (layout 16 8) ?c))) (decref ?a ?c))"),
rw!("reuse-2-cell-expansion"; "(drop-for-reuse ?a (layout 16 8))" => "(if (is-unique ?a) (drop (head ?a) (layout 8 8) (drop (tail ?a) (layout 8 8) ?a)) (decref ?a null))"),
This will transform sth like:
(let x (get-argument)
(if (is-pair x)
(dup (head x)
(dup (tail x)
(drop x (layout 16 8) (construct "Pair" (layout 16 8)))))
(drop x (layout 16 8) (struct (construct "SomethingElse" (layout 16 8))))
)
)
into
(let
x
get-argument
(if
(is-pair x)
(if
(is-unique x)
(construct-at x Pair (layout 16 8))
(dup
(tail x)
(dup (head x) (decref x (construct Pair (layout 16 8))))))
(drop
x
(layout 16 8)
(struct (construct SomethingElse (layout 16 8))))))
Again, I am just to share the ideas and questions during my studying your great papers. Thanks very much for all these follow-ups.
Should we move this to a discussion topic, or is there something actionable for Koka here? @anfelor you definitely know more about this aspect of Koka. Maybe we can make smaller issues for definite optimization opportunities.
Yeah, this could be moved to discussions or closed since there is no performance issue here. I have since come up with a slightly better reuse matching algorithm that I hope to add to Koka in the future, but I expect the improvement would be quite minor.
In the reuse analysis, if within a match arm (imaging balancing a redblack tree where we have multiple level patterns) contains a lot of reusable memory blocks with same layout, what will the koka compiler do? Will it be able to figure out the best way to assign reuse token so that the changed fields can be minimized? (I am not sure if it is optimal condition since there are side effects of dup/drop operations)