Closed TimWhiting closed 6 months ago
Okay, I think that I have a better benchmark setup which actually shows improvement, I'm going to revisit the other benchmarks.
effect one_e
fun one(i: int): int
effect two_e
ctl two(): bool
fun test1(i: int): <two_e,one_e> int
val x = one(i)
val y = two()
if y then
x + one(i - 1)
else
x + one(i)
fun test(acc: int): <one_e> int
with handler
ctl two()
val x: int = resume(True)
val y: int = resume(False)
(x + y) / (x / 2)
test1(acc)
fun main(): <div,console> ()
var acc := 1
var iter := 0
with handler
fun one(i: int) i + acc
while { iter < 5000000 }
acc := test(acc)
iter := iter + 1
acc.println
Results: (effect is the program directly above, effect_1 and effect_2 are the programs from the original post)
Normalized relative to the program without linear annotation.
Unoptimized:
--- effects ----------------
effects, nonlinear, 1.00254s ~0.017, 2986kb, size: 5877kb
effects, linear, 0.92018s ~0.016, 2954kb, size: 5679kb
--- effects_1 ----------------
effects_1, nonlinear_1, 0.11573s ~0.002, 67000kb, size: 5888kb
effects_1, linear_1, 0.11654s ~0.002, 67000kb, size: 5813kb
--- effects_2 ----------------
effects_2, nonlinear_2, 1.44056s ~0.024, 440000kb, size: 5940kb
effects_2, linear_2, 1.57968s ~0.012, 612000kb, size: 5870kb
--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.017, 1.00x, size: 1.00000x
effects, linear, 0.91784x ~0.016, 0.99x, size: 0.96636x
--- normalized effects_1 ----------------
effects_1, nonlinear_1, 1.00000x ~0.002, 1.00x, size: 1.00000x
effects_1, linear_1, 1.00696x ~0.002, 1.00x, size: 0.98733x
--- normalized effects_2 ----------------
effects_2, nonlinear_2, 1.00000x ~0.024, 1.00x, size: 1.00000x
effects_2, linear_2, 1.09658x ~0.012, 1.39x, size: 0.98827x
Optimized with -O3:
--- effects ----------------
effects, nonlinear, 0.92644s ~0.007, 3002kb, size: 6101kb
effects, linear, 0.90767s ~0.008, 2984kb, size: 5727kb
--- effects_1 ----------------
effects_1, nonlinear_1, 0.11369s ~0.001, 67000kb, size: 6022kb
effects_1, linear_1, 0.11392s ~0.001, 67000kb, size: 5928kb
--- effects_2 ----------------
effects_2, nonlinear_2, 1.37114s ~0.022, 440000kb, size: 6163kb
effects_2, linear_2, 1.50733s ~0.011, 612000kb, size: 6014kb
--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.007, 1.00x, size: 1.00000x
effects, linear, 0.97974x ~0.008, 0.99x, size: 0.93879x
--- normalized effects_1 ----------------
effects_1, nonlinear_1, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_1, linear_1, 1.00199x ~0.001, 1.00x, size: 0.98427x
--- normalized effects_2 ----------------
effects_2, nonlinear_2, 1.00000x ~0.022, 1.00x, size: 1.00000x
effects_2, linear_2, 1.09933x ~0.011, 1.39x, size: 0.97571x
A version of the second one which is actually polymorphic:
effect inc
ctl inc(): int
linear effect one_e<a>
fun one(start: a): a
fun list(i: int): div list<int>
if i == 0 then
[]
else
Cons(i, list(i - 1))
fun test1(i: int, start: a, other: a): <div,inc,one_e<a>> list<a>
if i == 0 then
[one(start)]
else
list(i).map fn(_)
val x = one(start)
if inc() % 2 == 0 then one(x) else one(x)
fun main(): <div,console> ()
val res = ref("")
val a = fn()
var x:= 0
with handler
fun one(a:string) a ++ "one"
with handler
ctl inc()
resume(x)
resume(x + 1)
x := x + 1
res := !res ++ test1(15, "Hello ", " World").join("").count.show
a()
(!res ++ "\ndone").println()
Debug
--- effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 0.02961s ~0.001, 4618kb, size: 5786kb
effects_2_poly, linear_2_poly, 0.03010s ~0.000, 4636kb, size: 5728kb
--- normalized effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_2_poly, linear_2_poly, 1.01658x ~0.000, 1.00x, size: 0.98996x
Optimized:
--- effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 0.03043s ~0.001, 4634kb, size: 5978kb
effects_2_poly, linear_2_poly, 0.03050s ~0.001, 4632kb, size: 5907kb
--- normalized effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_2_poly, linear_2_poly, 1.00222x ~0.001, 1.00x, size: 0.98821x
Changing the effects program to include more non-tail call operations causes a much clearer improvement.
linear effect one_e
fun one(i: int): int
effect two_e
ctl two(): bool
fun test1(i: int): <two_e,one_e> int
val x = one(i) + i + 2 + one(i + 1) + 4 + one(i + 4)
val y = two()
if y then
x + one(i - 1) + (x/2) + 2 + one(i + 1) + 4 + one(i / 2)
else
x + one(i) + x + 2 + one(i + 1) + 4 + one(i + 4)
fun test(acc: int): <one_e> int
with handler
ctl two()
val x: int = resume(True)
val y: int = resume(False)
(x + y) / (x / 2)
test1(acc)
fun main(): <div,console> ()
var acc := 1
var iter := 0
with handler
fun one(i: int) i + acc
while { iter < 5000000 }
acc := test(acc)
iter := iter + 1
acc.println
--- effects ----------------
effects, nonlinear, 1.18049s ~0.008, 3020kb, size: 6503kb
effects, linear, 1.11042s ~0.007, 2992kb, size: 5994kb
--- effects_optimized ----------------
effects_optimized, nonlinear_optimized, 1.15140s ~0.032, 2992kb, size: 6503kb
effects_optimized, linear_optimized, 1.11520s ~0.024, 3000kb, size: 5994kb
--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.008, 1.00x, size: 1.00000x
effects, linear, 0.94065x ~0.007, 0.99x, size: 0.92174x
--- normalized effects_optimized ----------------
effects_optimized, nonlinear_optimized, 1.00000x ~0.032, 1.00x, size: 1.00000x
effects_optimized, linear_optimized, 0.96856x ~0.024, 1.00x, size: 0.92174x
Can anyone point me to a particular pattern that would cause the performance to really change by adding a linear annotation? I think the monadic translation doesn't actually incur much overhead, and the information passed in the evidence passing about whether the function is tail resumptive is actually the thing that makes a big difference.
It seems that the linear annotation does not actually improve performance in general in my tests. I could be misunderstanding what kinds of patterns could cause the difference in performance to be noticeable though.
Using the
ctl
vsfun
keyword does improve performance a lot.I definitely see a reduction in code generated for the linear annotation, but I've tried several different code patterns to see how much the linear annotation actually improves execution time, and there seems to be no difference.
I've tried alternating nesting calls to a (linear) effect and an effect with a nondeterministic ctl operator, using the linear effect in the polymorphic map function (polymorphism is explicitly mentioned in the documentation as a place where the linear annotation would improve performance). I've also tried a variety of other code patterns, and haven't seen a difference.
Here is some sample code variations that I have tried to craft to see if there was actually a performance benefit. (Changing one_e from effect to linear effect seems to make no difference). I've also tried larger programs.