Clarification about linear effect performance

TimWhiting commented 1 year ago

Can anyone point me to a particular pattern that would cause the performance to really change by adding a linear annotation? I think the monadic translation doesn't actually incur much overhead, and the information passed in the evidence passing about whether the function is tail resumptive is actually the thing that makes a big difference.

It seems that the linear annotation does not actually improve performance in general in my tests. I could be misunderstanding what kinds of patterns could cause the difference in performance to be noticeable though.

Using the ctl vs fun keyword does improve performance a lot.

I definitely see a reduction in code generated for the linear annotation, but I've tried several different code patterns to see how much the linear annotation actually improves execution time, and there seems to be no difference.

I've tried alternating nesting calls to a (linear) effect and an effect with a nondeterministic ctl operator, using the linear effect in the polymorphic map function (polymorphism is explicitly mentioned in the documentation as a place where the linear annotation would improve performance). I've also tried a variety of other code patterns, and haven't seen a difference.

Here is some sample code variations that I have tried to craft to see if there was actually a performance benefit. (Changing one_e from effect to linear effect seems to make no difference). I've also tried larger programs.

effect one_e
  fun one(): string

effect two_e
  ctl two(): bool

fun list(i: int): div list<int>
  if i == 0 then
    []
  else
    Cons(i, list(i - 1))

fun test1(i: int): <div,two_e,one_e> string
  val x = list(i).map(fn(x) test1(x - 1)).join("")
  "Hello" ++ x ++ one() ++ (if two() then "World" ++ one() else one() ++ "World2")

fun main(): <div,console> ()
  with handler
    fun one() "one"
  val x = handle({ test1(4)  ++ "done"})
    ctl two() one() ++ resume(True) ++ one() ++ resume(False)
    return(x) x
  x.count.println()

effect inc
  ctl inc(): int

effect one_e
  fun one(): string

fun list(i: int): div list<int>
  if i == 0 then
    []
  else
    Cons(i, list(i - 1))

fun test1(i: int): <div,inc,one_e> string
  if i == 0 then
    one()
  else
    val x = list(i).map(fn(_) one() ++ (if inc() % 2 == 0 then one() ++ " " else " " ++ one())).join("")
    "Hello" ++ one() ++ test1(i - 1)

fun main(): <div,console> ()
  val res = ref("")
  val a = fn()
    var x:= 0
    with handler
      fun one() "one"
    with handler
      ctl inc() 
        resume(x)
        resume(x + 1)
        x := x + 1
    res := !res ++ test1(6).count.show
  a()
  (!res ++ "\ndone").println()

TimWhiting commented 1 year ago

Okay, I think that I have a better benchmark setup which actually shows improvement, I'm going to revisit the other benchmarks.

effect one_e
  fun one(i: int): int

effect two_e
  ctl two(): bool

fun test1(i: int): <two_e,one_e> int
  val x = one(i)
  val y = two()
  if y then
    x + one(i - 1)
  else
    x + one(i)

fun test(acc: int): <one_e> int
  with handler
    ctl two() 
      val x: int = resume(True)
      val y: int = resume(False)
      (x + y) / (x / 2)
  test1(acc)

fun main(): <div,console> ()
  var acc := 1
  var iter := 0
  with handler
    fun one(i: int) i + acc
  while { iter < 5000000 } 
    acc := test(acc)
    iter := iter + 1
  acc.println

TimWhiting commented 1 year ago

Results: (effect is the program directly above, effect_1 and effect_2 are the programs from the original post)

Normalized relative to the program without linear annotation.

Unoptimized:

--- effects ----------------
effects, nonlinear, 1.00254s ~0.017, 2986kb, size: 5877kb
effects, linear, 0.92018s ~0.016, 2954kb, size: 5679kb

--- effects_1 ----------------
effects_1, nonlinear_1, 0.11573s ~0.002, 67000kb, size: 5888kb
effects_1, linear_1, 0.11654s ~0.002, 67000kb, size: 5813kb

--- effects_2 ----------------
effects_2, nonlinear_2, 1.44056s ~0.024, 440000kb, size: 5940kb
effects_2, linear_2, 1.57968s ~0.012, 612000kb, size: 5870kb

--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.017, 1.00x, size: 1.00000x
effects, linear, 0.91784x ~0.016, 0.99x, size: 0.96636x

--- normalized effects_1 ----------------
effects_1, nonlinear_1, 1.00000x ~0.002, 1.00x, size: 1.00000x
effects_1, linear_1, 1.00696x ~0.002, 1.00x, size: 0.98733x

--- normalized effects_2 ----------------
effects_2, nonlinear_2, 1.00000x ~0.024, 1.00x, size: 1.00000x
effects_2, linear_2, 1.09658x ~0.012, 1.39x, size: 0.98827x

Optimized with -O3:

--- effects ----------------
effects, nonlinear, 0.92644s ~0.007, 3002kb, size: 6101kb
effects, linear, 0.90767s ~0.008, 2984kb, size: 5727kb

--- effects_1 ----------------
effects_1, nonlinear_1, 0.11369s ~0.001, 67000kb, size: 6022kb
effects_1, linear_1, 0.11392s ~0.001, 67000kb, size: 5928kb

--- effects_2 ----------------
effects_2, nonlinear_2, 1.37114s ~0.022, 440000kb, size: 6163kb
effects_2, linear_2, 1.50733s ~0.011, 612000kb, size: 6014kb

--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.007, 1.00x, size: 1.00000x
effects, linear, 0.97974x ~0.008, 0.99x, size: 0.93879x

--- normalized effects_1 ----------------
effects_1, nonlinear_1, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_1, linear_1, 1.00199x ~0.001, 1.00x, size: 0.98427x

--- normalized effects_2 ----------------
effects_2, nonlinear_2, 1.00000x ~0.022, 1.00x, size: 1.00000x
effects_2, linear_2, 1.09933x ~0.011, 1.39x, size: 0.97571x

TimWhiting commented 1 year ago

A version of the second one which is actually polymorphic:

effect inc
  ctl inc(): int

linear effect one_e<a>
  fun one(start: a): a

fun list(i: int): div list<int>
  if i == 0 then
    []
  else
    Cons(i, list(i - 1))

fun test1(i: int, start: a, other: a): <div,inc,one_e<a>> list<a>
  if i == 0 then
    [one(start)]
  else
    list(i).map fn(_) 
      val x = one(start) 
      if inc() % 2 == 0 then one(x) else one(x)

fun main(): <div,console> ()
  val res = ref("")
  val a = fn()
    var x:= 0
    with handler
      fun one(a:string) a ++ "one"
    with handler
      ctl inc() 
        resume(x)
        resume(x + 1)
        x := x + 1
    res := !res ++ test1(15, "Hello ", " World").join("").count.show
  a()
  (!res ++ "\ndone").println()

Debug

--- effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 0.02961s ~0.001, 4618kb, size: 5786kb
effects_2_poly, linear_2_poly, 0.03010s ~0.000, 4636kb, size: 5728kb

--- normalized effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_2_poly, linear_2_poly, 1.01658x ~0.000, 1.00x, size: 0.98996x

Optimized:

--- effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 0.03043s ~0.001, 4634kb, size: 5978kb
effects_2_poly, linear_2_poly, 0.03050s ~0.001, 4632kb, size: 5907kb

--- normalized effects_2_poly ----------------
effects_2_poly, nonlinear_2_poly, 1.00000x ~0.001, 1.00x, size: 1.00000x
effects_2_poly, linear_2_poly, 1.00222x ~0.001, 1.00x, size: 0.98821x

TimWhiting commented 1 year ago

Changing the effects program to include more non-tail call operations causes a much clearer improvement.

linear effect one_e
  fun one(i: int): int

effect two_e
  ctl two(): bool

fun test1(i: int): <two_e,one_e> int
  val x = one(i) + i + 2 + one(i + 1) + 4 + one(i + 4)
  val y = two()
  if y then
    x + one(i - 1) + (x/2) + 2 + one(i + 1) + 4 + one(i / 2)
  else
    x + one(i) + x + 2 + one(i + 1) + 4 + one(i + 4)

fun test(acc: int): <one_e> int
  with handler
    ctl two() 
      val x: int = resume(True)
      val y: int = resume(False)
      (x + y) / (x / 2)
  test1(acc)

fun main(): <div,console> ()
  var acc := 1
  var iter := 0
  with handler
    fun one(i: int) i + acc
  while { iter < 5000000 }
    acc := test(acc)
    iter := iter + 1
  acc.println

--- effects ----------------
effects, nonlinear, 1.18049s ~0.008, 3020kb, size: 6503kb
effects, linear, 1.11042s ~0.007, 2992kb, size: 5994kb

--- effects_optimized ----------------
effects_optimized, nonlinear_optimized, 1.15140s ~0.032, 2992kb, size: 6503kb
effects_optimized, linear_optimized, 1.11520s ~0.024, 3000kb, size: 5994kb

--- normalized effects ----------------
effects, nonlinear, 1.00000x ~0.008, 1.00x, size: 1.00000x
effects, linear, 0.94065x ~0.007, 0.99x, size: 0.92174x

--- normalized effects_optimized ----------------
effects_optimized, nonlinear_optimized, 1.00000x ~0.032, 1.00x, size: 1.00000x
effects_optimized, linear_optimized, 0.96856x ~0.024, 1.00x, size: 0.92174x

koka-lang / koka

Clarification about linear effect performance #326