Open hurryabit opened 4 years ago
In OOP exhaustive type matching/type casting is achievable with Double Dispatch (aka Visitor Design Pattern). Worth investigating? DD/Visitor is a lot of boilerplate though.
@leo-da We're already using some form of dynamic dispatch for the Kont
type:
https://github.com/digital-asset/daml/blob/067f3c987da11d6292e7cd51baa56f9d5c945528/daml-lf/interpreter/src/main/scala/com/digitalasset/daml/lf/speedy/Speedy.scala#L519
I'm not sure if that's exactly the visitor pattern but it's definitely the same basic idea.
Unfortunately, that might still be too slow because of all the indirections through vtables. Ideally, we would find something based on jump tables like C's switch
or Haskell's case ... of
. But before we dive into the details, we need solid benchmarks.
yep, that is double dispatch: instead of calling Machine
's virtual method directly, Kont
's execute
called first to refine the type of SValue
...
However some execute
implementations are actually doing pattern matching on SValue
... Is that the issue? In any case, you guys know what you are doing :).
We do triple dispatch: on Ctrl
, on Kont
and on Expr
. I eliminated the dispatch on Ctrl
(by making control always be an SExpr
) for 17% speedup (CollectAuthority benchmark)
I think I know how to eliminate the dispatch on Kont
in a similar way. Tomorrow...
To enable us to have only one kind of Kont
we just have to have the effect of entering a continuation always be the same. In essence we will have all continuation be KPushTo
final case class KPushTo(to: util.ArrayList[SValue], next: SExpr) extends Kont {
def execute(v: SValue, machine: Machine) = {
to.add(v)
machine.ctrl = next
}
}
That is: they put the returned value somewhere, then install a new ctrl
ready for the next step()
For the other continuations, we move the desired effect to a new variant of SExpr
, which is used as the next
when the Kont
is created.
We had a discussion about the benefit of loosing the Ctrl
trait and introduction of returnValue
. I re-instated the original behaviour in two steps back on to the speedy-returnValue
branch and got some numbers.
This perf run:
bazel run daml-lf/scenario-interpreter:scenario-perf -- -f 4 -i 10 -wi 10
TL/DR: 111 --> 117 --> 121 ms/op
(1) The branch as it is, with no Ctrl
110.767 ±(99.9%) 2.534 ms/op [Average]
(min, avg, max) = (105.366, 110.767, 121.915), stdev = 4.504
CI (99.9%): [108.233, 113.301] (assumes normal distribution)
(2) Reinstating the Ctrl
trait, and using for the machine controle
The core change is below (with fixups everywhere to the new type)
+ sealed trait Ctrl {
+ def execute(machine: Machine): Unit
+ }
+
+ final case class CtrlExpr(expr: SExpr) extends Ctrl {
+ def execute(machine: Machine) =
+ expr.execute(machine)
+ }
+
+ final case class CtrlValue(value: SValue) extends Ctrl {
+ def execute(machine: Machine): Unit = {
+ machine.popKont().execute(value, machine)
+ }
+ }
+
+
/* The control is what the machine should be evaluating. */
- var ctrl: SExpr,
+ var ctrl: Ctrl,
perf...
116.695 ±(99.9%) 1.981 ms/op [Average]
(min, avg, max) = (107.312, 116.695, 123.405), stdev = 3.521
CI (99.9%): [114.714, 118.676] (assumes normal distribution)
(3) Further reinstating the machine behaviour to always step via a CtrlValue
instead of directly passing the value to the continuation
In essense, this change:
def returnValue(value: SValue): Unit = {
- returnDepth += 1
- if (kontStack.isEmpty) {
- ctrl = CtrlValue(value) // The final resulting value
- } else {
- if (returnDepth < limitReturnDepth) {
- val kont = popKont()
- kont.execute(value, this)
- } else {
- ctrl = CtrlValue(value) // force the current step to end
- }
- }
+ ctrl = CtrlValue(value)
}
perf...
120.739 ±(99.9%) 1.714 ms/op [Average]
(min, avg, max) = (115.624, 120.739, 126.339), stdev = 3.047
CI (99.9%): [119.025, 122.453] (assumes normal distribution)
A CEK machine heavily relies on pattern matching on AST node, continuation stack frames and other data. Unfortunately, pattern matching in Scala is particularly slow. Thus, we need to find alternative ways to discriminate the current AST node in control or the next continuation, ideally a mechanism akin to jump tables.
We should also investigate ways to avoid doing this discrimination whenever possible.
Improvements so far: