Open gelisam opened 11 months ago
Here is a corner case.
$apply1 = func($f, $x) {
$f($x)
}
$apply2 = func($f, $x) {
$f($f($x))
}
$apply = if ($now % 2 == 0) then $apply1 else $apply2
$id = func($y) { $y }
test $apply($id, "foo")
Every second, $apply
will switch from one constant expression to another. Because $apply
changes, the $apply($id, "foo")
call needs a dynamic sub-graph. With the current implementation, each of those sub-graphs will in turn use CallFunc
s for the $f($x)
calls, thus resulting in sub-sub-graphs. Since $id
is statically-known, it should theoretically be possible to for each instantiation of the sub-graph to use the optimization, that is, to skip the CallFunc
s and avoid the need for sub-sub-graphs. Unfortunately, since Value()
uses the scope and does not have Graph()
's sctx
argument, it is not possible to determine what $f
refers to, and thus Value()
must return nil
.
Is it worth to add the sctx
argument in order to address this case, or is this too small of a corner case?
Is it worth to add the sctx argument in order to address this case, or is this too small of a corner case?
I do not know, but let's please definitely add this as a test =D
If for weird examples like this we have a dynamic graph, it is okay. What's the sctx
argument??
Another corner case:
$f1 = func($x) { $x }
$f2 = func($x) { $x + $x }
$id = func($y) { $y }
$f = if $id(true) then $f1 else $f2
test $f("foo")
No higher-order functions this time. In order for $id(true)
to be statically-known, we need to:
full.FuncValue
for $id
Value
for true
full.FuncValue
on the input Value
in order to obtain an output Value
The hard step is (3), because the type
func (obj *full.FuncValue) Call(txn Txn, args []Func) (Func, error)
is not what we need. We need
func (obj *full.FuncValue) SpeculativeCall(args []Value) (Value, error)
is it worth implementing SpeculativeCall
in order to support this use case?
is it worth implementing SpeculativeCall in order to support this use case?
This case (or something that looks like it) seems more plausible, but still in the 20%.
I think our goal for step one
should be to initially make 80% of common mcl code use the simple graph shape, as long as we can extend our optimizations out as a step two
without re-writing all our plumbing for step one
.
Let's add this as a test btw =D
What's the
sctx
argument??
SetScope
's second argument:
func (obj *Expr) SetScope(scope *Scope, sctx map[string]Expr) error
in the mcl-functions
branch I named it context
, I don't know why you renamed it to sctx
. In
$f = func($x) {
func($y) {
...
}
}
it maps $x
and $y
to different ExprParam
s.
Yet another corner case. Lambdas are always constant, but in
$add = func($x) {
func($y) {$x + $y}
}
test $add("foo")("bar")
the nested FuncExpr.Value()
will not be able to produce a full.FuncValue
representing func($y) {$x + $y}
, because once again .Value()
does not receive an sctx
.
Actually, this might not be a corner case at all... The FuncExpr
doesn't know whether it is a nested lambda, so it doesn't know whether the variables in its body are supposed to refer to top-level definitions or to parameters from surrounding lambdas. Thus, it is never safe for FuncExpr.Value()
to return a FuncValue
! If FuncExpr.Value()
tried to return a FuncValue
in the case in which all the variables of its body are in obj.scope
, then
$x = "not this x"
$add = func($x) {
func($y) {$x + $y}
}
test $add("foo")("bar")
would FuncExpr.Value()
would incorrectly generate a FuncValue
corresponding to func($y) {$x + $y}
where $x
refers to "not this x"
instead of referring to the outer lambda's parameter.
It thus looks like Expr.Value()
should take an sctx
parameter. Does it still make sense to use the same Expr.Value()
for obtaining the Value
to give to a resource, or would it make more sense to split Expr.Value()
into Expr.ValueForResource()
and Expr.SpeculativeValue(sctx)
?
Does it still make sense to use the same Expr.Value() for obtaining the Value to give to a resource, or would it make more sense to split Expr.Value() into Expr.ValueForResource() and Expr.SpeculativeValue(sctx)?
Hmmm...
Is it still possible to do a step one
where we don't do any sctx
stuff, just to merge it and have some easy optimizations done, and grow our confidence in the optimization for those easy things?
The advantage of doing this in two steps are:
1) There's a bigger chance I more fully understand what's going on. 2) There's a chance we have a new realization that precludes needing the sctx thing (maybe unlikely)
The down side is we have extra commits in master, but that just means more points! \o/
It will also give us a chance to briefly verify if the signature of SetScope:
SetScope(scope *interfaces.Scope, sctx map[string]interfaces.Expr)
Still makes sense or if this should be a single combined arg or something different.
Actually a good question is: Are you currently happy/okay with the design of SetScope or is it broken in your mind as a stage?
right, right, you want me to make these kinds of decisions on my own... I say we keep a single Expr.Value(sctx)
.
Except we make this speculative call from Graph(env)
, not SetScope
, so it needs to be an env: []Func
, not an sctx: []Expr
.
right, right, you want me to make these kinds of decisions on my own
You're in charge, I'm meant to be here to complain if my spider sense goes off, or if I feel you're writing pure/functional code instead of elegant, imperative, faster golang code ;)
Is it still possible to do a step one where we don't do any sctx stuff, just to merge it and have some easy optimizations done, and grow our confidence in the optimization for those easy things?
Yes, if we give up on optimizing inside lambdas. ExprCall.Graph()
will have to check if sctx
is non-empty, and if so, assume that calling Value()
on the function would give the wrong result, and thus use dynamic sub-graphs.
if so, assume that calling Value() on the function would give the wrong result
So currently we use Value() here for printf... Does this mean that this is currently incorrect? Would we have to plumb through the sctx parameter? (This call happens during unification to inspect the format string.)
I'd try this:
$s = "%d"
$f = func($s) {
print($s, 42)
}
test $f("%s") {}
I suspect that since Value doesn't have an sctx parameter, when printf calls Value()
on the $s
, it might see the top-level $s = "%d"
and incorrectly accept this program instead of failing with a type error which says that we don't know the value of the format string and thus can't check if 42 has the right type.
I'd try this:
Indeed there's an issue! Actually I found a sort-of related bug. Hmmm...
Well here's the interesting point: plumbing through sctx would be very complicated... Which leads me to think that the more appropriate solution is one in which sctx is "captured" within the Expr AST node, so that we don't need to plumb it through the API...
That would not work because the same Expr is used for several lambda calls. You can only store one sctx in the Expr itself, but you can pass a different sctx each time Graph() or Value() is called.
Yes, but if we have to have an sctx inside of Unify, that means we need to plumb this into the unification engine...
I feel like the right approach might involve an Initial Expr* whatever ast node, with it's own sctx, but then when it's used we copy it and have a new sctx contained within for each different Expr?
Or something else?
Remember, we used to do that, but we had to change course in order to support map
! Before mcl-functions, each ExprCall
copied the body of the lambda it called, and then SetScope
was able to make each $x
use site inside a lambda func($x) {...}
point to the argument $f(arg)
of the corresponding call site. We did all that before type checking.
We can't do that with map
, because the number of call sites is not known until runtime, after type checking. So we had to switch to a different implementation strategy in which a full.FuncValue
generates a sub-graphs. Thanks to those sub-graphs, it is possible to change the shape of the function graph at runtime, so that we end up with one CallFunc
for each call site even though the number of call sites is not known until runtime.
But we only create more Funcs, we don't create more Exprs! And we don't need to: lambdas are monomorphic, so the $x
is func($x) {...}
has only one type, so it only needs one copy.
From our phone conversation:
https://github.com/purpleidea/mgmt/blob/6a6546db8d3d481cfaf5adcb71c2329d1284cbd3/lang/ast/structs.go#L8802 https://github.com/purpleidea/mgmt/blob/6a6546db8d3d481cfaf5adcb71c2329d1284cbd3/lang/ast/structs.go#L8685
Both of these ExprParam and ExprPoly return nil, nil. Which we probably don't want.
This might come from here: https://github.com/purpleidea/mgmt/blob/6a6546db8d3d481cfaf5adcb71c2329d1284cbd3/lang/ast/structs.go#L8539
(I will add a patch to add that test in and add errors on those two nils.)
From our phone conversation
More importantly for this MR, since ExprVar.SetScope()
does
obj.scope.Variables[obj.Name] = sctx[obj.Name]
There is no need to add sctx
to Value()
after all.
Here's the patch (just testing)
https://github.com/purpleidea/mgmt/commit/a23312a913cb5263a2cbb90b2c4c3403276b2886
Sort of related:
import "fmt"
$format = "%d" # should get ignored
$fn = func($format) {
fmt.printf($format, 42)
}
test $fn("%s") {}
# should error at unification if possible, otherwise at runtime
So btw for this test, with the dynamic printf enabled ( PrintfAllowNonStaticFormat = true
) this fails at runtime, and with it off, it fails during type unification. I guess that's fine for now. Of course if all the format strings are "%d", then this still can't unify with PrintfAllowNonStaticFormat = false
. Which is too bad, but seems fine for now.
Why do I mention any of this? Because speculatively, I would expect this func to be run at compile time...
(Merged in master.)
I would expect this func to be run at compile time
Wait, do you want to evaluate statically-known expressions at compile-time or to generate a graph with no sub-graphs when possible? It's not the same thing!
Huh, the code on this branch doesn't even compile? 7b5466e fails with
lang/funcs/facts/func.go:90:18: obj.Fact.Call undefined (type Fact has no field or method Call)
Instead of starting from a broken #716, I'll start from master. I'll cherry-pick from #716 later if that turns out to be useful for this MR.
I'm confused, why does this say that I have closed the PR??
Oh! I have reverted gelisam/graph-shape
to point to master
, so now it looks like master
contains all the commits from gelisam/graph-shape
and thus that the branch has been merged! Well played GitHub, this is a clever way to avoid having to store one boolean per PR indicating whether it has been merged or not.
lang/funcs/facts/func.go:90:18: obj.Fact.Call undefined (type Fact has no field or method Call)
Oh, hm weird. Maybe I forgot to include something in my commit? if you need it LMK and I'll make sure it's working.
I have created tests based on the counter-examples in this thread, let's see if GitHub will allow me to reopen the PR...
I have now made sure the new tests pass on CI, not just on my machine, before I start modifying the code.
Did we decide whether we would implement FakeTxn or if we would change Graph() to receive a Txn instead of returning a Graph?
Did we decide whether we would implement FakeTxn or if we would change Graph() to receive a Txn instead of returning a Graph?
We didn't decide, and I'm also undecided. If you feel strongly about one or the other, let's do that. I can't visualize all the implications yet.
I can't visualize all the implications yet.
Which really means, we should go with whatever you think is best, and hopefully we don't discover some reason that makes us need to change our minds after of course =D
Last time we met, I was confused about Expr.Graph()
's env
parameter. It represents the subset of variables which are currently in scope which come from a function parameter. I was worried that the a call site's env
might be passed to the implementation, which would be bad if there is an $x
which is in scope at both places, but it's not the same $x
. Like this:
$make_exclaimer = func($x) {
func() {$x + "!"} # at the definition site, $x is "foo"
}
$use_exclaimer = func($f, $x) {
$f() + $x # at the call site, $x is "bar"
}
test $use_exclaimer($make_exclaimer("foo"), "bar") {}
Luckily, I wrote some comments in ExprFunc.Graph()
and ExprCall.Graph()
which clarify how I am handling the situation.
There are actually two separate cases. In the easy case, the definition site is a top-level definition:
$x = "foo"
$exclaimer = func() {$x + "!"} # at the definition site, $x is "foo"
$use_exclaimer = func($f, $x) {
$f() + $x # at the call site, $x is "bar"
}
test $use_exclaimer($exclaimer, "bar") {}
In that case the definition site's env
should be empty, and the call site clears out the env
when making the recursive Graph()
call, so everything is fine.
In the more complicated case, the definition site is inside another function, in this case $make_exclaimer
. This time there is no recursive call to Graph()
, so there is no opportunity for the call site's env
to be accidentally given to ExprFunc.Graph()
. Instead, the Func
s for the parameters have already been created earlier and the env
is holding on to them. It is when that already-created Func
was created that the definition site's env
was captured. The call site doesn't call this Func
directly, instead it adds an CallFunc
of its own, and it is only a runtime when that Func
sends a full.FuncValue
to that CallFunc
that the env
hidden inside this full.FuncValue
comes into play.
For Value
, this is a bit more complicated because we don't know whether if we're in the easy case or the complicated case. So the first step is to figure that out.
It should be possible to figure it out by traversing the body and check that all of the ExprVar
s refers to an ExprParam
which is either introduced by the lambda itself or by a nested lambda. If so, we are in the easy case and env
should be empty. Otherwise, we are in the more complicated case, which we can keep for later.
Actually, ExprCall.Graph()
only calls,Value()
in that easy case, so in order to validate the approach, we can just unconditionally use the empty env
. But that would cause runtime crashes in other circumstances, so we definitely can't skip the check in the name of only supporting the optimization for the rest case.
@purpleidea I have now dealt with the env
in ExprFunc.Value()
. The optimization should now work in the most common cases, but I can't check because the code doesn't compile. It wasn't compiling for me before I made my changes either.
Sweet. New branch available with your changes and fixes in feat/graph-shape. More comments here: https://github.com/purpleidea/mgmt/issues/717#issuecomment-1909226779
Sorry I've been a bit AFK, I'm mid conferencing and gave my first presentation today. One more on Monday, and then a workshop on Wednesday =D Thanks for all your great work. Ping me for anything you're stuck on or busy work that you need me to refactor/golangify/etc.
Sorry I've been a bit AFK
Me too, I was in NYC for work! I have also been elected as one of the administrators of my building, and it's taking a lot of time. I was actually working on mgmt to relax a bit from all that 😅
I'm mid conferencing and gave my first presentation today. One more on Monday, and then a workshop on Wednesday =D
Woohoo! I hope everybody is as excited about mgmt as you were hoping!
Ping me for anything
Not a chance, focus on your conference!
165 / 206 tests pass, not too bad but there is still a lot of work to do!
up to 183 / 206!
Sweet! Remember TestAstFunc1 doesn't matter since that's just shapes which will change.
On Sun, Feb 4, 2024, 05:02 Samuel Gélineau @.***> wrote:
up to 183 / 206!
— Reply to this email directly, view it on GitHub https://github.com/purpleidea/mgmt/pull/724#issuecomment-1925572476, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABA7M2S363NZG6VI4B52EDYR4B4PAVCNFSM6AAAAABAX2TBGKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMRVGU3TENBXGY . You are receiving this because you were mentioned.Message ID: @.***>
Oh, I didn't even look at TestAstFunc1, those are all TestAstFunc2 failures.
Here's the next failure:
$f = func($x) {$x + "!"}
test $f("a") {}
test $f("b") {}
Since the problem only occurs when the function is called twice, I suspect that we're trying to Revert () the nodes added by the first call, so they're no longer available for the second call.
up to 195 / 206!
The next failure is chained-vars.txtar
, slightly tweaked to make the labels unique:
import "fmt"
$zero = 0
$one = $zero + 1
$two = $one * 2 # needs a chain to panic
test fmt.printf("%d%d%d", $zero, $one, $two) {}
The test always generates the wrong Func graph, but it is non-deterministic, different runs generate different incorrect Func graphs:
I am thus guessing that when we generate the static graph and capture it via the Txn which isn't connected to the engine, the code which generates the static graph spawns some threads and not all the nodes have been added by those threads when the graph is extracted from the Txn.
I'm trying to think what would race there, that feels a bit unusual unless there was something concurrent added in this branch. I'll look at the diff tonight. Could there be a bug or an API mismatch somewhere in the TXN code?
I'll be back in a few days and on a regular hacking schedule again.
Thanks for your work here!
On Sat, Feb 10, 2024, 22:37 Samuel Gélineau @.***> wrote:
The next failure is chained-vars.txtar, slightly tweaked to make the labels unique:
import "fmt"
$zero = 0 $one = $zero + 1 $two = $one * 2 # needs a chain to panic
test fmt.printf("%d%d%d", $zero, $one, $two) {}
The test always generates the wrong Func graph, but it is non-deterministic, different runs generate different incorrect Func graphs:
failure1.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/a0b0c147-0188-4b5f-9211-ec17025a4ea2 failure2.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/e9553b4e-865f-42f1-84c9-6cc233a0e3ef failure3.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/fe4968df-62de-48d7-89ae-75a3b0af723c failure4.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/96734ece-78b9-44af-9d12-ac7d4befed98 failure5.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/654e1b02-1041-495a-80c9-ec4631f58587 failure6.png (view on web) https://github.com/purpleidea/mgmt/assets/49000/54f4c86e-6670-4184-aa05-72eed9bd7e8e
I am thus guessing that when we generate the static graph and capture it via the Txn which isn't connected to the engine, the code which generates the static graph spawns some threads and not all the nodes have been added by those threads when the graph is extracted from the Txn.
— Reply to this email directly, view it on GitHub https://github.com/purpleidea/mgmt/pull/724#issuecomment-1937189321, or unsubscribe https://github.com/notifications/unsubscribe-auth/AABA7M2A4YOILXARWGNZNNTYS7R73AVCNFSM6AAAAABAX2TBGKVHI2DSMVQWIX3LMV43OSLTON2WKQ3PNVWWK3TUHMYTSMZXGE4DSMZSGE . You are receiving this because you were mentioned.Message ID: @.***>
I think the problem is as follows.
For dynamic graphs, it is normal for a Func to hold onto the Txn and to use it to add more nodes asynchronously over time. For the optimization, I need the opposite: I need the full.FuncValue
to add all of its nodes synchronously, before we extract the static graph from the Txn. I assumed that in those cases in which we know that the graph is static, the full.FuncValue
must be adding all of its nodes synchronously. But I was probably mistaken: the graph will quickly reach an equilibrium, but it might do so asynchronously.
I tested my hypothesis by sleeping for three seconds before extracting the graph for the Txn, but that did not fix the problem. Thus, it is not the case that the graph quickly reaches an equilibrium.
Maybe it's not threads, but the random order in which the elements of a map are traversed?
Let's see, so ExprInt.Graph()
is called on 0
, and a ConstFunc is added to the Func graph.
Then, ExprCall.Graph()
is called on $zero + 1
. ExprFunc.Value()
is called speculatively on +
, and it successfully returns a full.FuncValue
. In this branch, there are now two kinds of full.FuncValue
s: timeless and timefull. You and I know that +
is timeless, but we haven't taught mgmt about this fact yet, so the returned full.FuncValue
is actually timefull.
Timeless or timefull, the optimization fires regardless, so the full.FuncValue
is only called once. This particular full.FuncValue
comes from FuncToFullFuncValue
, which adds _operator
's three input edges, synchronously. Looks like there is a step missing to actually add the _operator
itself, that's fine here since adding an edge also adds the input and output nodes, but this is going to cause problems for functions which take zero arguments.
Everything so far seems to work as intended, since that _operator
node and its input edges are always present in the GraphViz representation.
Next, ExprCall.Graph()
is called on $one * 2
. ExprFunc.Value()
is called speculatively on *
, and presumably it again successfully returns a timefull full.FuncValue
. Thus, the optimization fires again.
The FuncToFullFuncValue
adds another _operator
and its three input edges, synchronously. But this time, something goes wrong, because only a random subset of those edges appears in the GraphViz representation.
What are the differences between the +
and *
cases? I can't think of any. They both use the function _operator
, they both have one argument which is a variable and whose target is already in the Func graph when ExprCall.Graph()
is called, and one argument which is a number and is not yet in the Func graph. The one obvious difference is that $one
's target is an _operator
which already has input edges coming into it, instead of a ConstFunc
with no inputs, but what does that change?
Another difference is that $two
comes later in the Ordering dag than $one
. So if there are two threads running and there is a race, maybe the first thread always has time to process $one
but is still in the middle of processing $two
when the second thread wins the race. And the printf
, which is just completely missing from the Func graph, never even gets a chance to get processed. The fact that adding a three second sleep just before extracting the graph from the Txn didn't change anything argues against this idea, but since I'm running out of ideas, er, maybe there's a different place where sleeping would make a difference?
The FuncToFullFuncValue adds another _operator and its three input edges
This jumped out at me... Is the pointer of _operator unique or is there a chance you forgot to copy it or something and it's actually the same operator pointer as previously used?
If that's not it, then don't worry about this, we'll debug together after I look at the code more. I'm back in two days and I'll be focusing more on coding and not demoing/etc. Cheers
is there a chance you forgot to copy it or something and it's actually the same operator pointer as previously used?
I did forget initially, and one of the tests caught it, and that mistake has since been fixed. But who knows, maybe I fixed it incorrectly, just enough to fix that other test but not well enough to fix this one as well?
Nah, there are clearly two _operator
nodes on that GraphViz representation, I don't think that's it. But it's worth trying a variant with a built-in other than _operator
!
Nope, same behaviour if I use contains
instead of *
Working on #717