Open purpleidea opened 7 months ago
I've pushed some non-working initial code and a bunch of tests which I think are correct. (Fixes/changes welcome!)
https://github.com/purpleidea/mgmt/tree/feat/include-class-scope
The lexer/parser stuff is correct AFAICT.
I think it's just the SetScope logic which needs fixing...
Basically the Scope adding stuff works, but it's pulling from the Scope of the Class that we're including, which wasn't set yet... Hmmm...
@gelisam
So I've been writing a lot of the ordering code... I seem to have fixed at least the basic ordering stuff. For this example:
-- main.mcl --
# reverse order
test $i0.f0 {}
include c0 as i0
class c0() {
$f0 = "hello"
}
-- OUTPUT --
Vertex: test[hello]
The graph now looks like:
And then I thought of a concern:
Since we added a new "prefix:" type name (I actually named it "scoped:") is there not a worry that the ordering graph adds edges between an item that's in the wrong scope?
In that vein, I also realized that what we're currently doing in git master is:
(1) Running Ordering() (2) Running TopoSort on the graph that comes out (3) Filtering that TopoSort to eliminate any nodes that aren't in the StmtProg body that we're looping over...
So I thought to myself, how about instead we:
(1) Run Ordering() (2) Filter that graph to eliminate anything not in StmtProg's body for that iteration. (3) Run TopoSort on a much smaller graph.
I think this second option is more efficient, however it seems that the filtered graph now looks like:
Naturally, this doesn't work because we're missing the edge. Previously it's an indirect edge that goes through var(i0.f0)
.
So:
1) Is my concern about namespace/scope issues incorrect, and it's fine to squish it all together?
2) Is my graph filtering approach wrong? Should I instead be filtering out anything that's part of the connected graph? -- But if so, wouldn't that pull in incorrect things?
At this point I'm assuming there isn't a scope issue, but I don't have a test case to prove it yet.
All my code is sitting at feat/new-set-scope
. Cheers!
Is my graph filtering approach wrong?
Yes, very wrong. Very few edges go directly from a statement to statement, most edges go from an StmtBind to an ExprVar inside another statement. So if you delete the edges which go from an StmtBind to an ExprVar, you're not going to be left with enough information to decide the ordering.
Should I instead be filtering out anything that's part of the connected graph?
I don't understand what that means. There is only a single connected component in the first DAG, which part do you want to keep and which part do you want to delete?
Is my concern about namespace/scope issues incorrect, and it's fine to squish it all together?
I don't understand your concern, can you give an example? Are you thinking about shadowing? For example, in
$x = 1
if true {
$y = $x
$x = 2
}
I imagine that $y = $x
might receive the var:x
produce from $x = 1
and incorrectly add an edge from $x = 1
to $y = $x
instead of from $x = 2
to $y = $x
.
If you want a more principled way to calculate the scope without having to repeat your scoping logic in SetScope and in Ordering, I recommend using "lazy computations". A lazy computation is an object which stores a computation which needs to be done, plus some other fields. Running the computation within is called "forcing" the lazy computation. After the computation has been forced, the result is stored in one of the other fields, so that the next time the computation is forced, the result can be returned immediately without running the computation a second time. As a special case, a boolean also tracks whether the computation is currently being forced, so that if the lazy computation being forced asks to force this same lazy computation, we know that there is a loop and that the computation will never end.
The idea is that we know we want to call SetScope on all the statements at some point, but we don't know in which order we should do it yet. We want to do it in dependency order, but we will only discover that order as we execute the various SetScope calls on all the statements. So:
f0
during the execution of those calculations, you can lookup the lazy computation for that variable and force it before processing the ExprVar, thus ensuring that SetScope has been called on the definition site before you process the first ExprVar which refers to it. In practice, the first ExprVar will call SetScope on the definition site and the later ExprVars will use the result stored in the lazy computation's other field. ExprSingleton implements this idea for Graph.i0.f0
during the execution of those calculations, you can first force the lazy computation for i0
, and then lookup f0
in the result.So, what's the plan? Are you implementing this ticket while I switch to whatever's the next priority thing? Or am I taking over this ticket?
I have looked at all 10 failing tests on your feat/new-set-scope
branch, and in all cases I think the behaviour is correct, it's the expected output which is wrong. I thus think we should merge it, and consider lazy computations another time.
And this versions of the above counter-example passes as well, so it doesn't even seem like we'd be leaving something on the table.
-- main.mcl --
$x = "one"
if true {
$y = $x
$x = "two"
test $y {}
}
-- OUTPUT --
Vertex: test[two]
So, what's the plan? Are you implementing this ticket while I switch to whatever's the next priority thing? Or am I taking over this ticket?
I was a bit side-tracked tinkering with some provisioning code. Was likely going to resume here tomorrow and figure out what's left.
10 failing tests
I thought the test we last looked at with the args coming in was not working correctly. (Our bug) but maybe with my Ordering changes it's okay now? I'll check that tomorrow and let you know!
I am cleaning up the tests and the code... This popped out (new test)...
$x = "i am x" # i am top-level
class c2() {
$z = "i am y and " + $x
# Since $x is pulled in from top-level automatically, we don't allow the
# re-definition by shadowing of the same variable.
#$x = $x # not allowed
$tmpx = $x
#$x = $x + "wow" # allowed?
$x = $tmpx + "wow" # allowed!
}
include c2 as f1
test $f1.z {}
test $f1.x {} # tricky
test $f1.newx {}
None of these work because they see the ordering DAG as circular! Woops. So shadowing is not working properly. So basically this means I can't use any variable in the parent scope. I think that blocking $x = $x is okay though.
(But this is actually not trivial... should we be able to shadow this? Hmmm....)
Secondly I have an ordering bug:
interpret_test.go:893: test #86: could not set scope: variable f1.x2 not in scope
This happens some of the time. Presumably because the topo sort of the dag can be traversed two ways.
-- main.mcl --
$x1 = "i am x1" # i am top-level
$x2 = "i am x2" # i am top-level
class c2() {
$z = "i am y and " + $x1
$x1 = "hey" # shadow
}
include c2 as f1
test $f1.z {}
test $f1.x1 {}
# the really tricky case
test $f1.x2 {}
-- OUTPUT --
Vertex: test[hey]
Vertex: test[i am x2]
Vertex: test[i am y and hey]
So I need to fix that, any ideas?
$z = "i am y and " + $x $tmpx = $x $x = $tmpx + "wow"
None of these work because they see the ordering DAG as circular! Woops. So shadowing is not working properly.
$tmpx
depends on $x
which depends on $tmpx
. Seems circular to me. Since the $x = "i am x"
at the top is shadowed by the $x
inside the class, this code is circular with or without $x = "i am x"
, right?
#$x = $x # not allowed #$x = $x + "wow" # allowed?
I can't imagine any sane semantics in which one would be allowed but not the other. What's the difference?
# the really tricky case test $f1.x2 {}
Remember, while we paired, I pointed out that the naive implementation of include as
would allow this, even though this is clearly not what we want, and we discussed a future implementation using two separate scopes which would successfully reject this. But you want to allow it?? Why?
Btw the reason $f1.x2
is sometimes allowed is because there is no dependency from $f1.x2
to $x2 = ...
. There is a dependency from $f1.x2
to include c2 as f1
, thanks to prefix:f1
. And there is a dependency from include c2 as f1
to class c2 {...}
. But there is no dependency from class c2 {...}
to the variables which do not occur in c2
. Adding a dependency to c2
based on how f1
is used rather than on c2
itself seems like a bad idea.
$tmpx depends on $x which depends on $tmpx. Seems circular to me. Since the $x = "i am x" at the top is shadowed by the $x inside the class, this code is circular with or without $x = "i am x", right?
Yeah, I'm dumb for forgetting that since $x is declared (shadowed) in the child scope, that it's circular with itself there! Woops! NOT_A_BUG.
...
#$x = $x # fine with me if it must be the case, but I don't _really_ care.
#$x = $x + "wow" # important to be able to do this.
I can't imagine any sane semantics in which one would be allowed but not the other. What's the difference?
I see your point. What I was kind of going for, was that we're shadowing the top-level $x
, but also exporting a new $x
which is based on the old $x. This feels kind of important, since my earlier example shows it can't depend on a tmp variable. I've updated the comments in the commit.
Remember, while we paired, I pointed out that the naive implementation of include as would allow this, even though this is clearly not what we want, and we discussed a future implementation using two separate scopes which would successfully reject this. But you want to allow it?? Why?
It's really not harmful as long as it's not blocking the above example with the "important to be able to do this" comment. If it's easier for us code wise, we can leave it. If it makes things easier to remove it, so be it. If you feel strongly about one or the other, I can be persuaded!
I've pushed these updated changes to feat/new-set-scope https://github.com/purpleidea/mgmt/tree/feat/new-set-scope If you can patch this branch to fix the ordering thing and the shadow thing, it would be most helpful. We can pair on it if you think it would go faster that way.
In parallel (if you're curious) I wrote a neat compiler/sugar hack to automatically nest classes. None of or work should be affected by it, I just thought I'd show you for fun: https://github.com/purpleidea/mgmt/tree/feat/nested-class-sugar
Cheers!
For reference, tests that need looking into:
SHADOWING: import-scope-classes-5.txtar
ORDERING: (run a few times to get an eventual ordering failure) import-scope-classes-4.txtar class-include-as-class0.txtar
Hmm. Are you sure you really need to support the "important to be able to do" case? The semantics where
$x = 1
if true {
$x = $x + 1
$y = $x # 2
}
makes sense is called "non recursive bindings", and in that world definitions must be ordered, you can have this:
$x = 1
$x = $x + 1
$y = $x # 2
but not that:
$y = $x # 1
$x = 1
The semantic in which definitions can appear out of order is called "mutually-recursive bindings" and in that world $x = $x + "wow
is a loop regardless of the other bindings of $x
.
Do you also want
class c {
include c {...}
}
and
include i.c as i
or is it only variables which get the crazy semantics?
(I edited the examples in the previous comment, make sure to look at the GitHub version, not the email version)
I'm cleaning up the current branch as per our discussions. Since I didn't reply above, just for reference:
class c {
include c {...}
}
This looks like recursive classes, and is already handled (and forbidden) in the code. Whether it would make sense is a different discussion, but I suspect probably not.
include i.c as i
or is it only variables which get the crazy semantics?
heh, I don't see any reason why the above should be allowed or why it's necessary. The reason it's not is that if you've defined a class in a particular scope, then this is effectively overwriting that identifier. You have control over this where as for variables you might not have full control. At least I mostly see things this way for now.
I've merged our code to git master! \o/ The corner cases are mentioned in the tests for a future patch. Thanks again!
For future memory, related tasks here:
TODO: For james to fixup and merge: bug/ordering local branch.
Simple example:
Note that the vars can be used. The defined classes can be used. And any resources always "fallthrough" into the global scope.
Note that when used without
as
, the default name would be the same as the class definition itself, so this might be confusing, and not be entirely sensible.