Corner cases of `include class as`

purpleidea commented 7 months ago

Simple example:

class c1 {
    test "t1" {}
    $x = "hello"
    class c0 {
        test "t2" {}
        $y = "goodbye"
    }
}
include c1 as i1
include i1.c0 as i0

test $i0.x {}
test $i1.y {}
panic($i0.x != "hello")
panic($i1.y != "goodbye")

Note that the vars can be used. The defined classes can be used. And any resources always "fallthrough" into the global scope.

Note that when used without as, the default name would be the same as the class definition itself, so this might be confusing, and not be entirely sensible.

purpleidea commented 7 months ago

I've pushed some non-working initial code and a bunch of tests which I think are correct. (Fixes/changes welcome!)

https://github.com/purpleidea/mgmt/tree/feat/include-class-scope

The lexer/parser stuff is correct AFAICT.

I think it's just the SetScope logic which needs fixing...

Basically the Scope adding stuff works, but it's pulling from the Scope of the Class that we're including, which wasn't set yet... Hmmm...

purpleidea commented 6 months ago

@gelisam

So I've been writing a lot of the ordering code... I seem to have fixed at least the basic ordering stuff. For this example:

-- main.mcl --
# reverse order
test $i0.f0 {}
include c0 as i0
class c0() {
    $f0 = "hello"
}
-- OUTPUT --
Vertex: test[hello]

The graph now looks like:

good-order

And then I thought of a concern:

Since we added a new "prefix:" type name (I actually named it "scoped:") is there not a worry that the ordering graph adds edges between an item that's in the wrong scope?

In that vein, I also realized that what we're currently doing in git master is:

(1) Running Ordering() (2) Running TopoSort on the graph that comes out (3) Filtering that TopoSort to eliminate any nodes that aren't in the StmtProg body that we're looping over...

So I thought to myself, how about instead we:

(1) Run Ordering() (2) Filter that graph to eliminate anything not in StmtProg's body for that iteration. (3) Run TopoSort on a much smaller graph.

I think this second option is more efficient, however it seems that the filtered graph now looks like:

graphviz-ordering-filtered dot

Naturally, this doesn't work because we're missing the edge. Previously it's an indirect edge that goes through var(i0.f0).

So:

1) Is my concern about namespace/scope issues incorrect, and it's fine to squish it all together?

2) Is my graph filtering approach wrong? Should I instead be filtering out anything that's part of the connected graph? -- But if so, wouldn't that pull in incorrect things?

At this point I'm assuming there isn't a scope issue, but I don't have a test case to prove it yet.

All my code is sitting at feat/new-set-scope. Cheers!

gelisam commented 6 months ago

Is my graph filtering approach wrong?

Yes, very wrong. Very few edges go directly from a statement to statement, most edges go from an StmtBind to an ExprVar inside another statement. So if you delete the edges which go from an StmtBind to an ExprVar, you're not going to be left with enough information to decide the ordering.

Should I instead be filtering out anything that's part of the connected graph?

I don't understand what that means. There is only a single connected component in the first DAG, which part do you want to keep and which part do you want to delete?

Is my concern about namespace/scope issues incorrect, and it's fine to squish it all together?

I don't understand your concern, can you give an example? Are you thinking about shadowing? For example, in

$x = 1
if true {
  $y = $x
  $x = 2
}

I imagine that $y = $x might receive the var:x produce from $x = 1 and incorrectly add an edge from $x = 1 to $y = $x instead of from $x = 2 to $y = $x.

If you want a more principled way to calculate the scope without having to repeat your scoping logic in SetScope and in Ordering, I recommend using "lazy computations". A lazy computation is an object which stores a computation which needs to be done, plus some other fields. Running the computation within is called "forcing" the lazy computation. After the computation has been forced, the result is stored in one of the other fields, so that the next time the computation is forced, the result can be returned immediately without running the computation a second time. As a special case, a boolean also tracks whether the computation is currently being forced, so that if the lazy computation being forced asks to force this same lazy computation, we know that there is a loop and that the computation will never end.

The idea is that we know we want to call SetScope on all the statements at some point, but we don't know in which order we should do it yet. We want to do it in dependency order, but we will only discover that order as we execute the various SetScope calls on all the statements. So:

construct a lazy scope: a map from each variable/function/class name defined in the StmtProg to a lazy computation which calls SetScope on the Stmt which binds this name.
force each lazy calculation.
if you encounter an ExprVar named f0 during the execution of those calculations, you can lookup the lazy computation for that variable and force it before processing the ExprVar, thus ensuring that SetScope has been called on the definition site before you process the first ExprVar which refers to it. In practice, the first ExprVar will call SetScope on the definition site and the later ExprVars will use the result stored in the lazy computation's other field. ExprSingleton implements this idea for Graph.
if you encounter an ExprVar named i0.f0 during the execution of those calculations, you can first force the lazy computation for i0, and then lookup f0 in the result.

gelisam commented 6 months ago

So, what's the plan? Are you implementing this ticket while I switch to whatever's the next priority thing? Or am I taking over this ticket?

gelisam commented 6 months ago

I have looked at all 10 failing tests on your feat/new-set-scope branch, and in all cases I think the behaviour is correct, it's the expected output which is wrong. I thus think we should merge it, and consider lazy computations another time.

gelisam commented 6 months ago

And this versions of the above counter-example passes as well, so it doesn't even seem like we'd be leaving something on the table.

-- main.mcl --
$x = "one"
if true { 
  $y = $x
  $x = "two"
  test $y {}
}
-- OUTPUT --
Vertex: test[two]

purpleidea commented 6 months ago

So, what's the plan? Are you implementing this ticket while I switch to whatever's the next priority thing? Or am I taking over this ticket?

I was a bit side-tracked tinkering with some provisioning code. Was likely going to resume here tomorrow and figure out what's left.

10 failing tests

I thought the test we last looked at with the args coming in was not working correctly. (Our bug) but maybe with my Ordering changes it's okay now? I'll check that tomorrow and let you know!

purpleidea commented 6 months ago

I am cleaning up the tests and the code... This popped out (new test)...

$x = "i am x"   # i am top-level

class c2() {
    $z = "i am y and " + $x

    # Since $x is pulled in from top-level automatically, we don't allow the
    # re-definition by shadowing of the same variable.
    #$x = $x    # not allowed
    $tmpx = $x
    #$x = $x + "wow"    # allowed?
    $x = $tmpx + "wow"  # allowed!
}

include c2 as f1

test $f1.z {}
test $f1.x {}   # tricky
test $f1.newx {}

None of these work because they see the ordering DAG as circular! Woops. So shadowing is not working properly. So basically this means I can't use any variable in the parent scope. I think that blocking $x = $x is okay though.

purpleidea commented 6 months ago

(But this is actually not trivial... should we be able to shadow this? Hmmm....)

purpleidea commented 6 months ago

Secondly I have an ordering bug:

graphviz-ordering dot

interpret_test.go:893: test #86: could not set scope: variable f1.x2 not in scope

This happens some of the time. Presumably because the topo sort of the dag can be traversed two ways.

-- main.mcl --
$x1 = "i am x1" # i am top-level
$x2 = "i am x2" # i am top-level

class c2() {
    $z = "i am y and " + $x1

    $x1 = "hey" # shadow
}

include c2 as f1

test $f1.z {}
test $f1.x1 {}

# the really tricky case
test $f1.x2 {}

-- OUTPUT --
Vertex: test[hey]
Vertex: test[i am x2]
Vertex: test[i am y and hey]

So I need to fix that, any ideas?

gelisam commented 6 months ago

$z = "i am y and " + $x
$tmpx = $x
$x = $tmpx + "wow"
None of these work because they see the ordering DAG as circular! Woops. So shadowing is not working properly.

$tmpx depends on $x which depends on $tmpx. Seems circular to me. Since the $x = "i am x" at the top is shadowed by the $x inside the class, this code is circular with or without $x = "i am x", right?

#$x = $x  # not allowed
#$x = $x + "wow"  # allowed?

I can't imagine any sane semantics in which one would be allowed but not the other. What's the difference?

gelisam commented 6 months ago

# the really tricky case
test $f1.x2 {}

Remember, while we paired, I pointed out that the naive implementation of include as would allow this, even though this is clearly not what we want, and we discussed a future implementation using two separate scopes which would successfully reject this. But you want to allow it?? Why?

gelisam commented 6 months ago

Btw the reason $f1.x2 is sometimes allowed is because there is no dependency from $f1.x2 to $x2 = .... There is a dependency from $f1.x2 to include c2 as f1, thanks to prefix:f1. And there is a dependency from include c2 as f1 to class c2 {...}. But there is no dependency from class c2 {...} to the variables which do not occur in c2. Adding a dependency to c2 based on how f1 is used rather than on c2 itself seems like a bad idea.

purpleidea commented 6 months ago

$tmpx depends on $x which depends on $tmpx. Seems circular to me. Since the $x = "i am x" at the top is shadowed by the $x inside the class, this code is circular with or without $x = "i am x", right?

Yeah, I'm dumb for forgetting that since $x is declared (shadowed) in the child scope, that it's circular with itself there! Woops! NOT_A_BUG.

...

    #$x = $x    # fine with me if it must be the case, but I don't _really_ care.
    #$x = $x + "wow"    # important to be able to do this.

I can't imagine any sane semantics in which one would be allowed but not the other. What's the difference?

I see your point. What I was kind of going for, was that we're shadowing the top-level $x, but also exporting a new $x which is based on the old $x. This feels kind of important, since my earlier example shows it can't depend on a tmp variable. I've updated the comments in the commit.

Remember, while we paired, I pointed out that the naive implementation of include as would allow this, even though this is clearly not what we want, and we discussed a future implementation using two separate scopes which would successfully reject this. But you want to allow it?? Why?

It's really not harmful as long as it's not blocking the above example with the "important to be able to do this" comment. If it's easier for us code wise, we can leave it. If it makes things easier to remove it, so be it. If you feel strongly about one or the other, I can be persuaded!

I've pushed these updated changes to feat/new-set-scope https://github.com/purpleidea/mgmt/tree/feat/new-set-scope If you can patch this branch to fix the ordering thing and the shadow thing, it would be most helpful. We can pair on it if you think it would go faster that way.

In parallel (if you're curious) I wrote a neat compiler/sugar hack to automatically nest classes. None of or work should be affected by it, I just thought I'd show you for fun: https://github.com/purpleidea/mgmt/tree/feat/nested-class-sugar

Cheers!

purpleidea commented 6 months ago

For reference, tests that need looking into:

SHADOWING: import-scope-classes-5.txtar

ORDERING: (run a few times to get an eventual ordering failure) import-scope-classes-4.txtar class-include-as-class0.txtar

gelisam commented 6 months ago

Hmm. Are you sure you really need to support the "important to be able to do" case? The semantics where

$x = 1
if true {
  $x = $x + 1
  $y = $x  # 2
}

makes sense is called "non recursive bindings", and in that world definitions must be ordered, you can have this:

$x = 1
$x = $x + 1
$y = $x  # 2

but not that:

$y = $x  # 1
$x = 1

The semantic in which definitions can appear out of order is called "mutually-recursive bindings" and in that world $x = $x + "wow is a loop regardless of the other bindings of $x.

gelisam commented 6 months ago

Do you also want

class c {
  include c {...}
}

and

include i.c as i

or is it only variables which get the crazy semantics?

gelisam commented 6 months ago

(I edited the examples in the previous comment, make sure to look at the GitHub version, not the email version)

purpleidea commented 6 months ago

I'm cleaning up the current branch as per our discussions. Since I didn't reply above, just for reference:

class c {
  include c {...}
}

This looks like recursive classes, and is already handled (and forbidden) in the code. Whether it would make sense is a different discussion, but I suspect probably not.

include i.c as i

or is it only variables which get the crazy semantics?

heh, I don't see any reason why the above should be allowed or why it's necessary. The reason it's not is that if you've defined a class in a particular scope, then this is effectively overwriting that identifier. You have control over this where as for variables you might not have full control. At least I mostly see things this way for now.

purpleidea commented 6 months ago

I've merged our code to git master! \o/ The corner cases are mentioned in the tests for a future patch. Thanks again!

purpleidea commented 6 months ago

For future memory, related tasks here:

Improved Ordering AST function. (A test showing a bug first would be useful.)
Potentially removing higher-scope captured variables being exposed in a child scope.
$x = $x + "foo" redefinition in a child scope when $x exists in the parent w/o cycles. (Sam magic keyword idea being implicit)

purpleidea commented 6 months ago

TODO: For james to fixup and merge: bug/ordering local branch.

purpleidea / mgmt

Corner cases of `include class as` #726