Open kostmo opened 2 months ago
Annotating each Term
with the closest function definition sounds tedious. It should be easier than that, without requiring any changes to terms: in the CESK machine, every time we look up a variable in order to evaluate it, we can set the "current function" to the name of the variable, and push a special stack frame that remembers what the previous "current function" was; when we get back to one of those stack frames, restore the old value.
every time we look up a variable in order to evaluate it
Is that done on this line?
we can set the "current function" to the name of the variable
I'd want to distinugish between variables that refer to "toplevel function definitions" vs. variables that refer to local definitions. Is that distinction tracked in the Term
type?
Is that done on this line?
Hmm, yes, that's the line I had in mind, but now that I think about it, this won't work. When we look up a variable in the environment, it is already evaluated; at some later point we may execute it (if it is a command) or reduce it (if it is a lambda that ends up getting applied to an argument) but in any case that happens at some later point unrelated to when the variable lookup happened.
Maybe annotating all terms would actually be the right way to accomplish this...
Regardless of how we accomplish it, I do worry that this feature might be difficult to maintain if we start doing more code optimization, see e.g. #1563 .
I'd want to distinugish between variables that refer to "toplevel function definitions" vs. variables that refer to local definitions. Is that distinction tracked in the
Term
type?
It is not, but I don't think it would be too hard to enhance Env
so it tracks additional information about each variable.
IIRC, the Syntax'
type is a wrapper of the Term
type, and is discarded once we are running the CESK machine
? If that were not the case, then at dialog display time we could infer the toplevel function name from the SrcLoc
of the term.
Terms have their type annotations erased before putting them in the CESK machine, but I think they still have SrcLoc
s. Or if they don't, we could easily change it so the SrcLoc
s are left in, I think.
at dialog display time we could infer the toplevel function name from the
SrcLoc
of the term.
Not unless you keep the original source code around somewhere. And keep in mind that function definitions in scope could also come from some file that was run
/imported. We could extend SrcLoc
s to contain the file name where it came from (which we should probably do anyway), but you would need to keep all the original file contents in a map or something so you could figure out the toplevel function name corresponding to a given SrcLoc
. In any case that seems like a kind of expensive and redundant way to do it --- why bother re-parsing through the original source code to find the nearest enclosing def
when we already parsed it into an AST in the first place?
Thinking about this a bit more, as a user of this functionality there should be clarity around what it means to be "in a function", or what is the "currently executing function".
Potentially one way to render the information would be as a "call stack". However, given TCO and infinite recursion as standard practice, a "stack" representation is less accurate/useful.
I think, as a developer, the most useful thing we could get out of this is a trail of "breadcrumbs" that indicate when a function of cmd
type has been invoked, leaving aside any notion of when the function has "exited". This helps obtain insight into operation of a "state machine" where each toplevel function represents a different state.
To implement this, we would could have a Label
primitive (of type Text -> Cmd Unit
) that is automatically injected into the code as the first child of an effectful function, similar to Noop
. Then we would maintain a rolling buffer of the last N
encountered Label
s for each robot.
To implement this, we would could have a
Label
primitive (of typeText -> Cmd Unit
) that is automatically injected into the code as the first child of an effectful function, similar toNoop
. Then we would maintain a rolling buffer of the lastN
encounteredLabel
s for each robot.
That sounds like a nice idea in theory. I agree the notion of a "call stack" or "currently evaluating/executing function" might be a bit ill-defined. I note, however, that inserting a label
primitive "as the first child" of an effectful function is not necessarily trivial, especially when higher order functions are involved. For example:
def on : Int -> Cmd Unit -> (Int -> Cmd Unit) = \x. \c. \y. if (x == y) {c; c; c} {} end
def f : Int -> Cmd Unit = on 3 move end
f
is clearly an "effectful function" (i.e. it has a type of the form ... -> ... -> Cmd a
) but it is unclear where we could insert a label
primitive in the definition of f
. We can't insert it before the call to on
since the result of on
has a function type, not a command type. We can't really insert it before move
, like on 3 (label "f"; move)
, because then the label
would be executed either 3 times or no times at all.
Sometimes to achieve this I have placed a say "function_name"; at the top of each function in a robot's program (i.e. "printf debugging") to accomplish this. But it would be nice to have an automatic mechanism.
How about adding a __func__
instead?
def trace : Cmd a -> Cmd Unit = \c.
try {
log ("STARTED " ++ __func__); c; log ("FINISHED " ++ __func__)
} { log ("FAILED " ++ __func__) }
end
def on : Int -> Cmd Unit -> (Int -> Cmd Unit) = \x. \c. \y. if (x == y) {c; c; c} {} end
def f : Int -> Cmd Unit = \i. trace (on 3 move i) end
EDIT: We could add label
as a command to show text in the robot panel and make it usable for other things as well.
Regardless of how we accomplish it, I do worry that this feature might be difficult to maintain if we start doing more code optimization, see e.g. https://github.com/swarm-game/swarm/issues/1563 .
Bonus points for if (__debug__)
. 😄
But implementing __func__
would be just as difficult.
I meant it to be set to the name of the definition where it is used, but I did not realize that __func__
would always be trace
in my code. 😄
If the trace
was inlined (like a macro) before replacing __func__
, it would work as intended. 🙂
If the
trace
was inlined (like a macro) before replacing__func__
, it would work as intended. 🙂
Ugh, that sounds horrid, because the meaning of __func__
then cannot be determined locally, it depends on the specifics of how inlining happens. I really want to stick with the nice, elegant semantic model we have and not go mucking about with macros, preprocessors or anything like that. (To be clear, I have no problems with inlining itself as an optimization technique, as long as it does not change the meaning of the program!)
Even without macros, having __func__
would be easy to copy-paste and not break when renaming the definition.
This would automate part of @kostmo's proposal and allow the user to decide what to do with that name.
@byorgey I guess we could replace this identifier after parsing so that it would be comparably easy to implement.
While developing a scenario, it would be useful for debugging to be able to see which function a system robot is currently "in" as it executes its program.
Sometimes to achieve this I have placed a
say "function_name";
at the top of each function in a robot's program (i.e. "printf debugging") to accomplish this. But it would be nice to have an automatic mechanism.I imagine it is feasible:
Term
with the closest "function definition" that contains it.machine
and display the function name for the currently focused term in the F2 dialog.