Closed micahjsmith closed 5 years ago
Hi @micahjsmith! You're right that the only way to gather a function right now is to assign its return value to a variable or to display the output of the call, and gather from the variable or output.
I'm open to the idea of gathering on a function, with a bit more context. Can you propose an interaction flow that would work in your case? Are you thinking like, right click on a function? Or have the function defs highlighted like the variables? And what do you want to happen with the gathered code---just what's already happening?
A few examples of functions you want to gather in existing notebooks and the code you'd want gathering could also be helpful to us making sure the slicer would handle 'em appropriately.
Whether we integrated this would also depend on if we thought functions were a generally useful thing to gather for other users. Do you have a sense of that from your own work?
@micahjsmith and I discussed this topic further over remote chat. I think what we decided was that this functionality for this could be put into another plugin.
Meanwhile, it could be useful to make nbgather
extensible to help others build plugins like that. What if we could add an interface that lets others register new gather commands? When registering a new gather command, the caller could provide a command title, an icon, and a callback that will get triggered and provided with the gathered cells. I'm opening this suggestion in a separate issue.
I've now dived a bit into the parser/slicer and hope I can clarify what I meant with this issue originally.
It seems that after a cell is executed, the controller will cause "defs" in the cell to be highlighted and those editor defs are added to the model. These defs are now selectable/available to be gathered. I see at https://github.com/microsoft/gather/blob/master/src/overlay/gather-markers.ts#L351 that defs in the parsed cell program are filtered to see if they are SymbolType.VARIABLE
. So a statement like a = lambda x: 1
is a "def" of symbol a which is a variable.
In contrast, in a statement like def a(): return 1
, symbol a is marked as SymbolType.FUNCTION
and is not added to the model as an available EditorDef.
What I am proposing is also marking defs of type SymbolType.FUNCTION
and SymbolType.CLASS
as EditorDefs available to be gathered.
The intuition is that if I can define a function or class, and then assign that object to a variable, which can then be gathered, then the original object should also be available to be gathered.
Can you propose an interaction flow that would work in your case?
Yes, treat function and class symbols identically to variable symbols. The identifier would be highlighted and would be available to be gathered.
A few examples of functions you want to gather in existing notebooks
One example is what I posted in the original post on this thread where function def f
should be gatherable even if it is not assigned to a new variable.
Whether we integrated this would also depend on if we thought functions were a generally useful thing to gather for other users. Do you have a sense of that from your own work?
For sure, suppose I am working on a data analysis project in my notebook that produces several figures and tables as output. Now I want to distribute my analysis in a reproducible manner. I would go about this by creating a new function main
:
def main():
make_figure_1()
make_figure_2()
make_table_1()
I now would gather this function (by selecting the identifier at point of definition) to a new script, add a cursory if __name__ == '__main__': main()
, and thanks to the magic of gather, I can share this script which can reproduce my results when run at the shell.
So ultimately unrelated to the issue of different backends :) but I know you have already closed this issue -- not a problem. I would hope that this change is actually not very involved, so perhaps if I am able to get a patch working for it I would ask if you wanted to reopen the issue and review a PR.
Hey @micahjsmith thanks so much for this detailed write-up! I appreciate the use case and agree that the use case of creating a cursory main and gathering it up sounds pretty slick.
I would hope that this change is actually not very involved, so perhaps if I am able to get a patch working for it I would ask if you wanted to reopen the issue and review a PR.
Sounds good to me, have at it! I think it should be a pretty straightforward fix, though let me know if you could use any pointers or a chat.
By the way, did you know that you can shift+click multiple results and in that way gather and merge together the code that produced multiple figures? Would that handle your use case above, or are you thinking that the preferred way to express that in your case is to define a parent function of the important results and gather that function?
Can functions be gathered from the point they are defined?
In the following example,
f
cannot be gathered by clicking on the function name or on the cell in which it is defined.However, if I now assign
f
to a variable, that variable can gathered, which collectsf
anda
as well as expected.Desired behavior is that after a function definition is evaluated, the function name (i.e.,
f
) is highlighted indicating it is available for gathering.