microsoft / gather

Spit shine for Jupyter notebooks 🧽✨
https://microsoft.github.io/gather
MIT License

Dependencies of class properties are not gathered #23

Closed joyceerhl closed 5 years ago

joyceerhl commented 5 years ago

Describe the bug: When a function references class properties set either in the class constructor or in class methods, the dependencies of those properties are not gathered.

To Reproduce (steps to reproduce the behavior):

  1. Build and run Gather extension.
  2. Upload this notebook to Jupyter.
  3. Execute all cells.
  4. Try gathering the outputs for each case.

Expected behavior: Dependencies such as variable declarations and module imports should be gathered.

Screenshots: [image: gather_setstate_deps]

andrewhead commented 5 years ago

Hi @joyceerhl! Thanks for pointing out this bug and for the helpful reproduction :-)

I think I've figured out the issue here: the dataflow code needs to be extended to look for uses of names within classes. One fix for this exact case would be to fill in the empty case for ast.CLASS here to collect the names used by all defs (update: all defs and variable definitions) in the class (e.g., by calling getUses on all of the class's functions).
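To make the idea concrete, here is a minimal sketch of "collect the names used inside a class". It is not Gather's implementation (which lives in its own dataflow code and uses getUses); it uses Python's standard ast module instead, and the helper name names_used_in_class and the sample source are hypothetical, chosen only for illustration.

```python
import ast


def names_used_in_class(source, class_name):
    """Return the set of names that are read anywhere inside the named class.

    Illustrative only: this walks the class body (methods and class-level
    assignments alike) and keeps every ast.Name node in a Load context.
    """
    tree = ast.parse(source)
    for node in ast.walk(tree):
        if isinstance(node, ast.ClassDef) and node.name == class_name:
            return {
                n.id
                for n in ast.walk(node)
                if isinstance(n, ast.Name) and isinstance(n.ctx, ast.Load)
            }
    return set()


# Hypothetical notebook cell: the class depends on `math` and `x`,
# both in the constructor and in a class-level variable definition.
EXAMPLE = """
import math
x = 2
class Foo:
    scale = x
    def __init__(self):
        self.value = math.sqrt(x)
    def bar(self):
        return self.value * self.scale
"""

used = names_used_in_class(EXAMPLE, "Foo")
print(sorted(used))
```

A dataflow analysis that treats these names as uses of the class definition would then pull in the import and the assignment to x when the class is gathered. A real implementation would also need to ignore names bound locally, such as self.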

I can take a go at it, though it will take me a couple more days. I'm also happy to walk you through the process of updating it if you've got some time on your hands and are feeling adventurous ^_^

andrewhead commented 5 years ago

By the way, wanted to thank you again for helping us improve the parser!

joyceerhl commented 5 years ago

@andrewhead I can take a crack at it! :-)

andrewhead commented 5 years ago

Sweet! Do you want to set up a remote call to talk through any of this, or just take a go and follow up as any questions come up? Lmk as I'm happy to help!

andrewhead commented 5 years ago

@joyceerhl Great pull request! Do we want to keep this issue open (i.e. is there any other refinement of that functionality we should keep in mind for the near future), or is this good to close?

joyceerhl commented 5 years ago

I just found a related bug while testing Gather on classes:

# Cell 1
def func():
    pass
class Foo():
    def bar(self):
        func()
# Cell 2
Foo().bar()

If you gather the output of Cell 2 to a notebook, the resultant output contains class Foo(): twice:

# Cell 1
def func():
    pass
class Foo():
class Foo():
    def bar(self):
        func()
# Cell 2
Foo().bar()

Edit: I should also mention that this is not a problem if you define func and Foo in two separate cells like so:

# Cell 1
def func():
    pass
# Cell 2
class Foo():
    def bar(self):
        func()
# Cell 3
Foo().bar()

I only noticed this after we merged #28, which is why I'm reporting it here, but this could also be a bug in how sliceLocations are computed. Would love to get your thoughts on this one!
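One quick way to confirm the symptom described above is to check whether the gathered cell still parses. The sketch below uses only Python's standard ast module; the helper name is_valid_python and the two string constants are hypothetical and simply copy the cells from this comment.

```python
import ast

# Gathered output as reported above, with the duplicated class header.
GATHERED = """\
def func():
    pass
class Foo():
class Foo():
    def bar(self):
        func()
Foo().bar()
"""

# What the gathered output was expected to look like.
EXPECTED = """\
def func():
    pass
class Foo():
    def bar(self):
        func()
Foo().bar()
"""


def is_valid_python(source):
    """Return True if the source parses, False on any syntax error."""
    try:
        ast.parse(source)
        return True
    except SyntaxError:  # IndentationError is a subclass of SyntaxError
        return False


print(is_valid_python(GATHERED))
print(is_valid_python(EXPECTED))
```

The first header has no indented body, so the duplicated output is rejected by the parser, while the expected output parses cleanly. A check like this could serve as a regression test for gathered slices.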

joyceerhl commented 5 years ago

Closing this issue, as #31 is a separate problem and the class gathering functionality seems complete for now!