microsoft / gather

Spit shine for Jupyter notebooks 🧽✨
https://microsoft.github.io/gather
MIT License
532 stars 38 forks source link

Function and class definition textSliceLines are incorrectly computed #31

Closed joyceerhl closed 5 years ago

joyceerhl commented 5 years ago

Describe the bug Originally reported in #23 for classes, but predates #28 The line after a class or function definition is included twice in the gathered program.

Here's an example illustrating the problem for funcdefs after executing the following program as a single cell in Jupyter:

def foo():
    print("Hello") 
def bar(): 
    foo()
bar() 

The gathered program looks like this:

def foo():
    print("Hello")
def bar():
def bar():
    foo()
bar()
bar()

Here's another example illustrating the problem for classes:

class Foo():
    class Bar():
        pass
v=1
Foo().Bar()

The gathered program looks like this:

class Foo():
    class Bar():
        pass
v=1
v=1
Foo().Bar()

Additional information I'm somewhat out of my depth with debugging the parser, but it seems like the problem lies with the ILocation object that the parser returns, since for class and funcdefs, loc.last_column === 0 and loc.last_line is the true last line incremented by one. textSliceLines is then incorrectly computed based on the values of last_column and last_line in cellSlice.ts--in this case, lines 3 and 5 are included twice. I've added unit tests that illustrate this behavior here.

If I'm not mistaken, this means a fix will involve modifying the parser to return an ILocation object whose last_column and last_line properties have accounted for the dedent. Alternatively, the cellSlice.getTextSlice logic could be updated.

rdeline commented 5 years ago

Thanks for catching this bug and a quick PR to fix it. I looked at the original code and the fix and found the whole method kind of confusing. I just submitted a version that's a little cleaner (I think). It didn't break any tests, so hopefully it's backward-compatible.