microsoft / gather

Spit shine for Jupyter notebooks 🧽✨
https://microsoft.github.io/gather
MIT License
532 stars 38 forks source link

Ensure parser returns empty program when parsing comments #27

Closed joyceerhl closed 5 years ago

joyceerhl commented 5 years ago

When cell.text is a comment-only string, or when rewriting magics results in a comment-only string, ast.parse() returns an empty string: https://github.com/microsoft/gather/blob/0ad10fd2c4ecd01dc4857c8e779b6c01ded2e00f/src/analysis/slice/program-builder.ts#L90-L92

ProgramBuilder.add()then throws a TypeError when we try to walk the tree:

Couldn't analyze block <comment-only code> , error encountered,  TypeError: tree.code is not iterable

And in python-parser.ts, walkRecursive logs the following error to the console:

Node undefined. Ancestors:

This doesn't crash the extension because it's surrounded with a try/catch, but I'm submitting a PR for this because it pollutes logs whenever we execute magics-only or comment-only cells otherwise.

andrewhead commented 5 years ago

Sweet, good find! This makes me realize---isn't it weird that the parser doesn't return a program when it parses a comment? Shouldn't it? Could we just rewrite the grammar to make sure it returns an empty program when all it has is a line comment?

joyceerhl commented 5 years ago

The parser does return an empty string when all it has is a line comment, so in line 91,

let tree = ast.parse(this._magicsRewriter.rewrite(cell.text) + '\n') // Here tree === empty string
statements = tree.code // Attempting to access tree.code throws a TypeError

We then attempt to walk an empty string, and the walkRecursive method logs a console error: https://github.com/microsoft/gather/blob/0ad10fd2c4ecd01dc4857c8e779b6c01ded2e00f/src/analysis/slice/program-builder.ts#L94

I suppose we could just not attempt the walk if tree === ''? Whichever makes more sense :-)

andrewhead commented 5 years ago

So cool! I'm thinking a robust fix for this is to make sure that a tree's code is always set to an empty list (to mean, semantically, "no statements" rather than "null tree"). Could we try that? What I'm thinking is setting the fileinput: EOF rule in the grammar (here), to have this return value:

{ $$ = { type: 'module', code: [], location: @$ } }

And then also adding a unit test to https://github.com/microsoft/gather/blob/master/src/test/parser.test.ts to make sure that the parsed code of the parse tree is an empty array. What do you think of that fix?

joyceerhl commented 5 years ago

Could we just rewrite the grammar to make sure it returns an empty program when all it has is a line comment?

Sorry, just realized I misunderstood what you meant by 'empty program' 😅 I've implemented the suggested fix and added a unit test.

andrewhead commented 5 years ago

Looks great!