nikitakit / hydrogen-python

Python-specific plugin for hydrogen. Make Python coding in the Atom editor even more interactive!
https://atom.io/packages/hydrogen-python
MIT License
54 stars 20 forks source link

read executed line to get indentation level and expand code #10

Closed mbroedl closed 6 years ago

mbroedl commented 6 years ago

Should solve #3 in line with what @nikitakit mentioned in nteract/hydrogen#862. It reads the first line of the selection and appends the indentation to the code passed. This removes both limitations mentioned in the README.

Further solves nteract/hydrogen#1341 and nteract/hydrogen#1344 I believe. However, I couldn't find a way to move the tick bubble to the correct line (except for adding a watch and moving it by cheating). It expands the code line by line if certain (configurable) keywords are found. Maybe you have an idea about the tickbox?

kylebarron commented 6 years ago

This looks quite good. Especially the if, elif, else support.

I'll use this for a bit and let you know if I find any bugs.

kylebarron commented 6 years ago

Here's a bug. Put your cursor anywhere on the line x = 2 and run Hydrogen: Run And Move Down/Shift+Enter. It works with Hydrogen:Run though. I'm not sure what the difference is.

x = 2
if x == 1:
    print('a')
else:
    print('b')
# File "<ipython-input-17-c80c1a5b0b20>", line 2
#     print('a')
#     ^
# IndentationError: unexpected indent

Also, I'm not sure if it's possible, but it would be great for Hydrogen: Run And Move Down to move the cursor to the end of the if/else statements.

Before: image After: image

mbroedl commented 6 years ago

@kylebarron thanks for testing!! I have to admit I never used the run and move down before. run executes but stays where you had your cursor before, whereas the former moves the cursor immediately to the end of the called code.

So I checked, and it seems like the cursor is moved before hydrogen-python is called which messes up the knowledge of the plugin where it is. That's why the run and move down messes up: it executes x=2 then thinks it is in the if line already (bc the cursor is moved down, and it checks for the line number) and adds the print statement afterwards. — I could instead match the executed code in the editor, but certainly that would be a bit more hacky. :/

It would be nice if there was a way to intercept the events fired to hydrogen, so as to know what key combination/command was used... I shall have a look at this. Which action (run / run + move down) was called is crucial to knowing when to move the cursor further. But it shouldn't be too difficult.

mbroedl commented 6 years ago

@kylebarron The run and move down behaviour should be fixed now. Unless you have a selection range and want to run and move down, in which case the script understands it as run and move down without selection as the selection is not preserved.

So the displayed take of your example results in 'b' and moves down to the print(x) line. To avoid that would mean to build a complete wrapper around hydrogen... and admittedly, the cases where you would want to execute the if part of a loop and not the else part but then have your cursor at the else part to execute it immediately afterwards might be limited. At least that's what I hope, and thus that this limitation is okay... image

I haven't trialled the other run-options yet.

kylebarron commented 6 years ago

Am I missing something or is this a regression? It no longer runs the entire if/else clause: I'm using the base run here.

peek 2018-07-27 12-12

mbroedl commented 6 years ago

In my atom it works. Are you sure its the right package loaded? e.g. are you in dev mode in case you installed it to dev mode?

kylebarron commented 6 years ago

Oh sorry, yesterday I disabled then uninstalled the package. Today when installing it again it stayed disabled. So I just had to enable it in the settings again. I'll let you know if I find any other bugs

kylebarron commented 6 years ago

Another error:

x = 2
if x == 2:
    print('a')
    print('b')
else:
    print('c')

If you put your cursor anywhere on the if x == 2 line, and hit Run and Move Down, I get:

Uncaught TypeError: Cannot read property 'replace' of undefined

At /home/kyle/.atom/packages/hydrogen-python/lib/main.js:110

TypeError: Cannot read property 'replace' of undefined
    at PythonKernelMod.execute (/packages/hydrogen-python/lib/main.js:110:58)
    at MiddlewareAdapter.execute (/packages/Hydrogen/lib/kernel.js:149:24)
    at Kernel.execute (/packages/Hydrogen/lib/kernel.js:287:33)
    at Object._createResultBubble (/packages/Hydrogen/lib/main.js:391:12)
    at Object.createResultBubble (/packages/Hydrogen/lib/main.js:351:12)
    at Object.run (/packages/Hydrogen/lib/main.js:447:12)
    at HTMLElement.hydrogenRunAndMoveDown (/packages/Hydrogen/lib/main.js:108:50)
    at CommandRegistry.handleCommandEvent (/usr/share/atom/resources/app/src/command-registry.js:384:49)
    at KeymapManager.module.exports.KeymapManager.dispatchCommandEvent (/usr/share/atom/resources/app/node_modules/atom-keymap/lib/keymap-manager.js:621:22)
    at KeymapManager.module.exports.KeymapManager.handleKeyboardEvent (/usr/share/atom/resources/app/node_modules/atom-keymap/lib/keymap-manager.js:412:28)
    at WindowEventHandler.handleDocumentKeyEvent (/usr/share/atom/resources/app/src/window-event-handler.js:110:40)
kylebarron commented 6 years ago

Another separate error. If there's any indented whitespace on a line inside an if/else block, it fails with a Python IndentationError.

if x == 2:
    print('a')
    print('b')

# 4 spaces ^
#
# File "<ipython-input-17-42c9438ba923>", line 2
#     print('a')
#         ^
# IndentationError: expected an indented block
kylebarron commented 6 years ago

Also fails when the if block is already indented:

class test(object):
    def __init__(self, arg):

        if x == 2:
            print('a')

        else:
            print('c')

        # Error when line after print('a') has 4, 8, or 12 spaces
kylebarron commented 6 years ago

It also doesn't work generally if there's any space between the if and else clauses.

x = 2
if x != 2:
    print('a')

else:
    print('c')

With Run on the if line, I just get the check from Python saying None. With Run and Move down on the if line, I get IndentationError, even when the line between if and else is not indented at all.

kylebarron commented 6 years ago

I'm worried that any solution will be hacky and will have myriad edge cases that don't work.

nikitakit commented 6 years ago

@mbroedl Thanks for taking a look at the indentation level issues!

In addition to some of the edge cases that @kylebarron pointed out, I also want to mention the "run cell and move down" command (there's also plain "run cell" variant). Do you think you can get your approach to work with the full range of hydrogen commands? If you're having trouble figuring out which command triggered the code running, maybe something in the command-history package or other atom packages for tracking command execution can be helpful.

Even if the changes here only work for a particular workflow (e.g. one that avoids using certain commands), it may still be worth shipping them but they'd need to be hidden behind a configuration flag. I want to avoid users encountering these issues and then reporting them to the main hydrogen bug tracker. I think this has happened a few times already with bugs in code I wrote.

Alternatively, might you be interested in modifying hydrogen a bit to get a cleaner solution? It should be possible to have the "code manager" file in hydrogen expose some options/callbacks to the plugin API, so that the plugin can operate on full information about which code is being run instead of having to piece things together after-the-fact.

mbroedl commented 6 years ago

@kylebarron Thanks a lot for the elaborate bug reports! I shall have a look at them soon. Apart from the first one, all are indeed a bit more hacky. — You're totally right about the edge cases... but then anything that works better than now is an advancement.

@nikitakit Is there any documentation what the run-cell commands should do? I could not quite figure that out yet? — I totally understand your concerns about the config and bug trackers. Couldn't agree more. I believe indeed that the best solution would be to allow a plugin to intercept the code manager, simply because this way the response-bubble would also move, which would be one of the big downsides of any external solution. I could imagine that a simple check here: https://github.com/nteract/hydrogen/blob/master/lib/code-manager.js#L313 Something like

if (plugins[grammar].findCodeBlock) {
    return plugins[grammar].findCodeBlock(editor, row);
}

But I'm probably overlooking a lot of complexity here? The other option would be to have the plugin pass on some kind of regex or function to identify when to continue code expansion to the bottom, and regexes how to identify cells? Not sure if that would be a use case for other languages? Further, I would assume it wouldn't do any harm to remove an equal amount of indentation from every line in hydrogen directly? — Too many questions, haha.

kylebarron commented 6 years ago

It would be a huge step forward to fix these issues that Hydrogen has with Python. However with Hydrogen proper, I always know that there's a way for me to run a specific section of code, and when unexpected bugs occur with hydrogen-python, it's easiest to just turn the extension off. With if/else blocks within a function, that might mean I have to select the entire block, fully dedent it, and then run it. But it'll always work with Hydrogen proper.

There's an open issue about removing an equal amount of indentation from all lines here: https://github.com/nteract/hydrogen/issues/862. Now that I read the thread again, I think a solution along the lines of https://github.com/nteract/hydrogen/issues/862#issuecomment-308121753 would be ideal and such a PR could be accepted. I may try to delve into that and see how hard it would be.

kylebarron commented 6 years ago

@mbroedl You may want to look at my PR for Hydrogen related to common leading whitespace: https://github.com/nteract/hydrogen/pull/1363

I spent hours on a relatively in-depth solution. Then I discovered that removing 7 characters would fix it.

mbroedl commented 6 years ago

@kylebarron thanks for taking the initiative. — As said the only problem is now: where to intercept with a possible API, as hydrogen-python can not anymore guess the indentation level. Also, one would want to strip the whitespace again after the expanded code is selected. I'm not quite familiar about the order of calling though.

kylebarron commented 6 years ago

Sorry if I confused you... I meant to say that instead of the in-depth solution of trying to remove common whitespace, if I just take off the .trim() and don't remove any common whitespace, then everything works fine. The PR is not the branch in my second link above.

The reason why the current Hydrogen fails is because it removes the leading whitespace from the beginning of the selection but nowhere else.

kylebarron commented 6 years ago

My idea for expanding a selection based on if/elif/else is:

  1. Get location of cursor. Note what buffer column it's in.
  2. Find next buffer row where there is text before or on the same column as above.
  3. If the text is in a prior column, stop.
  4. If the text is in the same column, check if the next four characters are elif or else, in which case keep searching, recursively in the case of elif.

This is my idea for what might support indented if/else clauses as well.

I might first try to implement this in a my init script, having it create the full selection that I would then send to Hydrogen.

mbroedl commented 6 years ago

@kylebarron Thanks for sharing your thoughts. Are you thinking of creating a dynamic cursor selection before sending off the command to hydrogen? Make sure to ignore lines with comments or white space even when they have less indentation, etc. This chunk of the PR has something similar (slightly modified):

      let nextRowText = e.lineTextForBufferRow(lastRow + 1);
      const expandCode = [' '].concat(atom.config.get('hydrogen-python.expandCode')).join('|');
      while (nextRowText.match(new RegExp(`^(${indent}(${expandCode})|\s*#|\s$)`))) {
          code += `\n${nextRowText}`;
          lastRow += 1;
          nextRowText = e.lineTextForBufferRow(lastRow + 1);
      }

With the default for the setting being: ["else", "elif", "\\}", "\\]", "\\)"]

kylebarron commented 6 years ago

I was just describing creating a cursor selection before sending off to Hydrogen because I'm not yet familiar with the Hydrogen extension APIs.

That chunk does look similar to what I had in mind.

Are there changes you want to make to this PR given the changed functionality in Hydrogen?

mbroedl commented 6 years ago

The change actually caused some trouble because the right trim was gone, and it took me a while to realise. The first error you mentioned above might have been related to the code including the last line of the buffer? This is now solved, as well as all the other cases you mentioned.

I believe it works well now!

I included a default=false setting to enable code expansion. So make sure you tick that box before testing.

Imo, the next step would be now to fiddle into the hydrogen API that the plugin is asked what code execute before it is executed.

kylebarron commented 6 years ago

Yes, there was some discussion whether to keep the right trim or not, but the merge removed the right trim as well.

So the plugin isn't currently able to change the code before executing? At what point does it get control from Hydrogen?

mbroedl commented 6 years ago

Well, it changes the code before executing, but it doesn't seem to be able to tell hydrogen that the code has changed. Does that make sense?

Just tracing the plugin thing through the hydrogen files: The plugin seems to intercept as middleware here: https://github.com/nteract/hydrogen/blob/master/lib/plugin-api/hydrogen-kernel.js#L56 These are the interface options for middleware: https://github.com/nteract/hydrogen/blob/master/lib/plugin-api/hydrogen-types.js#L29 And here is the kernel (including calls to middleware) implemented: https://github.com/nteract/hydrogen/blob/ac02fa91c81b850642db948ed5787af2771dc4b3/lib/kernel.js

It seems as if the middleware intercepts right before sending the code to the kernel. Thus hydrogen keeps its internal idea of where to place the bubble. Ideally for our purpose, one could return the code and the range of it so that hydrogen knows where to place the bubble. But I couldn't find out how the kernel and the output (or line, even) are linked.

mbroedl commented 6 years ago

@nikitakit Now it stops executing the script when a hydrogen breakpoints occurs (by taking the syntax tokenisation), just as hydrogen core would. All the run-cell and run-cell-and-move-down commands work properly as well now, as far as I can see.

kylebarron commented 6 years ago
 const isBreakpoint = rownum => e.tokensForScreenRow(rownum).reduce((l, t) =>

Just curious, is tokensForScreenRow a TextEditor method? It seems not to exist in Atom's API documentation: https://atom.io/docs/api/search/v1.28.2?utf8=%E2%9C%93&q=tokensforscreenrow

kylebarron commented 6 years ago

I'll test out the new commits later. While it's on my mind, try/except would be another good candidate for expanding lines.

mbroedl commented 6 years ago

I think I have read somewhere that the API reference isn't quite up to date at all times...

The tokenizedBuffer used to be part of the displayBuffer and was explicitly labelled as unstable. But at some point moved into the editor namespace:

    // TODO: Remove this conditional once atom/ns-use-display-layers reaches stable and editor.tokenizedBuffer is available
 const tokenizedBuffer = editor.tokenizedBuffer ? editor.tokenizedBuffer : editor.displayBuffer.tokenizedBuffer

Source: https://github.com/atom/autocomplete-plus/blob/master/lib/symbol-store.js#L111

I noticed that the tokenizedBuffer was part of the editor in early June... there was a release in which that move happened, I believe. I would assume that the tokensForScreenRow came to the textEditor via the buffer? I'm not sure though, but there seem to be a number of token-functions in the editor namespace (I found this one via the console-autocompletion). Hydrogen uses the tokenizedBuffer which is neither documented in the API.

mbroedl commented 6 years ago

You should be able to test out try/except by adding except to the list in the settings? Will put it on the default list though. Are there any other things that we can expand?

kylebarron commented 6 years ago

@mbroedl

Maybe this is due to updating to 1.29, but I'm getting an exception from Atom: Uncaught TypeError: Cannot destructure property lineText of 'undefined' or 'null'.

Below is the full stack trace, but I find that I get this specifically when there's no more text at the original column level for the rest of the document. So it raises for

def test():
    # def __init__():
    #    return

but it doesn't raise for

def test():
    # def __init__():
    #    return

# extra characters

Atom: 1.29.0 x64 Electron: 2.0.5 OS: Ubuntu 18.04.1 Thrown From: hydrogen-python package 0.0.6

Stack Trace

Uncaught TypeError: Cannot destructure property lineText of 'undefined' or 'null'.

At /usr/share/atom/resources/app/src/text-editor.js:1255

TypeError: Cannot destructure property `lineText` of 'undefined' or 'null'.
    at TextEditor.tokensForScreenRow (/usr/share/atom/resources/app/src/text-editor.js:1255:41)
    at isBreakpoint (/home/kyle/github/atom_dev/hydrogen-python/lib/main.js:112:40)
    at PythonKernelMod.execute (/home/kyle/github/atom_dev/hydrogen-python/lib/main.js:117:15)
    at MiddlewareAdapter.execute (/packages/Hydrogen/lib/kernel.js:149:24)
    at Kernel.execute (/packages/Hydrogen/lib/kernel.js:287:33)
    at Object._createResultBubble (/packages/Hydrogen/lib/main.js:391:12)
    at Object.createResultBubble (/packages/Hydrogen/lib/main.js:351:12)
    at Object.run (/packages/Hydrogen/lib/main.js:447:12)
    at HTMLElement.hydrogenRunAndMoveDown (/packages/Hydrogen/lib/main.js:108:50)
    at CommandRegistry.handleCommandEvent (/usr/share/atom/resources/app/src/command-registry.js:384:49)
    at KeymapManager.module.exports.KeymapManager.dispatchCommandEvent (/usr/share/atom/resources/app/node_modules/atom-keymap/lib/keymap-manager.js:621:22)
    at KeymapManager.module.exports.KeymapManager.handleKeyboardEvent (/usr/share/atom/resources/app/node_modules/atom-keymap/lib/keymap-manager.js:412:28)
    at WindowEventHandler.handleDocumentKeyEvent (/usr/share/atom/resources/app/src/window-event-handler.js:110:40)
kylebarron commented 6 years ago

This originally was raised by running a large class from the class identifier and even with putting # extra chars at the end of the document, the exception is still raised with Run and Move Down and Run.

kylebarron commented 6 years ago

Thanks! I'll keep testing.

mbroedl commented 6 years ago

Thanks, @kylebarron . I could reproduce the error, put in a check for the last line when filtering for regex + breakpoints, and now the error disappeared. Let me know if there is anything more!

nikitakit commented 6 years ago

Thanks for keeping up with this, and I'm happy that the trim changes have been merged into hydrogen core!

I'll take a closer look at the code and try running it this weekend. I appreciate that you put this feature behind a default-false config, which should make it possible to ship this out faster and see what people think.

kylebarron commented 6 years ago

@mbroedl

I hit another "Uncaught TypeError: Cannot destructure property lineText of 'undefined' or 'null'.". This time it occurs specifically when running code that has been folded. So it works when my cursor is on the class line with the code

class test(object):
    def __init__(self, arg):
        super(test, self).__init__()
        self.arg = arg

    def helloworld():
        x = 2
        if x == 2:
            print('hello world')
        else:
            print('not 2')

    def test2():
        try:
            pass
        except Exception as e:
            raise

if __name__ == '__main__':
    main()

But it fails when the def lines are folded and my cursor is on the class line.

class test(object):
    def __init__(self, arg):
    # folded

    def helloworld():
    # folded

    def test2():
    # folded

if __name__ == '__main__':
    main()

I'm guessing that this has something to deal with using tokensForScreenRow. And then screen vs buffer coordinates might be off? It doesn't appear that there's a tokensForBufferRow function.

mbroedl commented 6 years ago

Thanks for the detailed report @kylebarron . No tokens for screen row, but this worked e.tokenizedBuffer.tokenizedLines[rownum].tokens, although then they are in the point syntax (foo.breakpoint) instead of in the css syntax (syntax--foo syntax--breakpoint).

I also added except and finally to the defaults. Looking at a list of keywords I think these four (else, elif, except, finally) should be all for which python continues on the same indentation level. Or did I miss anything?

kylebarron commented 6 years ago

I think that's all. The regex extensibility is great. I added plt\. so that I could run

plt.plot()
plt.xlabel()
plt.ylabel()
plt.show()

without having to select the lines.

mbroedl commented 6 years ago

Oh, that's sweet! I hope it won't ever surprise you! It might be a nice example in the README file?

kylebarron commented 6 years ago

Yeah it could be a nice user-defined example

nikitakit commented 6 years ago

One more note (doesn't have to be in this PR): if this feature were extended to handle regexes before the currently-selected code, as opposed to just after, it would also be able to properly attach decorators to functions. See https://github.com/nteract/hydrogen/issues/77 for details.

mbroedl commented 6 years ago

@nikitakit Thanks for the elaborate review. Will work on them next week!

kylebarron commented 6 years ago

Here's another bug: peek 2018-08-08 15-56

kylebarron commented 6 years ago

@mbroedl This might be the same bug. If I have my cursor at for and run Run and Move Down, it tries to add all the text in the selection apparently

        for (Token, in_line), log_line in zip(syn_chunks, log):
            # Split log_line on \r\n.
            if str(Token) != 'Token.MatchingBracket.Other':
                # Since I'm sending one line at a time, and since it's not a
                # block, the first line should equal the text sent
                # The assert is just a sanity check for now.
                log_l = log_line.split('\r\n')
                assert log_l[0] == in_line
                log_all.extend(log_l[1:])
                log_all.append('')

        log
        syn_chunks

    def export_graph(self):
        """
        NOTE: unclear whether I actually need to save all the graphs individually, or if I can just overwrite one. Since I'm just writing them to load them into Python, and the next graph shouldn't start writing until I'm done reading the previous one into Python, I can probably just overwrite the same file.
        """
        # Export graph to file
        cmd = 'qui graph export `"{0}/graph{1}.{2}"\' , as({2}) replace'.format(
            self.cache_dir, self.graph_counter, self.graph_format)
        self.do(cmd)

image

kylebarron commented 6 years ago

I forgot to mention... It happens specifically when I have exactly 2 empty lines between syn_chunks and def export_graph. When I have 1 or 3 empty lines between them it works fine.

The console thinks it only expanded by one line: image

mbroedl commented 6 years ago

@nikitakit I've worked through your comments and now it should be adjusted accordingly.

@kylebarron Thanks for the detailed bug report. After fixing the first one, I could not reproduce the second one,so I assume it's been fixed now as well!

mbroedl commented 6 years ago

@nikitakit The latest commit also includes decorators in the code expansion.

nikitakit commented 6 years ago

Just had a chance to test the feature out out today. Everything appears to work as advertised, and the decorator handling is a nice and useful addition!

I'm going to merge this now and put out a new release soon. Unfortunately I didn't get the chance to test this super-thoroughly because I don't have any active projects where I'm using hydrogen every day. I hope that you'd be willing to take a look if any github issues are filed after this feature is out!

I understand that there are some limitations because hydrogen doesn't provide the right API for this sort of feature. For example, having a selection and calling run-and-move-down will result in the selection going away. My view on this is there are diminishing returns to building increasingly complex workarounds, and any further effort might be better spent on improving hydrogen core to fix the core problem.

The two directions I'd consider exploring are:

  1. A plugin API that explicitly deals with modifying execution regions. That way hydrogen can directly provide which region in the buffer it intends to execute and why (run at cursor vs. run selection vs run cell), and a plugin can modify the start and endpoints of the executed region
  2. Using the new tree-sitter parsers. Currently hydrogen determines which code to run using Atom's code-folding APIs, which are based purely on indentation. The new tree-sitter parsers provide a syntax-tree view of the code, and there might be a language-agnostic manner to pull out code regions to execute in hydrogen itself.
mbroedl commented 6 years ago

Thanks for merging @nikitakit . Yeah, happy to keep a look at issues, in case there's some coming.

For example, having a selection and calling run-and-move-down will result in the selection going away.

I believe this is the default behaviour of hydrogen, rather than hydrogen-python.

I agree that hydrogen should be the target for major future improvements! Tree sitter looks pretty nice and solid for this purpose, and it should cover all the originally intended use cases, but not so the nice side effects such as extending on e.g. plt.. Otherwise, the plugin API would always work as well, but would probably be a lot more effort to maintain for many languages.