nteract / hydrogen

:atom: Run code interactively, inspect data, and plot. All the power of Jupyter kernels, inside your favorite text editor.
https://nteract.gitbooks.io/hydrogen/
MIT License
3.92k stars 334 forks source link

Request: strip _common_ leading whitespace from input before passing to kernel #862

Closed sglyon closed 6 years ago

sglyon commented 7 years ago

I'd love it if hydrogen stripped leading whitespace that was common to all lines in the selection before passing to the kernel.

For example, suppose I have this python code:

def build_grids(p: Params):
    # build grid over I choices and N states
    Imax = 10
    expected_life = (1-p.δ) / p.δ
    Igrid = np.arange(1, Imax+1)
    Ngrid = np.arange(0, round(Imax*expected_life)+1, 2)

    zgrid = p.z
    # ... more code ....

If I were to highlight from the start of the commented line # build grid... (including the leading whitespace) all the way to p.z, and try to evaluate I would get a complaint from the python kernel about unexpected whitespace.

If I were to take that same highlighted region and copy it to my system clipboard and then use the %paste magic in an ipython terminal session, the leading whitespace would be stripped and evaluation would proceed without errors.

Note that in my example if I were to start my selection at the I character in Imax then the behavior should be to still throw an error because the first line in my selection doesn't have leading whitespace in common with all other lines.

I was going to try to take a stab at this, but I don't know where to look inside here. I figured you would know better than me ;)

BenRussert commented 7 years ago

My first impression is that there isn't a truly language agnostic way to do this from hydrogen's end.

Interesting issue, I think an init script or plugin solution could be fairly straight forward, but I'm not sure this should be built into hydrogen itself.

rgbkrk commented 7 years ago

Is there a precedent for this with the jupyter notebook?

sglyon commented 7 years ago

No, the notebook doesn't do this, but the console does... with the %paste magic:

screen shot 2017-06-12 at 10 50 29 pm
~|⇒ ipython                                                                                       vpn-192-168-100-7
Python 3.6.1 |Anaconda 4.4.0 (x86_64)| (default, May 11 2017, 13:04:09)
Type "copyright", "credits" or "license" for more information.

IPython 5.3.0 -- An enhanced Interactive Python.
?         -> Introduction and overview of IPython's features.
%quickref -> Quick reference.
help      -> Python's own help system.
object?   -> Details about 'object', use 'object??' for extra details.

In [1]:
   ...:     def foo():
   ...:         return "hello!"
   ...:
   ...:
   ...:     foo()
  File "<ipython-input-1-ede5e3b3ed2a>", line 2
    def foo():
    ^
IndentationError: unexpected indent

In [2]: paste

    def foo():
        return "hello!"

    foo()

## -- End pasted text --
Out[2]: 'hello!'
BenRussert commented 7 years ago

It may not be an issue in notebook since it normally runs code in cells instead of selection based

sglyon commented 7 years ago

Good point ☝️

I think that is the main reason I've never missed that in the notebook, but am happy to have it in the console and would like to see it here.

If I were to try to put this in my init script is there a recommended way for me to add a processing layer before hydrogen sends code to the kernel?

BenRussert commented 7 years ago

If I were to try to put this in my init script is there a recommended way for me to add a processing layer before hydrogen sends code to the kernel?

Not exactly, (yet :smile: ) .

Now that I think it through in order to use the init solution you would probably need to change the actual text buffer in selection then change it back after dispatching hydrogen:run. Could get pretty janky.

I think in this case the best way to enable this would be through enhancing our plugin api (something we have talked about doing for other reasons anyway).

nikitakit commented 7 years ago

This issue is something I've been running into as well in my usage. I wouldn't be opposed to having hydrogen strip out common leading whitespace, rather than offloading this to a plugin. The criteria for this are pretty simple (detect leading whitespace on first line, and strip it from later lines) so it doesn't actually require any language-specific parsing.

BenRussert commented 7 years ago

It would be language specific just in the sense that there is no need to do this in most languages (the described workflow works fine in julia, js, R, etc.)

on the other hand, I guess if we are doing any language a favor it should be Python.

My other concern would be: can this have side effects for other kernels? Seems unlikely, but something to think about.

t9md commented 7 years ago

I agree with @BenRussert, but still want this feature supported by hydrogen. Since it's just simple and easy than ask/wait for each kernel provider to support this.

How about implement this indent removing feature and make it configurable by scope name.

One side effect for this is, code is evaluated in same environment unless restarting kernel. If user selected some deep indented lines of code, it evaluated as no indented level, this affect scope of vars its effective depending on language. But this is also true for hydrogen:run it can change order of code evaluated, so user have to being careful for the effect of these commands anyway.

n-riesco commented 7 years ago

I think this should really be in its own command (and likely in a language-specific plugin). Otherwise, if trimming was the default behaviour, how would we handle:

lgeiger commented 7 years ago

I think it would be OK to strip whitespace on code that is executed on a selection basis.

We could quite easily add this here since we already normalize the line endings before sending code to a kernel.

how would we handle multi-line strings?

Is this really a problem if leading whitespace is stripped?

how would we handle the pollution of the global scope?

I don't think we need to handle this case. Right now one can already execute deeply nested code by selecting it line by line. Stripping the whitespace only makes it convenient to execute multiple nested lines.

BenRussert commented 7 years ago

@lgeiger Good points.

I like @t9md's thoughts above about how to make this configurable by scope name. It seems like a good thing to make configurable just in case.

kkonevets commented 6 years ago

pyCharm can handle this issue pretty well, why can't hydrogen do the same?

rgbkrk commented 6 years ago

PyCharm is funded, Hydrogen is not. We operate on contributions to the code for changes. As for why not it's because no one has done it, including yourself @kkonevets.

nikitakit commented 6 years ago

I thought that updates to the plugin API would make it easy to implement this feature as a plugin, but it turns out that this is not the case.

It so happens that hydrogen always trims the string before sending it to the kernel, so e.g.

    x = 5
    print(x)

becomes

x = 5
    print(x)

and it's rather difficult for a plugin to undo that transformation and detect that actually there was common leading whitespace all along.

(Just wanted to post this thought here in case it sparks any ideas)

nikitakit commented 6 years ago

Now that the plugin API changes are merged, it's possible to mostly implement this feature in a plugin. I have a working plugin implementation at https://github.com/nikitakit/hydrogen-python

Note that there are some limitations (related to my comment above), but at least in my code those cases occur infrequently compared to the many times I want to run code in an indented block.

kylebarron commented 6 years ago

In @sglyon's comment, he has:

In [1]:
   ...:     def foo():
   ...:         return "hello!"
   ...:
   ...:
   ...:     foo()
  File "<ipython-input-1-ede5e3b3ed2a>", line 2
    def foo():
    ^
IndentationError: unexpected indent

Did this behavior change between IPython 5 and IPython 6? Because it works now:

image
kylebarron commented 6 years ago

I reproduced this in IPython 5.3.0, and it works fine.

image

The reason why it didn't work in the above comment is because there's a silent \n on line 1, hence why the IndentationError is raised on line 2.

image

Given that IPython is fine with common whitespace, as long as Hydrogen doesn't remove leading whitespace, this issue should be solved. See https://github.com/nteract/hydrogen/pull/1363.