jkitchin / scimax

An emacs starterkit for scientists and engineers

scimax-org-babel-ipython.el and remote kernels #114

Closed zngguvnf closed 6 years ago

zngguvnf commented 7 years ago

Does scimax-org-babel-ipython.el support remote kernels?

jkitchin commented 7 years ago

I am not sure. If the base ob-ipython does, then I would expect scimax to as well.

roblem commented 7 years ago

I have spent some time trying to get this to work in both scimax and upstream ob-ipython and have been unsuccessful with both. Instead, I run scimax with the additional command-line switch -nw in a remote console.

zngguvnf commented 7 years ago

How do you deal with things like plots or sounds when you start emacs/scimax with -nw on a remote console?

roblem commented 7 years ago

As you point out, when I run things on the remote machine the charts are created but can't be displayed.

To somewhat get around this:

  1. Use something like Dropbox to keep the working directory synced between the remote machine and the local machine.
  2. Run the long-running job on the remote machine using scimax and the -nw switch. Include the code for creating charts as you normally would. The :results output you see there won't display the chart but will reference it, and the image is saved in the working directory in a folder called ipython-inline-images.
  3. Upon completion on the remote machine, save the file and close the buffer.
  4. After syncing, open the same org file on the local machine; the figures created remotely are viewable in the local emacs instance.

Obviously, this is not going to work for fine-tuning a figure but it is mostly workable for using scimax remotely. I usually save the long-running results to a file and load them on the local machine if I really need to fine tune stuff.
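
A minimal sketch of the "save the results to a file, load them locally" step, assuming the results are JSON-serializable and the file lives in the synced working directory. The helper names and the file name results.json are made up for illustration.

```python
import json
from pathlib import Path


def save_results(results, path):
    """Write results as JSON (run this on the remote machine when the job ends)."""
    Path(path).write_text(json.dumps(results, indent=2))


def load_results(path):
    """Read the results back (run this on the local machine after the sync)."""
    return json.loads(Path(path).read_text())
```

On the remote side this would be something like save_results(results, "results.json") at the end of the long-running block; locally, load_results("results.json") gives back the same data for fine-tuning the figure.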

hummuscience commented 6 years ago

I will piggyback on this and ask my question. I have an Ubuntu VM that I can access via ssh on a remote machine. I have some heavy analysis coming up that involves python, R etc. and I would like to run that on the machine from my local emacs/scimax.

How do I do it/where do I start? Could someone give me a short step by step on how to do that from org-mode or using emacs in general?

jkitchin commented 6 years ago

I have no idea how to run on remote kernels or if it is possible. It isn't a use case for me, and I have no way to test it for now, so I probably won't be much help in figuring it out. It sounds like you should just use Jupyter notebooks for that.

zngguvnf commented 6 years ago

Check out this: https://vxlabs.com/2017/11/30/run-code-on-remote-ipython-kernels-with-emacs-and-orgmode/

hummuscience commented 6 years ago

Thank you @zngguvnf for the link. I wonder how it works for R processes

roblem commented 6 years ago

I tried the approach outlined in the @zngguvnf link a while back and just tried it again, and it still isn't working. I set up a remote kernel with the ssh link (or it can even be an existing local kernel) and this code block in scimax:

#+NAME: mirror-sixteen-earth-robin
#+BEGIN_SRC ipython :session kernel-14489-ssh.json :exports both :results raw drawer
x = 1
print(x)
#+END_SRC

gives this error message:

user-error: The :session name (kernel-14489.json) cannot contain a -.

Haven't gotten around to trying it again in vanilla ob-ipython/emacs setup.

jkitchin commented 6 years ago

That means you should take the dashes out of the session name. I recall those were problematic in the past, so now it raises an error.

roblem commented 6 years ago

Thanks. That occurred to me also. On that, two things:

  1. Using the connection method described in the above link necessarily requires the kernel connection string to include -ssh (so dashes will be in the kernel name). Also, the instructions in the link have kernel names with dashes and things seem to work.

  2. I created a local kernel without any dashes and when I submit the codeblock

    #+NAME: mirror-sixteen-earth-robin
    #+BEGIN_SRC ipython :session mykernel.json :exports both :results raw drawer
    x = 1
    print(x)
    #+END_SRC

    I get error messages and a new kernel is created in /var/run/1000/jupyter called emacs-mykernel.json.json, so it seems like ob-ipython isn't respecting the convention for connecting to existing kernels that is demonstrated in the link.

roblem commented 6 years ago

I am able to connect to a remote kernel (with dashes in the name as demonstrated above) using vanilla emacs and upstream ob-ipython. Going to do a diff on scimax ob-ipython and upstream to see if there is an easy fix.

jkitchin commented 6 years ago

It looks like the dash check that raises this error was introduced in e84a7d4 to get completion working from the kernel. It is related to issue #89, where it was reported that session names with dashes were causing a hang.

An easy fix is to just comment out the user-error code and see if it works.

It sounds like I should also update my local setup to upstream and work out any differences. My development time on this has been pretty limited lately.

roblem commented 6 years ago

My intuition is that this will continue to create a new kernel, but I can try it out this weekend and report back.

I noticed the upstream python driver/client.py and ob-ipython.el files were doing checks in multiple places to not create a new kernel if the session name included .json. I suspect the code as it stands now (with the dash error fix reverted) will wrap whatever name you give it in emacs- and .json and create a new kernel as happened with my local kernel example above.
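
If that suspicion holds, the difference between the two behaviours comes down to a single check on the session name. Here is a rough Python stand-in for the Emacs Lisp logic (not the actual scimax or ob-ipython code; both function names are hypothetical):

```python
def always_wrapped_kernel_file(name):
    """Suspected current behaviour: every session name is wrapped."""
    return "emacs-{}.json".format(name)


def json_aware_kernel_file(name):
    """Upstream-style check: a name ending in .json is treated as an
    existing connection file and passed through unchanged."""
    if name.endswith(".json"):
        return name
    return "emacs-{}.json".format(name)
```

With the first version, a session called mykernel.json turns into emacs-mykernel.json.json, which matches the stray file from my local kernel test. With the second, it stays mykernel.json and only plain session names get wrapped.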

roblem commented 6 years ago

I can confirm that scimax simply wraps the session name in emacs- and .json (even if the session name is an existing kernel), even when the user-error check for dashes in the session name (from e84a7d4, around line 815) is commented out.

Steps I used for testing:

  1. Created a local ipython kernel with the command ipython kernel; the kernel kernel-8523 was created
  2. Checked contents of /run/user/1000/jupyter and kernel-8523.json exists there
  3. In scimax, ran this (with dash checking in session name commented out)
    #+NAME: friend-football-diet-mississippi
    #+BEGIN_SRC ipython :session kernel-8523.json
    x = 1
    print(x)
    #+END_SRC
  4. Execution fails with these messages:
    Code block evaluation complete.
    Position saved to mark ring, go back with ‘s-SPC’.
    Contacting host: localhost:9988
    Code block evaluation complete.
    error in process filter: ob-ipython--dump-error: There was a fatal error trying to process the request. See *ob-ipython-debug*
    error in process filter: There was a fatal error trying to process the request. See *ob-ipython-debug*
    Shell native completion is disabled, using fallback
  5. Re-checked contents of /run/user/1000/jupyter and both kernel-8523.json and emacs-kernel-8523.json.json exist, so a new kernel was started ignoring the existing one.
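
The before/after check of the runtime directory can be scripted. A small sketch, assuming the runtime directory path is passed in explicitly (it differs between systems); the function name is made up:

```python
from pathlib import Path


def connection_files(runtime_dir):
    """Return the sorted names of the .json files in a Jupyter runtime directory."""
    return sorted(p.name for p in Path(runtime_dir).glob("*.json"))
```

Calling connection_files("/run/user/1000/jupyter") before and after running the src block shows whether a second emacs-*.json file appeared next to the kernel that was already running.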

timehaven commented 6 years ago

I made a little progress on this issue and was able to connect to a remote kernel.

The ob-ipython that is a git submodule in scimax is pretty old. Much has changed. @gregsexton has made much recent progress on ob-ipython,

https://github.com/gregsexton/ob-ipython

He has updated it to conform more closely to jupyter, which is a huge deal in the emacs/org/jupyter "replace notebooks" movement. He has removed driver.py. He even has specific instructions on connecting to remote jupyter kernels.

Specifically for scimax, to get something working, I did the following:

cd /path/to/scimax
mv ob-ipython ob-ipython-orig
git clone https://github.com/gregsexton/ob-ipython.git

Then I went looking for references to driver. I ended up commenting out three entire functions that @jkitchin had re-written in scimax-org-babel-ipython.el:

-(defun ob-ipython--execute-request (code name)
...
+;; (defun ob-ipython--execute-request (code name)
...
-(defun ob-ipython--inspect-request (code &optional pos detail)
...
+;; (defun ob-ipython--inspect-request (code &optional pos detail)
...
-(defun ob-ipython--complete-request (code &optional pos)
...
+;; (defun ob-ipython--complete-request (code &optional pos)

(I am only showing the git diff of the first lines of the entire functions I commented out.)

Then to address the wrapping of emacs- and .json, I copy/pasted directly from the new ob-ipython code into scimax-org-babel-ipython.el:

+;; (defun ob-ipython--kernel-repl-cmd (name)
+;;   (list scimax-ipython-command "console" "--existing" (format "emacs-%s.json" name)))
+
+
+(defun ob-ipython--kernel-file (name)
+  (if (s-ends-with-p ".json" name)
+      name
+    (format "emacs-%s.json" name)))
+
 (defun ob-ipython--kernel-repl-cmd (name)
-  (list scimax-ipython-command "console" "--existing" (format "emacs-%s.json" name)))
+  (list ob-ipython-command "console" "--simple-prompt" "--existing"
+        (ob-ipython--kernel-file name)))
+

Perhaps only commenting out the scimax version of ob-ipython--kernel-repl-cmd would have been enough.

Once I did the above, I ran the following explicitly with M-x org-babel-execute-src-block:

#+BEGIN_SRC ipython :session kernel-32191-ssh.json
print(a, b)
#+END_SRC

#+RESULTS:
:RESULTS:
123985 4443

:END:

Those are the correct values I created in my terminal jupyter console when I connected to the remote kernel:

» jupyter console --existing kernel-32191.json --ssh dev2311
[ZMQTerminalIPythonApp] Forwarding connections to 127.0.0.1 via dev2311
[ZMQTerminalIPythonApp] To connect another client via this tunnel, use:
[ZMQTerminalIPythonApp] --existing kernel-32191-ssh.json
Jupyter console 5.2.0

Python 3.6.3 |Anaconda, Inc.| (default, Nov  9 2017, 00:19:18)
Type 'copyright', 'credits' or 'license' for more information
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: a = 123985

Notice that M-x org-ctrl-c-ctrl-c (or simply C-c C-c) still barfs with the message "the session name cannot contain a -".

Now, running M-x list-processes shows only one active process instead of the three (client, driver, Python) with the old ob-ipython:

Python:kerne... run     *Python:kern... /dev/ttys002 jupyter console --simple-prompt --existing kernel-32191-ssh.json

I am not submitting a PR because I am not sure what sort of ripple effect any of these (breaking?) changes would have, but it sure seems like things are close if the ob-ipython submodule could be updated to the latest upstream.

timehaven commented 6 years ago

Yes, the above works as advertised...and also introduces breaking changes that I think might be due to the loss of org-ctrl-c-ctrl-c: error tracebacks are gone, plots do not show up inline. I think it is due to the hack nature of my fix above and not properly making all the pipes work together.

Specifically, I think the following function (that is added as a hook) will need some surgery to deal with the new ob-ipython architecture (no driver):

(add-to-list 'org-ctrl-c-ctrl-c-hook 'scimax-execute-ipython-block)

jkitchin commented 6 years ago

This is great news. I will start looking into updating scimax to work with the new ob-ipython. It would be great to be in sync with that. Thanks for working on this!

timehaven commented 6 years ago

Was just replying to a phantom comment that quickly disappeared...I got your same error initially, then quit and restarted Emacs and your example worked.

jkitchin commented 6 years ago

ah. I had to update jupyter_client and jupyter_console to get it to work. There were a few Emacs restarts in there too, so who knows what made it finally work. It is a good start anyway. It looks like it could be a big job to clean up scimax to do this!

jkitchin commented 6 years ago

See issue #109. I have a branch that runs the upstream ob-ipython repo that may resolve this issue.

roblem commented 6 years ago

Tested against the new branch (mentioned in #109) and can confirm that remote execution is working with some basic examples using the instructions here: https://vxlabs.com/2017/11/30/run-code-on-remote-ipython-kernels-with-emacs-and-orgmode/

I just had to change the jupyter console command from

jupyter console --existing kernel-12818.json --ssh username

to

jupyter console --existing kernel-12818.json --ssh username@host

I will use this over the next week or so and report back.

jkitchin commented 6 years ago

Thanks! I was able to test it out locally today, and I updated the notes on how to do it.

timehaven commented 6 years ago

Agreed--this works great now with your new branch.

@jkitchin already put the following in his doc, but to reiterate for those coming here: "-ssh" is added to the name of the session file you need to connect to.

In @roblem's comment above, the example:

jupyter console --existing kernel-12818.json --ssh username@host

will result in a local file named kernel-12818-ssh.json so in an Emacs org src block, you would set session like so:

#+BEGIN_SRC ipython :session kernel-12818-ssh.json
print('Howdy.')
#+END_SRC

Also, setting a file property works so you do not have to specify :session in each src block:

#+PROPERTY: header-args:ipython :session kernel-12818-ssh.json

Great stuff!
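
The naming rule is simple enough to script. A small sketch, assuming the tunnelled file always takes the original name with -ssh inserted before .json, as in the examples above; both helper names are hypothetical:

```python
def ssh_connection_file(connection_file):
    """kernel-12818.json -> kernel-12818-ssh.json"""
    suffix = ".json"
    if not connection_file.endswith(suffix):
        raise ValueError("expected a .json connection file name")
    return connection_file[: -len(suffix)] + "-ssh" + suffix


def session_property(connection_file):
    """Build the file-level org property that sets :session for every ipython block."""
    return "#+PROPERTY: header-args:ipython :session " + ssh_connection_file(connection_file)
```

For example, session_property("kernel-12818.json") gives the property line shown above, ready to paste at the top of the org file.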

jkitchin commented 6 years ago

Thanks for the note! I added the file property note just now.

roblem commented 6 years ago

I am noticing that jupyter console with, for example, jupyter console --existing kernel-12818.json --ssh username@host is a little flaky. Sometimes:

  1. it hangs with no ipython prompt
  2. the ipython client prompt appears but is unresponsive
  3. after successfully getting a usable ipython prompt, when I quit the terminal console session with Ctrl-d it closes the kernel on host, which shouldn't happen (https://github.com/jupyter/jupyter_console/pull/127). This requires re-instantiating the kernel on host, copying the json to client, etc.

These aren't scimax/emacs/ob-ipython problems per se, but they impact the workflow a lot.

So, from the suggested link from upstream ob-ipython (https://github.com/ipython/ipython/wiki/Cookbook:-Connecting-to-a-remote-kernel-via-ssh in the manual ssh tunnels section), remote access can also be done this way:

  1. Set up a kernel on the host using a known name (for convenience I like a fixed name): ipython kernel -f remote-ipython.json
  2. Copy the connection file to the client and set up ssh forwarding for the required ports (here /run/user/1000/jupyter/ is the runtime directory on both client and host; it may differ for other distros/platforms):
    scp host:/run/user/1000/jupyter/remote-ipython.json /run/user/1000/jupyter/remote-ipython.json
    for port in $(grep '_port' /run/user/1000/jupyter/remote-ipython.json | grep -o '[0-9]\+'); do ssh host -f -N -L $port:127.0.0.1:$port; done
  3. In org-mode, use :session remote-ipython.json
  4. When I am done, I manually delete the ssh forwards, but usually I leave them open for long periods of time.

I have found this to be bullet proof over a few hours of normal work. It is great to have this working in Scimax!
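
For anyone who prefers not to grep the connection file, the port-forwarding step can be sketched in Python. This assumes the connection file carries the usual five port entries (shell_port, iopub_port, stdin_port, control_port, hb_port). The function names are made up, and the sketch only builds the ssh commands rather than running them:

```python
import json

# Port entries normally found in a Jupyter kernel connection file.
PORT_KEYS = ("shell_port", "iopub_port", "stdin_port", "control_port", "hb_port")


def forward_commands(connection_info, host):
    """Build one `ssh host -f -N -L port:127.0.0.1:port` command per kernel port."""
    commands = []
    for key in PORT_KEYS:
        port = connection_info[key]
        commands.append(
            ["ssh", host, "-f", "-N", "-L", "{0}:127.0.0.1:{0}".format(port)]
        )
    return commands


def forward_commands_from_file(path, host):
    """Same as above, reading the ports from a connection file on disk."""
    with open(path) as fh:
        return forward_commands(json.load(fh), host)
```

Each returned list can be handed to subprocess.run(cmd, check=True) to open the tunnel. Reading the ports by key avoids picking up any other number that happens to sit on a line containing _port.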

timehaven commented 6 years ago

Nice tip @roblem, I did not know about -f remote-ipython.json.

Another note regarding properties and kernels...as noted at the bottom of the doc:

https://github.com/jkitchin/scimax/blob/ob-ipython-upstream/scimax-ipython.org

you can set a global, buffer-wide ipython session via a property:

#+PROPERTY: header-args:ipython :session kernel-16577-ssh.json

To get really fancy, you can have each heading of your org document be connected to separate kernels. It would be like having multiple Jupyter notebooks in a single .org file.

Use the heading level property and follow all the same ways of connecting to a remote kernel:

* Using a remote kernel
  :PROPERTIES:
  :header-args:ipython: :session bare-emacs-remote-1-ssh.json
  :END:

#+BEGIN_SRC ipython
print(a)
print(msg)
#+END_SRC

#+RESULTS:
:RESULTS:
# Out[6]:
# output
45
In and working!
:END:

* Using a different kernel with properties
  :PROPERTIES:
  :header-args:ipython: :session bare-emacs-remote-2-ssh.json
  :END:

#+BEGIN_SRC ipython
print(a)
print(msg)
#+END_SRC

#+RESULTS:
:RESULTS:
# Out[3]:
# output
98765
I am in bare-emacs-remote-2-ssh.json now!
:END:

roblem commented 6 years ago

The issues I was having in https://github.com/jkitchin/scimax/issues/114#issuecomment-365219262 were due to mixing and matching python 3.5 and 3.6 virtual environments on client and host (not a scimax or ob-ipython problem). Once I switched everything over to python 3.6, the jupyter console --existing approach described in the documentation is working as advertised.

As for scimax and remote kernels, I have been using it extensively over the past few days and it is very solid. This is a fantastic feature to have in scimax, so a big thanks!

Also, thanks for the header tip @timehaven.

jkitchin commented 6 years ago

commit f2726b6760dda899c75ceb76cf2dfe952adb6c88 should address this.