Consideration for ANSI output over IPC / "next-gen" remote debugging

goodboy commented 4 years ago

Hey @jonathanslenders :smile_cat:

I was wondering what you thought about the idea of formalizing an API / configuration that would make it possible to run ptk in a remote process and ship output over an IPC channel, so that native remote debugging with things like coolio tab completion and syntax highlighting can be a thing; bringing the wonder of Python's interactivity to distributed systems dev.

I've got a couple of issues written up that cover my research up to this point:

goodboy/tractor#130
goodboy/tractor#113

As far as I can gather, ptk is pretty much the only project (even outside python) capable of this feat and so I thought I'd start here :)

I'll requote:

The problem

Standard fancy (read: human-enhanced) debugger repls (including the stdlib's pdb, which uses rlcompleter, and pdb++) rely on libraries such as GNU readline to get things like completion and CLI "editing controls". There seems to be no way to get these features with readline-based systems in a remote debugging context, since Python's use of readline requires that the process is launched under a tty/pty system. Ideally these features would be available in such use cases to make debugging of remote systems sane and efficient for the user.
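
To make the tty/pty constraint concrete, here is a minimal stdlib-only sketch (the helper name `child_stdin_is_tty` is made up for illustration). It spawns a child interpreter with piped stdio, roughly what a remote or supervised process sees, and asks whether its stdin looks like a terminal:

```python
import subprocess
import sys


def child_stdin_is_tty() -> bool:
    """Spawn a child interpreter with piped stdio and report whether it sees a tty.

    Hypothetical helper for illustration; not part of pdb, readline or ptk.
    """
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.stdin.isatty())"],
        input="",              # passing input= makes stdin a pipe, not a terminal
        capture_output=True,   # stdout/stderr are pipes as well
        text=True,
        check=True,
    )
    return result.stdout.strip() == "True"


print(child_stdin_is_tty())
```

With pipes on both ends the child reports that stdin is not a terminal, which is the situation where readline-style editing is normally unavailable.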

I'll re-summarize a few things we've discovered poking around these repos and which we've already listed in the above issues:

options for doing this in a tmux-like app: prompt-toolkit/python-prompt-toolkit#1087

mention about using IPC to ship ANSI output

further discussion around capturing the output

a PR showing how to reroute output to stderr if it's a tty

there seems to be test code suggesting ptk can work without being connected to a tty/pty.

there's a utility for avoiding the clobbering of stdout that looks to be exactly what I had envisioned for a minimal control repl

a very old pdb implementation built on ptpython, ptpdb which is likely where we'll need to start

ptk has utils for not clobbering stdout which would be super handy for having multiple actors logging while you're in the middle of debugging a crash.

Would be greatly interested in what other things I've missed and any further input you might have. Also I can add the explicit issue/repo links to that list if this issue is something you deem worth tracking.

Also a couple things wrt projects built on top of ptk (or wanting to support it):

in theory this might work with ipdb (since ipython uses ptk underneath) if ptk was configured properly (it currently doesn't work any better than the readline options based on my testing in https://github.com/goodboy/tractor/tree/stin_char_relay).

work with debuggers that want to move to ptk, so that this functionality is supported in their initial integration, such as with pdbpp in pdbpp/pdbpp#362

pdb++ discussion on better support for remote debugging

Also fwiw I have tried out ptpdb and I think it's super slick but, it might just be a bit too much as a default, with the full terminal UI and all (though having that kind of full UI glory is definitely nice to have if the user so chooses).

One of the critical UX things I'm hoping for is the ability to work with a prompt while other asynchronous things are writing to the local tty, without that prompt getting clobbered. I pointed to a couple of things above but I'm hoping you can give me even more pointers on how to accomplish this :detective:.

Look forward to your thoughts :surfer:

jonathanslenders commented 4 years ago

Hi @goodboy,

Much of this is possible already (much has been improved very recently in this area). We have {Win32,Posix}PipeInput classes now and Vt100_Output which can write into any TextIO object.

If no Output or Input is passed to the Application, then prompt_toolkit takes the output/input from the AppSession: https://github.com/prompt-toolkit/python-prompt-toolkit/blob/master/prompt_toolkit/application/current.py#L26. An AppSession can be used as a context manager, which uses contextvars. So, all applications running within there will use these input/output devices.
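
As a rough mental model of that pattern (a context manager backed by contextvars that decides which input/output an application picks up), here is a stdlib-only sketch. `Session`, `use_session`, `current_session` and `run_app` are hypothetical names for illustration; this is not prompt_toolkit's implementation or API:

```python
import contextvars
import io
import sys
from contextlib import contextmanager
from dataclasses import dataclass
from typing import TextIO


@dataclass
class Session:
    """Hypothetical stand-in for an app session: just an input/output pair."""
    input: TextIO
    output: TextIO


# Context-local slot holding the active session (unset by default).
_current = contextvars.ContextVar("current_session")


def current_session() -> Session:
    """Return the active session, falling back to the process's stdin/stdout."""
    session = _current.get(None)
    if session is None:
        session = Session(sys.stdin, sys.stdout)
    return session


@contextmanager
def use_session(input: TextIO, output: TextIO):
    """Install a session for everything that runs inside the `with` block."""
    token = _current.set(Session(input, output))
    try:
        yield _current.get()
    finally:
        _current.reset(token)


def run_app(text: str) -> None:
    """An 'application' given no explicit output: it asks the current session."""
    current_session().output.write(text)


remote = io.StringIO()  # stands in for a pipe or socket to another process
with use_session(io.StringIO(), remote):
    run_app("\x1b[1mhello\x1b[0m")  # ANSI-styled text lands in `remote`
```

Outside the `with` block, `current_session()` falls back to the real stdin/stdout again, which mirrors the "no Output or Input passed" default described above.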

Prompt_toolkit ships with helpers for SSH and Telnet servers. So, we can expose a prompt_toolkit application over an SSH or Telnet port. The SSH and Telnet helpers will ensure that new AppSessions are created for incoming connections, so that input/output is routed correctly from the user to the application.

A beautiful example that exposes ptpython over both SSH and telnet, while embedding it in an application, is this one: https://github.com/prompt-toolkit/ptpython/blob/master/examples/ssh-and-telnet-embed.py (Thanks to @vxgmichel).

I haven't been using ptpdb recently, but it could be possible as well. Important here is that the application runs asynchronously, because we can't have nested event loops, and the SSH/Telnet server already spawns a loop.
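
The nested-loop restriction is easy to reproduce with plain asyncio. A small sketch (the function name `try_nested_run` is made up for illustration): starting a second loop with `asyncio.run()` from inside a running one is refused with a `RuntimeError`.

```python
import asyncio


async def try_nested_run():
    """Attempt to start a second event loop from inside a running one."""
    inner = asyncio.sleep(0)
    try:
        asyncio.run(inner)  # refused: a loop is already running in this thread
    except RuntimeError as exc:
        inner.close()       # avoid a "coroutine was never awaited" warning
        return exc
    return None


outcome = asyncio.run(try_nested_run())
print(type(outcome).__name__, outcome)
```

This is why a debugger that is driven from the SSH/Telnet server's loop has to be awaited on that loop rather than starting its own.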

goodboy commented 4 years ago

@jonathanslenders oh, this all looks great :partying_face:; I'm glad I asked!

The AppSession stuff looks to be about what I need to start on the root-parent-process/supervisor (aka) side. I think it'll probably require some digging through internals of ptpdb as well to see what can be updated to this (new?) api.

RE: the ssh/telnet stuff I see that the async stuff is tied to asyncio but, ptk core is not (yet) coupled to asyncio.

The project I'm trying to support this in (a so called "structured concurrent actor system") is actually built on trio and I'm wondering if we'll have to look at adapters (of which we have plenty) for the ptk core event loop?

Some notes from me:

I see the create_pipe_input() and StdOut shim and how it's all tied together in the asyncssh contrib script
the callbacks all seem to be sync and so shouldn't be a problem to implement
ideally we can run the (remote) debugger on top of the trio scheduler so as to avoid thread synchronization and/or multiple loops/async backends:
- it's not clear to me after (very briefly) reading the EventLoop interface if we need to implement a trio loop implementation or if we can just run in guest mode; the latter case I'm not sure is super ideal either since we'll be stepping through both loops (trio and ptk) in the debugger?

jonathanslenders commented 4 years ago

Hi @goodboy,

I think we have to fix ptpdb, and make sure it uses prompt_toolkit 3.0. Then I guess you can use an asyncio adaptor to run it on trio.

vxgmichel commented 4 years ago

Since I've been mentioned, I thought I'd drop by :grin:

@goodboy

RE: the ssh/telnet stuff I see that the async stuff is tied to asyncio but, ptk core is not (yet) coupled to asyncio. [...] it's not clear to me after (very briefly) reading the EventLoop interface if we need to implement a trio loop

It seems like you're referring to a commit in prompt-toolkit 2.0, but prompt-toolkit 3.0 actually uses asyncio natively.

That means prompt-toolkit 3.0 might just work with trio-asyncio, although I've never tried that.

goodboy commented 4 years ago

but prompt-toolkit 3.0 actually uses asyncio natively.

Ahh very gtk, thx @vxgmichel.

I'm actually thinking though now it shouldn't matter too much since from our perspective we'll be running the debugger as a sync app and shouldn't care about how ptk is implemented underneath. The only concern is how to get output back to the appropriate parent process for relay to the local tty. The IPC part should be entirely out of thread of the debugged trio/tractor program to avoid mutation of the task stack while the debugger is attached.

I'm thinking a background thread to do the output sending over IPC can be done as usual with trio/tractor, and we'll have to figure out a thread-safe and/or synchronous way to move the output to the second thread from ptk/asyncio. This needs much more investigation and tinkering, but the main thing is to keep the target process looking like it would run without any debugger attached.
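
A minimal sketch of that hand-off, assuming a plain `queue.Queue` as the thread-safe bridge and a `socket.socketpair()` standing in for the real trio/tractor IPC channel. `QueueWriter` is a hypothetical name, and nothing here is ptk or tractor API:

```python
import queue
import socket
import threading


class QueueWriter:
    """File-like object whose write() only enqueues text.

    Hypothetical helper: the debugger side writes here and never touches
    the IPC channel itself; a background thread does the sending.
    """

    def __init__(self) -> None:
        self._chunks = queue.Queue()

    def write(self, data: str) -> int:
        self._chunks.put(data)   # queue.Queue is safe to use across threads
        return len(data)

    def flush(self) -> None:
        pass                     # nothing is buffered on the writer side

    def close(self) -> None:
        self._chunks.put(None)   # sentinel: tells the relay thread to stop

    def drain_to(self, sock: socket.socket) -> None:
        """Run in the background thread: forward queued chunks over the socket."""
        while True:
            chunk = self._chunks.get()
            if chunk is None:
                break
            sock.sendall(chunk.encode("utf-8"))


# socketpair() stands in for the real IPC channel between child and parent.
parent_end, child_end = socket.socketpair()
writer = QueueWriter()
relay = threading.Thread(target=writer.drain_to, args=(child_end,), daemon=True)
relay.start()

writer.write("\x1b[32m(debug) \x1b[0m")  # ANSI output produced by the "debugger"
writer.write("step\n")
writer.close()
relay.join()
child_end.close()

received = b""
while chunk := parent_end.recv(4096):    # the parent would relay this to its tty
    received += chunk
parent_end.close()
```

The debugged program only ever sees a cheap `write()` call; all socket work happens on the relay thread, which fits the goal of keeping the target process looking like it runs without a debugger attached.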

I very much appreciate the tips!

jonathanslenders commented 4 years ago

FYI: about ptpdb, I've been looking into upgrading this to prompt_toolkit 3.0. Unfortunately this is old code, that hasn't been maintained for a long time. I started the work in the following PR, but there's a lot to be done and I wonder whether it's worth the effort. I did underestimate that.

https://github.com/jonathanslenders/ptpdb/pull/21

Maybe, at some point I'll continue on this, but I can't promise anything yet.

goodboy commented 4 years ago

I did underestimate that.

@jonathanslenders well I'm glad you tried first :smile: before I got lost in it.

Alternatively I know there was mention of pdbpp considering porting to ptk in pdbpp/pdbpp#362. Maybe that's an avenue worth investigating as well.

goodboy commented 4 years ago

Oh, I was also going to say another alternative might just be to use ipdb which is just exporting the debugger from IPython. I would assume it's already on 3.0?

Though I guess this may mean hacking on ipython core to accomplish this? Maybe someone has already thought about this re: ipython.parallel (which I guess is now ipyparallel?).

goodboy commented 4 years ago

Dug into ipython (very very) briefly and found where the ptk object entrypoints are:

instantiating a completer wrapper inside the terminal debugger which is a subtype of prompt_toolkit.completion.Completer
creation of a PromptSession which is a prompt_toolkit.shortcuts.prompt.PromptSession
- no idea how this differs from the AppSession mentioned by @jonathanslenders above (yet)

jonathanslenders commented 4 years ago

@goodboy : A PromptSession can run in an AppSession. The PromptSession is the object that keeps track of the input history during several input queries, as well as copy/paste registries (and maybe a few other things). The AppSession provides an environment with input/output objects in which all applications can run.

prompt-toolkit / python-prompt-toolkit

Consideration for ANSI output over IPC / "next-gen" remote debugging #1204
