Tagged prompt detection

lionel- commented 4 years ago

Prompt detection is needed for these tasks:

- Navigation and font-locking
- Set busy status of process
- Post-processing output:
  - Remove successive continuation prompts: > + + + >
  - Add newline after intermediate prompts.

The continuation prompts are displayed when incomplete expressions are sent to R. When long paragraphs are evaluated, this results in long lines of + which we would like to strip from the output.

The intermediate prompts are displayed when multiple complete expressions are evaluated. They separate the outputs of expressions that are separated by newlines. Expressions separated by semicolons do not get an intermediate prompt. These intermediate prompts are annoying because they cause output to be misaligned by 2 characters, which is especially problematic when the first line of output contains column names.

With this input:

R outputs two continuation prompts and two intermediate prompts:

+ + [1] 1
> [1] 3
[1] 4
> [1] 5
>

Prompt detection is important to ESS but it is tricky to get right. Prompt massaging is even trickier and we've gotten extra newlines in outputs lately. Also, the code is getting quite complex. So I'd like to propose a more robust way of detecting prompts, and simpler post-processing behaviour.

I think the way to solve prompt detection is to mark the R prompts with an unlikely sequence of ANSI escapes. We'll use that sequence to robustly detect the prompts created by the REPL. Using ANSI escapes to mark the prompts reduces the chance of leaking the markers where they shouldn't. In most cases, the escapes will be processed out of the output. TRAMP uses a similar strategy for detecting the echo of commands, see tramp-echo-mark.
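A minimal sketch of the tagging idea in R. The marker used here (ESC[39m followed by ESC[23m) is an assumption inferred from the sink output quoted later in this thread, and the helper names `tag_prompt` and `has_tagged_prompt` are hypothetical, not part of ESS or R.

```r
# Assumed marker: two SGR codes (default foreground, italics off) that are
# unlikely to appear back to back at the start of a line of ordinary output.
prompt_tag <- "\033[39m\033[23m"

# Prefix a prompt string with the marker (hypothetical helper).
tag_prompt <- function(prompt, tag = prompt_tag) {
  paste0(tag, prompt)
}

# TRUE for each line that starts with the marker, i.e. a real REPL prompt.
has_tagged_prompt <- function(lines, tag = prompt_tag) {
  startsWith(lines, tag)
}

tagged <- tag_prompt("> ")
lines <- c(
  paste0(tagged, "[1] 3"),   # prompt written by the REPL
  "> not a real prompt",     # output that merely looks like a prompt
  "[1] 4"
)
has_tagged_prompt(lines)
```

In a live session the tagged string would be installed with `options(prompt = tag_prompt(getOption("prompt")))`; a line that only looks like a prompt is not matched because it lacks the marker.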

In addition to tagged prompts, I think we should simplify the post-processing.

Remove continuation prompts without replacement, unless they are trailing. The current replacement isn't very helpful, and instead of + . + output I'd rather just see output. I'd prefer to keep things simple, but I have no strong feelings against keeping the current behaviour and making it customisable, if you prefer the . thing.
Even if we can detect intermediate prompts robustly, it's not possible to reliably detect when a new line should be inserted. It makes sense when a data frame is printed but not when a string or number is printed. Maybe we should only insert a newline when the output is multiline? However that won't solve the source(echo = TRUE) issue, so maybe we should stop trying to insert newlines?
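The "only insert a newline when the output is multiline" heuristic from the second point could be sketched as follows. This is an illustration only: `add_newline_if_multiline` is a hypothetical name, and the sketch assumes the output has already been split into the chunk that follows one intermediate prompt.

```r
# prompt: the intermediate prompt text, e.g. "> "
# chunk:  the output printed after that prompt, up to the next prompt
add_newline_if_multiline <- function(prompt, chunk) {
  n_lines <- length(strsplit(chunk, "\n", fixed = TRUE)[[1]])
  if (n_lines > 1) {
    # Multiline output (e.g. a data frame): move it below the prompt so
    # that column names line up with the rows.
    paste0(prompt, "\n", chunk)
  } else {
    # Single-line output (a string or a number): leave it on the prompt line.
    paste0(prompt, chunk)
  }
}

cat(add_newline_if_multiline("> ", "[1] 3\n"))
cat(add_newline_if_multiline("> ", "  x y\n1 1 2\n"))
```

As noted above, a heuristic like this would not help with source(echo = TRUE).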

It feels like intermediate prompts should be fixed in R itself since they are a general problem. Maybe there should be an option to echo the current expression after intermediate prompts, insert a newline, and only then the output. Command echoes don't make sense for the first expression, but they are a good reminder for subsequent expressions, especially when output is long.

As a proof of concept I've implemented tagged prompts and simplified post-processing in https://github.com/lionel-/ESS/tree/fix-prompt. Preliminary tests show that the approach seems to work well. A nice benefit is that this unifies the nowait and nil code paths. The only difference between the two is echoing of commands.

lionel- commented 4 years ago

Forgot to mention that another reason for simplifying the output by completely stripping the non-final continuation prompts away (instead of replacing them with + . +) is that it simplifies the user interface: we only show the continuation prompt in the REPL when the user actually has to complete input.

It also reduces the issue of misalignment for intermediate prompts: > is 2 characters, > + . + is 7.
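A rough sketch of the stripping rule described here, written in R for illustration (the actual post-processing lives in ESS). The tagged continuation prompt and the function name `strip_continuation` are assumptions, not taken from the fix-prompt branch.

```r
# Hypothetical tagged continuation prompt: assumed marker followed by "+ ".
cont <- "\033[39m\033[23m+ "

strip_continuation <- function(output, tagged_cont = cont) {
  # A trailing continuation prompt means the user still has to complete
  # the input, so one copy is kept at the end.
  trailing <- endsWith(output, tagged_cont)
  stripped <- gsub(tagged_cont, "", output, fixed = TRUE)
  if (trailing) paste0(stripped, tagged_cont) else stripped
}

# Two continuation prompts followed by output: both are removed.
strip_continuation(paste0(cont, cont, "[1] 1\n"))

# Output followed by a pending continuation prompt: the prompt is kept.
strip_continuation(paste0("[1] 1\n", cont))
```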

vspinu commented 4 years ago

Does accumulation work well with this? I would say you test it locally for a few weeks and then we go with it. I don't think we will find a better idea than this.

I would say that we reset the prompt as frequently as we can, in places like load file and completion maybe.

jabranham commented 4 years ago

@lionel- sounds promising! Let me/us know when you want some code review. Or is it ready now?

lionel- commented 4 years ago

oh not ready yet! I wanted to get your feedback before investing further in this approach.

@vspinu AFAIU accumulation works. What do you mean by resetting the prompt?

vspinu commented 4 years ago

You set the prompt once per session, right? If the user changes it we need to reset it back.
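One way the reset could look on the R side, as a sketch only. `retag_prompt` is a hypothetical helper and the marker is assumed; the idea is that ESS could call something like this from its evaluation wrappers, and that the call is idempotent so it is safe to run often.

```r
prompt_tag <- "\033[39m\033[23m"  # assumed marker, not confirmed by the thread

# Re-apply the marker if the user has replaced the prompt since it was set.
retag_prompt <- function(tag = prompt_tag) {
  current <- getOption("prompt")
  if (!startsWith(current, tag)) {
    options(prompt = paste0(tag, current))
  }
  invisible(getOption("prompt"))
}

old <- options(prompt = "R> ")   # simulate the user changing the prompt
retag_prompt()                   # tags the new prompt
retag_prompt()                   # second call is a no-op
tagged_now <- getOption("prompt")
options(old)                     # restore the previous prompt
```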

lionel- commented 4 years ago

Yeah, that will require some thought. Is the custom prompt global or inferior-local? It should probably be set with a function rather than via a variable.

lionel- commented 4 years ago

From @mmaechler:

sink("table-ex.Rout"); example(table); sink()

table�[39m�[23m> require(stats) # for rpois and xtabs

table�[39m�[23m> ## Simple frequency distribution
table�[39m�[23m> table(rpois(100, 5))

 0  1  2  3  4  5  6  7  8  9 10 
 1  3 10 17 22 10 10  8 11  4  4 

table�[39m�[23m> ## Check the design:

This is clearly not ideal, even if .Rout files in Emacs should support ANSI escapes. To me this suggests the ideal solution is to have a pair of IDE-oriented prompt options so that R itself can insert the tags when relevant. In particular, it should probably not add the tags unless the R console is connected to stdout, since the only reason for using the tags is to communicate with the IDE.

I think it's still a good idea to (a) limit prompt post-processing when the prompt isn't tagged and (b) experiment with tagged prompts to discover the pros and cons of the approach. Probably tagged prompts shouldn't be the default in released versions of ESS. Once we have gained experience, it will be clearer what we need to do to improve the approach, possibly in the form of a patch for R.

What do you all think?

mmaechler commented 4 years ago

@lionel- one thing you mentioned above also occurred to me after finding the above and mentioning that options(prompt = .) really belongs to the user (or his/her R scripts/code) and not to any IDE:

It may be a good idea to think of extensions to R which allow something like options(prompt.interactive = pp) where pp could be a string but more importantly could be a function(prompt) {..} which would "enhance" / "tag" the getOption("prompt") prompt and be used by R only if(interactive())
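To make the proposal concrete, here is a sketch of how such an option might be resolved. Note that `prompt.interactive` does not exist in R; both the option and the `effective_prompt` function are hypothetical and only illustrate the suggested semantics.

```r
# Resolve the prompt R would display under the proposed extension.
effective_prompt <- function(is_interactive = interactive()) {
  base <- getOption("prompt")
  pp <- getOption("prompt.interactive")  # hypothetical option, NULL today
  if (!is_interactive || is.null(pp)) {
    return(base)                         # the user's prompt, untouched
  }
  if (is.function(pp)) pp(base) else pp  # enhance/tag, or use the string
}

old <- options(
  prompt = "> ",
  prompt.interactive = function(prompt) paste0("<tag>", prompt)
)
interactive_prompt <- effective_prompt(is_interactive = TRUE)
batch_prompt <- effective_prompt(is_interactive = FALSE)
options(old)
```

With this shape, getOption("prompt") stays under the user's control and the tag is only ever applied on the interactive path.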

vspinu commented 4 years ago

@lionel- could you please move the dev branch to ESS and open a PR? Then we can discuss more concretely and try it out more easily.

Two ideas regarding sink. First, advise it to inform ESS that the prompt is no longer tagged. Second, detect that sink is in place in our eval wrappers and don't install the prompt. Not sure how the second would work as I don't know the implementation yet.
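The second idea might be sketched with base R's `sink.number()`, which reports how many output diversions are active. The function name `maybe_tag_prompt` and the marker are assumptions; how this would be wired into the ESS eval wrappers is left open.

```r
# Return the tagged prompt unless output is currently diverted by sink().
# `diverted` defaults to a live check but can be passed explicitly.
maybe_tag_prompt <- function(prompt,
                             tag = "\033[39m\033[23m",  # assumed marker
                             diverted = sink.number() > 0) {
  if (diverted) prompt else paste0(tag, prompt)
}

maybe_tag_prompt("> ", diverted = FALSE)  # tagged: console output
maybe_tag_prompt("> ", diverted = TRUE)   # plain: a sink is in place
```

This only covers diversions made through sink(); other tools that read the prompt would still see the tags.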

Regarding user vs IDE prompt, we are still keeping the user prompt, we are just wrapping it with escape characters and removing them in the output. Given this transparency, the prompt still belongs to the user.

mmaechler commented 4 years ago

Regarding user vs IDE prompt, we are still keeping the user prompt, we are just wrapping it with escape characters and removing them in the output. Given this transparency, the prompt still belongs to the user.

I don't think I agree: If the user wants to do something with the current prompt themselves in their own I/O functions, they get the tags, no? {I can't quickly answer myself as I've switched back from fix-prompt}

vspinu commented 4 years ago

It depends, I guess. If they control the prompt then it's not an issue. For instance, if the tool replaces the prompt for some output processing inside a function (like roxygen for example), it's entirely within our evaluation wrapper, so it should be fine. Otherwise, if the tool just picks up the prompt (like sink), then it's a problem indeed. But what is there besides sink that would do that?

In any case I think those use cases can be dealt with case by case, with wrappers or otherwise. Small price to pay for the benefits it brings.

emacs-ess / ESS

Tagged prompt detection #1008