emacs-ess / ESS

Emacs Speaks Statistics: ESS
https://ess.r-project.org/
GNU General Public License v3.0

`source("....", echo=TRUE)` produces prompts wrongly in iESS `*R*` #956

Closed: mmaechler closed this issue 4 years ago

mmaechler commented 5 years ago

As the subject says: the current "prompt searching" and similar hoop jumping really kills the nice echo = TRUE behavior of source() by adding extra newlines, and the result is very ugly.

I have set ess-eval-visibly to (non-default) nil, but I don't really expect that to make a difference.

mmaechler commented 5 years ago

E.g.

nf <- tempfile(); writeLines(unlist(lapply(5:10, function(i) paste0("1:",i))), nf)
source(nf, echo=TRUE)

gives

> nf <- tempfile(); writeLines(unlist(lapply(5:10, function(i) paste0("1:",i))), nf)
> source(nf, echo=TRUE)

> 
1:5
[1] 1 2 3 4 5

> 
1:6
[1] 1 2 3 4 5 6

> 
1:7
[1] 1 2 3 4 5 6 7

> 
1:8
[1] 1 2 3 4 5 6 7 8

> 
1:9
[1] 1 2 3 4 5 6 7 8 9

> 
1:10
 [1]  1  2  3  4  5  6  7  8  9 10
> 

but looks correct and nice e.g. if run in R started in an emacs *shell* buffer.

mmaechler commented 4 years ago

This is really, really a pain for me as an R developer. It is relatively new, and definitely not in the last released version of ESS. I cannot release ESS if this issue is not solved.

lionel- commented 4 years ago

I still wonder if it'd be worth it to explore using the R parser to figure out complete expressions.

split_complete <- function(code) {
  out <- character()

  # The splitting would be performed with Emacs lisp based on the R
  # syntax table. This would take care of `;` and `\n` inside strings etc.
  lines <- strsplit(code, "\\n|;")[[1]]

  while (length(lines)) {
    parsed <- try(parse(text = lines[[1]]), silent = TRUE)

    if (inherits(parsed, "try-error") && length(lines) > 1) {
      lines[[1]] <- paste(lines[[1]], lines[[2]], sep = "\n")
      lines <- lines[-2]
      next
    }

    out <- c(out, lines[[1]])
    lines <- lines[-1]
  }

  out
}

code1 <- "1 +
  +2
+3
"
code2 <- "1 +
  +2 +
+3
"
code3 <- "1 +
  +2
+3+
"
code4 <- "1 +
  _2
  3
"
code5 <- "1 +
  2
  _3
"

# Two complete expressions
split_complete(code1)
#> [1] "1 +\n  +2" "+3"

# One complete expression
split_complete(code2)
#> [1] "1 +\n  +2 +\n+3"

# Two complete expressions, the third element is incomplete
split_complete(code3)
#> [1] "1 +\n  +2" "+3+"

# Zero complete expression, the single element causes a parser failure
split_complete(code4)
#> [1] "1 +\n  _2\n  3"

# One complete expression, the second element causes a parser failure
split_complete(code5)
#> [1] "1 +\n  2" "  _3"

See discussion with Alex in https://github.com/emacs-ess/ESS/issues/831#issuecomment-461811552
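
The distinction drawn in the comments above, between an element that is merely incomplete and one that makes the parser fail outright, can be sketched in R itself. This is only an illustration: `classify_code` is a hypothetical helper, and it assumes the parser reports truncated input with the untranslated message "unexpected end of input", which would not hold in a localized session.

```r
# Classify a chunk of R code as "complete", "incomplete" or "invalid".
# Assumption: truncated input is reported with a message containing
# "unexpected end of input" (English, untranslated messages).
classify_code <- function(code) {
  tryCatch({
    parse(text = code)
    "complete"
  }, error = function(e) {
    if (grepl("unexpected end of input", conditionMessage(e), fixed = TRUE)) {
      "incomplete"   # more input could still complete the expression
    } else {
      "invalid"      # no amount of extra input will fix this
    }
  })
}

classify_code("1 +\n  +2")  # complete
classify_code("+3+")        # incomplete
classify_code("  _3")       # invalid
```

An "incomplete" result could be a signal to keep accumulating lines, while "invalid" could be a signal to stop merging and pass the text on as it is.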

lionel- commented 4 years ago

@vspinu Sorry if the problem of detecting complete expressions is not relevant to the issue of prompt detection. I seem to recall you once mentioned it is connected. Or maybe the issue of prompt detection is not relevant to the parasite newlines in output? How this all holds together is very blurry to me.

There's also this that might be relevant: https://github.com/emacs-ess/ESS/issues/759#issuecomment-440668598

I agree with Martin that the output situation is not good: newlines randomly popping up, tibble output is scrambled, etc. I think it'd be worth delaying the release even for 6 months if this means we come up with a 100% solution for the years to come.

vspinu commented 4 years ago

@lionel- the core of the issue is that we have a lot of ways to send input (the ess-eval-visibly toggle), the most problematic being accumulation (arguably also the most useful one ATM). Other Emacs IDEs that I have used don't have this option; they always send complete expressions.

I still wonder if it'd be worth it to explore using the R parser to figure out complete expressions.

The OP and the other issues are output issues. They come from the fact that, just from the output of the process, we cannot reliably tell where the input ended and where the actual output started. As R itself doesn't insert a newline after those >>>>, we have to insert it, and for that we use a bunch of heuristics because one can never know for sure what is what. If R itself started inserting newlines where it should, we could remove all those hacks. To summarize, IMO the biggest chance of fixing this is through:

Back to your proposal. It is possible, but we would need to spawn a new R process for every evaluation, and that would still leave the corner case where a person works on a remote machine and doesn't have a local R installation, but ok.

Note that being able to capture expressions would help only if we decide to move to a non-accumulation evaluation mechanism, that is, detect the entire expression and send it at once. So, no partial eval functions like ess-eval-line-and-step. This would be a major breaking change which would affect ALL users. By far the biggest change we ever made.

vspinu commented 4 years ago

@mmaechler I don't think this is such a big issue. I understand that for teaching purposes it could be useful, but then there are much better options out there (like reprex).

Please keep in mind that this issue cannot be technically solved on our side (even if we change to a custom prompt). One has to choose either to allow source(... echo = TRUE) to work or to have the headers of data.frames aligned on a new line (as in print(iris)). Our process filter doesn't know that > 1:5 is an input, and not a 1:5 output misaligned by R. I think the choice is obvious here.

mmaechler commented 4 years ago

I don't agree at all: It is a big issue for people who use and teach base R. It is also a big issue if you need to do (almost) exactly the same as R does, in R CMD check, when running pkg-Ex.R because sometimes you really need to run the whole stuff to get the same behavior as in R CMD check.

Also this has worked correctly "forever" (till 2017-11 ?) in ESS...

What I find puzzling is that in the same running emacs, in some of my several *R* buffers, I do observe this bug, and in others I don't, even when running the exact same version of R with identical-looking sessionInfo(). So it seems there must be something that prevents this horrific behavior ...

lionel- commented 4 years ago

@vspinu Thanks for the explanation. Can you provide an example that causes those > > > lines?

By the way, the way to fix prompt detection in R, I suspect, is to add getOption("ide.prompt") which could be set to a size 2 character vector. The first element would be displayed before getOption("prompt"), and the second right after. This way we can reliably detect and filter prompts and user inputs, without interfering with the user prompt or making any assumption about it (and it can then be set to anything). source() could potentially use this as well.
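
A rough sketch of what the filtering side of this idea might look like, written in R purely for illustration (the real filter would live in Emacs Lisp). The `ide.prompt` option is only a proposal and does not exist in R, so the tagged output is simulated here; the tag characters and the helper `split_tagged_output` are assumptions made for this example.

```r
# Hypothetical prompt tags. R has no `ide.prompt` option, so the
# process output below is simulated by hand.
tag_open  <- "\001"
tag_close <- "\002"

# Separate tagged prompts from everything else in the process output.
split_tagged_output <- function(text, open = tag_open, close = tag_close) {
  pattern <- paste0(open, "[^", close, "]*", close)
  matches <- regmatches(text, gregexpr(pattern, text, perl = TRUE))[[1]]
  list(
    # drop the one-character tags around each prompt
    prompts = substr(matches, 2, nchar(matches) - 1),
    # what remains once prompts and tags are stripped
    output  = gsub(pattern, "", text, perl = TRUE)
  )
}

raw <- paste0(tag_open, "> ", tag_close,
              "1:5\n[1] 1 2 3 4 5\n",
              tag_open, "> ", tag_close)
res <- split_tagged_output(raw)
res$prompts
res$output
```

Because only the tags are matched, the user's own prompt string can be anything, which is the point of the proposal.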

Back to your proposal. It is possible, but we would need to spawn a new R process for every evaluation, and that would still leave the corner case where a person works on a remote machine and doesn't have a local R installation, but ok.

I think we could share a single evaluation process. Or we could have a utility process associated with each REPL, so that we can xref-jump or get completion even when R is busy. The srcrefs might become stale after reinstalling packages etc., but the utility process would be reloaded whenever the REPL is reloaded. For completion this would only be a fallback mechanism when R is busy, so that we still get completion depending on the state of the REPL (user-defined objects and functions, browser(), ...). For evaluation, it's a bit dangerous to rely on an external process (e.g. if there's a bug somewhere that causes the process to be in an unexpected state). That's why we'd need a defensive mechanism that marks the utility process as broken in case of problems (including no local R installation), and ESS would use simpler fallback mechanisms in that case.

Note that being able to capture expressions would help only if we decide to move to a non-accumulation evaluation mechanism, that is, detect the entire expression and send it at once. So, no partial eval functions like ess-eval-line-and-step

I don't follow. What precisely is the accumulation mechanism by the way? After all these years of being an ESS user and developer, I'm still deeply confused about it all :(

Regarding eval-line-and-step, it seems alright to send inputs to R directly - complete expressions would be more useful for paragraph evaluation to select out incomplete parts. But if complete expressions are not relevant at all to the issue at hand (prompt and input detection in the output), then we can discuss it later.

vspinu commented 4 years ago

Can you provide an example that causes those > > > lines?

If you set ess-eval-visibly, any input to the process will produce those lines. Or . + > . + > more precisely because we replace long > > > with . +.

I suspect, is to add getOption("ide.prompt") which could be set to a size 2 character vector.

Yes. Can be, but that would mean tracking the user's prompt option on every evaluation command and somewhat more involved logic on the elisp side. For starters we could just enforce our special prompt, say ^L> or something, and see if it works. In the filter we just strip all those ^L.

I think we could share a single evaluation process.

This could be a good way to go. We can even share that utility process across the entire Emacs session (unique up to the R version, though). This will be a bit tricky with remote processes, but ok.

What precisely is the accumulation mechanism by the way?

Accumulation is when you send an expression in several chunks: each piece is incomplete, but R waits until the expression is completed. Right now you can do C-c C-c multiple times and the input will be accumulated. Other Emacs IDEs don't allow such accumulation.
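
To make the idea concrete, here is a small sketch of accumulation done outside the R console: chunks are buffered until the R parser accepts the combined text. It is written in R only for illustration; `make_accumulator` is a hypothetical helper, not part of ESS, and the check relies on the untranslated parser message "unexpected end of input".

```r
# Buffer chunks of input until they form a complete expression.
make_accumulator <- function() {
  pending <- character()
  function(chunk) {
    pending <<- c(pending, chunk)
    code <- paste(pending, collapse = "\n")
    incomplete <- tryCatch({
      parse(text = code)
      FALSE
    }, error = function(e) {
      # Assumption: truncated input yields this (English) message.
      grepl("unexpected end of input", conditionMessage(e), fixed = TRUE)
    })
    if (incomplete) return(NULL)  # keep waiting for more input
    pending <<- character()       # reset the buffer for the next expression
    code                          # complete (or invalid) input, as a whole
  }
}

feed <- make_accumulator()
feed("x <- c(1,")   # NULL: still incomplete
feed("       2)")   # "x <- c(1,\n       2)"
```

With something along these lines, only whole expressions would reach the R process, so the sender knows exactly what input it has submitted.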

This evaluation issue is loosely related to the OP and the prompt detection story. When I said in my earlier post that changing the evaluation scheme might help, I meant that if we know for sure what we send then we can distinguish between input and output. That statement is wrong. We could never know the code input as the example of source illustrates.

vspinu commented 4 years ago

That statement is wrong.

Actually not quite. I finally recalled what I had in mind all these years. If we could reliably distinguish incomplete input, then we could accumulate on the ESS side and send only complete input to R. This way we could use source(..., prompt.echo = "[ESS-prompt]") and solve this issue and other pending issues (e.g. inject references on every eval, not just functions or buffers).
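
A minimal sketch of the prompt.echo idea, assuming the echoed output is inspected line by line. The `prompt.echo` argument of source() is real; the tag text and the classification step are illustrative choices only, and capture.output() stands in for the process filter.

```r
nf <- tempfile()
writeLines(paste0("1:", 5:7), nf)

# The tag text is an arbitrary choice for this sketch.
tag <- "[ESS-prompt] "
out <- capture.output(source(nf, echo = TRUE, prompt.echo = tag))
unlink(nf)

# Lines that start with the tag are echoed input; the rest is output
# (including the blank separator lines that source() emits).
is_input <- startsWith(out, tag)
data.frame(kind = ifelse(is_input, "input", "output"), line = out)
```

Since the tag is chosen by the sender, echoed input no longer has to be guessed from a leading "> ".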

lionel- commented 4 years ago

Thanks for the explanations.

Yes. Can be, but that would mean tracking the user's prompt option on every evaluation command and somewhat more involved logic on the elisp side.

I thought it would be the other way around, we'd no longer need to track the user prompt which can be set to anything? We'd only need to track our own prompt tags, strip them from the output, and mark-up the range between the tags as prompt. We'd have 100% correctness in detection of prompts, making prompt navigation etc much more pleasant. An ide.continue option might also be useful, this would be another way to detect we sent an incomplete expression.

As for using a side R process to detect complete expressions, I did some work on that already, but unfortunately I can't work much on ESS until mid-April, after I (finally) submit my thesis. If you think the ide.prompt and ide.continue options are worth it, I could try and prepare patches for R though, that should be a small task. It seems this would be generally useful for IDEs that interact with R through pipes rather than with native callbacks. Right now there's a feature imbalance between these two types of process interaction.

vspinu commented 4 years ago

We'd only need to track our own prompt tags, strip them from the output, and mark-up the range between the tags as prompt.

By "tracking" I meant tracking at the R level: as part of our evaluation mechanism we inspect the prompt, wrap it up and never unwrap it, taking extra care not to wrap our own wrapped prompt. Not a big deal, but I would prefer to start simple. Once we are sure that this technique works, we can think of allowing an arbitrary prompt.

If you think the ide.prompt and ide.continue options are worth it, I could try and prepare patches for R though, that should be a small task.

I am not entirely following your proposal, though. How would this help with the OP issue?

If you have time to patch R, then it's better to find a way to insert an input/output separator. At least a new line after those > > >. If we have an input/output separator we will be done with many-many issues, including distinct font-locking of input/output.

mmaechler commented 4 years ago

This may all be fine, sometime in the future. Note that the OP (me) is making a fuss, duly, about current ESS behavior with current R. This was working correctly for 95% of ESS' history with current (and previous) versions of R. ESS currently does too much "massaging" around R's input and output. I would really like a version of ess-eval-visibly, or a new such variable, that I could set to 'none (or nil) and then have NO ESS fiddling with R: in that case I'd like the R buffer to show output and input as the simple R console does, or more specifically as if I use M-x shell in Emacs and then start R in that emacs *shell* buffer.

vspinu commented 4 years ago

All of the massaging happens in the tracebug filter. So if you do M-x ess-toggle-tracebug you should see the plain output I think.

mmaechler commented 4 years ago

Ha! Thank you, indeed! Turning off this tracebuggy behavior is possible via
M-x ess-toggle-tracebug --- you made me really happy!