This is an invitation to discussion. There is no need to implement this feature, but I do believe the end users of {extendr} could benefit from something like this.
In a user session, when R has access to stdout and stderr, cargo displays its output in the console (unless quiet = TRUE).
This helps to resolve any compilation errors.
What happens if no stdout/stderr is available or if quiet = TRUE? Well, nothing is printed and the only useful information is the error message saying "Compilation failed. Aborting". This is especially painful when running R CMD check or rcmdcheck::rcmdcheck, as it shows you errors but not stdout/stderr.
Proposed solution
As suggested by dfalbel in Discord discussion, we use rlang::abort() to produce an rlang::rlang_error when calling rextendr::ui_throw().
rlang::abort() allows attaching additional named data to the thrown error using its ... argument.
So, we could augment the invocation of cargo, capture its errors and attach them to our rextendr::rextendr_error.
This way, even if we disable verbose output for cargo, we will be able to provide an explanation to the user of what went wrong.
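A minimal sketch of the idea, assuming {rlang} is installed. The helper name throw_cargo_error and the field name cargo_stderr are hypothetical; they only illustrate how extra data passed through ... travels with the condition object. The real rextendr::ui_throw() may look different.

```r
library(rlang)

# Hypothetical helper: throws an error of class "rextendr_error" and
# attaches the captured cargo output as an extra field via `...`.
throw_cargo_error <- function(cargo_stderr) {
  abort(
    "Compilation failed. Aborting.",
    class = "rextendr_error",
    cargo_stderr = cargo_stderr
  )
}

captured <- "error: expected one of `(` or `<`, found `syntax`"
err <- tryCatch(throw_cargo_error(captured), error = identity)

# The attached data is available as a field of the caught condition.
inherits(err, "rextendr_error")
err$cargo_stderr
```

Because the data lives on the condition object, it stays available to handlers and printing methods even when nothing was echoed to the console.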
Sounds simple? Well, not really, here are the details.
Implementation details
Capturing cargo output
The desired solution is to be able to both display and capture the output of cargo, preserving its color scheme and formatting. We would also want to be able to 'switch off' verbose output yet still capture it for the purpose of error formatting, satisfying quiet = TRUE.
cargo is a little bit tricky. It prints to stdout information like passed/failed tests, but sends all compilation info, warnings and errors to stderr (this includes all these fancy 'Updating crates.io', 'Compiling x', 'Finished', etc messages).
In our scenario, we run cargo build --lib, so there should be no stdout at all, but for now let us assume there may be stdout during compilation. Just a reminder that we are discussing runtime compilation, where we create the Rust crate ourselves, so I expect no build.rs or any other build tricks, just plain compilation of (likely one) Rust file(s).
Problem: system2 cannot simultaneously print and capture output to a variable or file.
Solution: cargo is a separate executable, so let's run it properly: processx::run(). We depend on {processx} through {callr}, which is used for out-of-process wrapper generation. processx::run() can both capture and print stdout and stderr, separately. If configured correctly, it will behave like system2(), but the returned value will contain not only the $status code, but also $stdout and $stderr.
Using additional parameters like echo and echo_cmd, we can control all of the printed output.
Instead of passing around weird stdout/stderr variables with obscure values of ""/NULL, we can have one parameter quiet = logical(1), in line with other functions, and switch off all printed output based on the value of this flag.
Changes: Remove stdout and stderr from rextendr:::invoke_cargo(), add a single quiet argument. Replace call to system2() with call to processx::run() or callr::run() (a re-exported function), adjust parameters, record stderr in a separate variable.
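The changes above could look roughly like this. It is a sketch, assuming {processx} is installed. invoke_cargo_sketch and its command argument are hypothetical names (the real function is rextendr:::invoke_cargo()); command exists only so the function can be tried without a Rust toolchain.

```r
# Hypothetical replacement for the system2()-based call.
invoke_cargo_sketch <- function(args, wd = ".", quiet = FALSE,
                                command = "cargo") {
  result <- processx::run(
    command,
    args = args,
    wd = wd,
    error_on_status = FALSE, # inspect the exit status ourselves
    echo_cmd = !quiet,       # print the command line unless quiet
    echo = !quiet            # print child output unless quiet
  )
  list(
    success = result$status == 0,
    status = result$status,
    stdout = result$stdout,
    stderr = result$stderr   # captured even when quiet = TRUE
  )
}
```

A call such as invoke_cargo_sketch(c("build", "--lib"), wd = crate_dir, quiet = TRUE), where crate_dir is a placeholder for the crate directory, would print nothing but still return the compiler messages in $stderr.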
Drawbacks:
When capturing stderr, on Windows the stdout/stderr interleaving may break, likely caused by stdout buffering. However, because cargo does not write to stdout that much in our scenarios, this problem is unobserved. No remedy is available.
When writing and capturing stderr, processx::run() prints child process' stderr to R's stdout. As a result, when using {knitr}, compilation information leaks to the captured output.
Temporary fix: execute {knitr} with quiet = TRUE.
Permanent fix: revise rextendr::rust_eval(), which compiles and runs a code fragment in one go. Suggested solution: add rust_eval_fun(), which compiles the fragment and returns an R wrapper function; this makes it possible to capture the compilation stdout separately from any stdout printed by the Rust snippet. This can be implemented together with improvements to the {rextendr} {knitr} engine.
Problem: cargo's stderr output is unstructured. Multiple errors can be emitted alongside warnings. The same stream contains information about successful compilation, which should be excluded when capturing error messages.
Solution: The output is pretty straightforward. It can be concatenated into a single string. We can match lines on "\nwarning:|\nerror:" or something similar. Using this pattern we can split the input into (multi-line) substrings and trim extra spaces from both sides. If matched correctly, then all strings starting with "^error:" will contain information about strictly one error. Same applies to "^warning:".
Collected errors and warnings can be sent along rextendr::rextendr_error as cargo_errors = list(errors = errors, warnings = warnings).
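The splitting step can be sketched in base R as follows. The function name parse_cargo_stderr is hypothetical, a plain regex stands in for the {cli} ANSI stripping, and the pattern additionally accepts headers with an error code such as error[E0308]:, which is an assumption beyond the pattern quoted above.

```r
# Hypothetical helper: split captured cargo stderr into errors and warnings.
parse_cargo_stderr <- function(stderr_text) {
  text <- paste(stderr_text, collapse = "\n")
  # Drop ANSI color sequences so that plain-text matching works.
  text <- gsub("\033\\[[0-9;]*m", "", text, perl = TRUE)
  # Split at every newline that is followed by an "error:" or "warning:"
  # header; the lookahead keeps the header in the following chunk.
  pattern <- "\n(?=(?:error|warning)(?:\\[[A-Z0-9]+\\])?:)"
  chunks <- strsplit(paste0("\n", text), pattern, perl = TRUE)[[1]]
  chunks <- trimws(chunks)
  list(
    errors = chunks[grepl("^error", chunks)],
    warnings = chunks[grepl("^warning", chunks)]
  )
}

stderr_text <- c(
  "   Compiling rextendr1 v0.0.1",
  "warning: unused variable: `x`",
  " --> src/lib.rs:3:9",
  "",
  "error: expected one of `(` or `<`, found `syntax`",
  " --> src/lib.rs:1:12",
  "",
  "error: aborting due to previous error"
)
parsed <- parse_cargo_stderr(stderr_text)
```

Chunks that start with neither header, such as the Compiling line, are dropped, which matches the goal of excluding information about successful compilation.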
Drawbacks:
Parsing generally unknown and unstructured output which may change in future versions of Rust. In the worst-case scenario, we won't get any useful information from stderr, which is what we have right now (so no regression). In the best-case scenario, we will capture each error and warning separately, ignoring all garbage about successful compilations. This postprocessing only happens when cargo fails, so there should be at least one error in the output. Even if we grab everything in the stderr, it is still better than what we have right now.
{cli} does not support advanced ansi-aware regex, so for now we first strip all of the output of ansi sequences and process plain text.
Permanent fix: Investigate how to utilize {cli} ansi-aware regex and possibly simplify expressions, which may allow us to include cargo error messages in our rextendr::rextendr_error, preserving all formatting, including colors.
Printing rextendr_error
In general, {rlang} does not display additional fields and metadata when printing rlang::rlang_error-derived errors. What we would like is to have two display modes: a shorter form which prints only part of the errors/warnings and is used everywhere (especially when the error is uncaught), and a longer form which prints all of the cargo errors/warnings and is accessed using the summary S3 method.
Problem: To print rextendr::rextendr_error in short form, we need to be able to carefully wrap error messages and subset n lines from each error message, preserving format. Otherwise, we will lose useful information like where in the source code the error occurred. It is desirable to offload as much formatting as possible to {rlang} or any other base-type methods.
Solution: We can achieve this by overloading two S3 methods:
conditionMessage is invoked by {rlang} when formatting its body. This method provides the output of the base class plus additional information about cargo errors in short form. This implementation automatically propagates to other printing methods.
format is invoked when formatting, e.g., for the summary method. We can use one of the {rlang} parameters to determine if the output should be simplified or detailed. When detailed, we output cargo errors in the long form. This automatically enables summary() to print detailed information.
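To illustrate the short and long forms, here is a sketch that uses a plain base R condition instead of an rlang_error, so it sidesteps the {rlang} formatting issues discussed in this section. The class name demo_cargo_error, the constructor, and the two-line cutoff are all hypothetical.

```r
# Hypothetical constructor for a base R condition carrying cargo errors.
new_cargo_error <- function(message, errors) {
  structure(
    class = c("demo_cargo_error", "error", "condition"),
    list(message = message, call = NULL, cargo_errors = errors)
  )
}

# Keep only the first n lines of a multi-line cargo message.
shorten <- function(text, n) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  paste(utils::head(lines, n), collapse = "\n")
}

# Short form: base message plus the first two lines of every error.
conditionMessage.demo_cargo_error <- function(c) {
  short <- vapply(c$cargo_errors, shorten, character(1), n = 2L)
  paste(base::c(c$message, short), collapse = "\n")
}

# Long form: base message plus every error in full.
summary.demo_cargo_error <- function(object, ...) {
  paste(base::c(object$message, object$cargo_errors), collapse = "\n")
}

err <- new_cargo_error(
  "Compilation failed.",
  "error: a\n --> src/lib.rs:1:1\n  |\n1 | fn x"
)
```

Here conditionMessage(err) keeps the header and source location of each error, while summary(err) returns the complete text.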
Drawbacks:
Due to {rlang}'s implementation of formatting methods, calling NextMethod() in format() results in an infinite recursion.
Temporary fix: In rextendr::format.rextendr_error, temporarily strip the error object of all "rextendr_*" classes and then dispatch S3, which will then resolve into the correct {rlang} generics, avoiding infinite recursion. This, however, incorrectly prints the error type.
Fix: Limitations are avoided using options.
Wrapping and printing out error messages can be tricky. We want to preserve the structure and (possibly) theme.
Temporary fix: we handle plain text only, wrap lines preserving original line breaks as well.
Permanent fix: we use ansi-aware procedure to process original error messages, preserving full formatting.
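The plain-text temporary fix might be sketched like this, in base R only. wrap_preserving_breaks is a hypothetical name; each original line is wrapped on its own and its leading indentation is reapplied, so source-location lines keep their shape.

```r
# Hypothetical helper: wrap long lines but keep the original line breaks.
wrap_preserving_breaks <- function(text, width = 60) {
  lines <- strsplit(text, "\n", fixed = TRUE)[[1]]
  wrapped <- lapply(lines, function(line) {
    # Keep blank lines as they are.
    if (!nzchar(trimws(line))) {
      return("")
    }
    # strwrap() discards leading whitespace, so measure it and add it back.
    indent <- nchar(line) - nchar(sub("^ +", "", line))
    strwrap(line, width = width, indent = indent, exdent = indent)
  })
  paste(unlist(wrapped), collapse = "\n")
}

long <- paste(rep("word", 30), collapse = " ")
out <- wrap_preserving_breaks(paste0("first\n  ", long), width = 20)
```

Text that already fits within width passes through unchanged, so short compiler messages are not altered.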
Problem outline
We run cargo in two scenarios: when a user calls a public API function like rust_function, and when {knitr} is processing markdown. https://github.com/extendr/rextendr/blob/00ce4650ee044e4b7c66ebeaecaa65c6d5ec7024/R/source.R#L281-L297 As you can see, right now we simply use system2 with some stderr/stdout redirection and check its exit status. If it is not zero, we throw a super-useful error with absolutely no details.
Printing rextendr_error: the solution of overloading two S3 methods is blocked by https://github.com/r-lib/rlang/issues/1205.
The best part: reproducible example:
https://github.com/Ilia-Kosenkov/rextendr/tree/callr-cargo
Successful CI checks of this branch: https://github.com/Ilia-Kosenkov/rextendr/actions/runs/838499744
Before
``` r
# Catching error
err <- tryCatch(rextendr::rust_function("fn invalid syntax(){}", quiet = TRUE), error = identity)
# Short form
print(err)
#>
```
After
``` r
# Catching error
err <- tryCatch(rextendr::rust_function("fn invalid syntax(){}", quiet = TRUE), error = identity)
# Short form
print(err)
#>
```