sdllc / Basic-Excel-R-Toolkit

http://bert-toolkit.com
206 stars 42 forks source link

Debugging features in BERT #77

Open addnox opened 6 years ago

addnox commented 6 years ago

Hi, now the debugging process is a bit tedious by using print()

Could BERT support browser() so that making the life of debugging a bit easier?

Thanks

PGS62 commented 6 years ago

Hi

I'm not on the BERT development team, but have you considered using RStudio (or any other R IDE) for your debugging? That works really well for me, much easier than peppering code with print() statements.

In RStudio you need to have the files in your BERT startup folder open (so that you can set breakpoints etc). Next, when you see an error returned from R to Excel or VBA, save the R environment, e.g. by executing code in the VBA immediate window: Application.Run "BERT.Exec", "save.image(""c:/temp/REnv.tmp"")"

Switch to the RStudio Console and enter rm(list = ls()) load("c:/temp/REnv.tmp")

This should put the R session attached to RStudio in exactly the same state as BERT's R session, (though you may have to install relevant packages etc)

So now run in RStudio whatever code that was failing in BERT. You should see exactly the same error and be in a much better position to debug it.

duncanwerner commented 6 years ago

The reason we don't support browser (or debug) is because we embed R in the Excel process. This requires that we use the embedded R interface, which doesn't allow it. (And we want to use stock R without modification, that's a design requirement).

The reason we embed R in the Excel process, instead of running out-of-process, is for performance. Otherwise we have to serialize data over some channel and pass back and forth. Originally BERT was only used for Excel functions, not VBA or scripting so the focus was always on Excel function performance. The reality is that the actual function call itself is usually not that expensive relative to the calculation -- we rarely have 1000 separate Excel functions calling R in a single spreadsheet.

If we switch to running R out of process, it might slow down Excel/R functions a bit but it would be a lot easier to manage. It would also support browser (and debug) and the abort function described in another issue. So this is something we might do in the future. As it happens we have an out-of-process R module lying around, so that part would be easy, but we still have to rewrite BERT to use it.

addnox commented 6 years ago

@PGS62 Thanks. This approach may not be working when debugging a UDF.

Say I have a function as follows: f <- function() {x <- 1:10; stop("!")}

When I type =r.f() in excel, an error will return. But in this moment, BERT has jumped back from function f's environment back to the Global environment. So if trying your approach, we cannot see x (i.e. 1:10) in the saved image

addnox commented 6 years ago

@duncanwerner

I totally agree that the overhead of passing data is less of an issue. In most cases we can use array formula and vectorized R functions to significantly reduce the overhead

PGS62 commented 6 years ago

@addnox

Yes you are correct that x wouldn't appear in the saved image. The purpose of saving and reloading the image is to ensure that the global environment in the RStudio R matches that in the BERT R at the time that your UDF commences execution. So in your example, you would need to execute f() from the RStudio console and step through the code until the error (stop("!")) happens.

Of course most UDFs should be designed to only operate on their input data passed as function arguments and not "help themselves" to other data, so the saving and reloading may not be necessary, but has the benefit of ensuring that your functions are safely in the environment (there may be packages to worry about too).

A separate technique that I find helpful is to ensure that all R code that I call from Excel via BERT has error handling (via tryCatch) so that if an error (or possibly warning) happens in the R code then the call in Excel to =R.Myfunction(...) returns a string giving the $message element of the error in R. To help distinguish between "strings that describe errors" and all other strings (and following way that Excel represents errors #NULL!, #REF! etc) I prefix such error strings with "#" and suffix with "!".

Putting this together you might you might have:

f<-function(){x <- 1:10; stop("This is what went wrong")}

f2<-function(...){TrapErrors(f(...))}

TrapErrors<-function(x){
  out <- tryCatch({x
  }, error = function(e) {
    paste0("#",e$message,"!")
  },
  warning = function(e) {
    paste0("#",e$message,"!")}
  )
return(out)
}

Then from Excel: =R.f() yields #NULL! but =R.f2() yields "#This is what went wrong!" which is designed to be more helpful.