sdllc / Basic-Excel-R-Toolkit

http://bert-toolkit.com
207 stars 42 forks source link

mixed type arrays #84

Open StatsMan53 opened 6 years ago

StatsMan53 commented 6 years ago

Making BERT functions mimic Excel functions may not be totally possible, since R requires type consistency within an array and Excel does not. With that caveat, consider the following examples of divergence between behavior of BERT and Excel functions.

Place the following two functions in the BERT\functions directory

ave <- function(x, trim=0, na.rm=TRUE) mean(x, trim=trim, na.rm=na.rm) pnorm <- function(q, mean=0, sd=1, lower.tail=TRUE, log.p=FALSE) stats::pnorm(q, mean, sd, lower.tail, log.p)

In A2:C2, place the numbers 2, 1, 0 and in A3:C3 place the numbers -3, -2, -1.
In E2 place the BERT function =R.Add(A2:C3), and in E3 place the Excel function =SUM(A2:C3). In G2:I3, array enter the BERT function =R.pnorm(A2:C3) and in K2:M3 array enter the Excel function =NORMSDIST(A2:C3). The corresponding BERT and Excel functions behave exactly the same, even down to preserving the shape of the output. Great job!

Copy A2:M3, and paste it into 3 more locations: A5:M6, A8:M9, and A11:M12. Delete the contents of B5. Change B8 to the text string "NA". Change B11 to the text string "1" (prefix 1 with a single quote so that Excel will parse it as text instead of coercing it to a number.

None of the BERT functions behave like Excel functions in these three cases. The last two cases are likely not feasible, since BERT would not know the data type expected by the R function it is calling, but it might be possible for BERT to pass NA's in place of empty cells if the remainder of the cells are numeric.

image

StatsMan53 commented 6 years ago

If I add a function

numeric.xl <- function(x) { x[sapply(x, function(val) !is.numeric(val))] <- NA x }

It correctly preserves the shape of x while converting all non-numeric values to NA (to see this, replace R.pnorm with numeric.xl in the above example). However, it does not work (and gives no clues in the log) when called from another BERT function. For example if I rewrite the original functions as

ave <- function(x, trim=0, na.rm=TRUE) mean(numeric.xl(x), trim=trim, na.rm=na.rm) pnorm <- function(q, mean=0, sd=1, lower.tail=TRUE, log.p=FALSE) stats::pnorm(numeric.xl(q), mean, sd, lower.tail, log.p)

I get no change from the original results.

Explanation? Suggestions?

StatsMan53 commented 6 years ago

On further reflection, a bit more effort does the trick

numeric.xl <- function(x) { x[sapply(x, function(val) !is.numeric(val))] <- NA ret <- as.numeric(x) dim(ret) <- dim(x) ret }

A simple x <- is.numeric(x) is undesirable, because it coerces coercable strings. This wrapper function does not completely mimic native Excel functions (e.g Excel's =AVERAGE(A5:C6) ignores the empty cell in B5, while =NORMSDIST(A5:C6) coerces the empty cell to 0), but is both useful and more consistent than Excel.

It still puzzles me that NA in R maps to #N/A in Excel, but #N/A in Excel does not map to NA in R.