awalker89 / openxlsx

R package for .xlsx file reading and writing.
Other
364 stars 79 forks source link

#NAME? error when writing =AGGREGATE() function to Excel #459

Closed AndrewZaremba closed 5 years ago

AndrewZaremba commented 5 years ago

Expected Behavior

Using writeFormula to write an =AGGREGATE() function into an Excel workbook calculates the correct value.

Actual Behavior

When the Excel file is opened, all formula cells display a #NAME? error. When the cell is refreshed (select and press F2), the formula result correctly displays.

Steps to Reproduce the Problem

openxlsx_test_book.xlsx

  1. Save above workbook into R Working Directory
  2. Run the following code:
# Load necessary package
library(openxlsx)

# Load in test-cast workbook as testBook
testBook <- loadWorkbook("openxlsx_test_book.xlsx")

# Loop to calculate MAXIF by month. Uses =AGGREGATE() instead of traditional array methods.
for(i in 3:3)
{
  for (j in 370:381)
  {
    writeFormula(testBook, "MAX_TEMP", paste0("AGGREGATE(", 14, ",", 6, ",(", int2col(i), 3, ":", 
                                                int2col(i), 367, ")/(", int2col(2), j, "=", int2col(2), 3, ":", 
                                                int2col(2), 367, "),", 1, ")"), 
                 startCol = i, startRow = j)
  }
}

# Saves workbook
saveWorkbook(testBook, "testBook.xlsx", overwrite = TRUE)

# Opens workbook
openXL(file = "testBook.xlsx")
  1. The above script enters the =AGGREGATE() function in cells C370:C381 - EX) "=AGGREGATE(14,6,(C3:C367)/(B370=B3:B367),1)"

  2. Note that the above script has an extra for-loop wrapping the main function. This is because the script is taken from a larger one that iterates across multiple columns.

sessionInfo()

R version 3.5.2 (2018-12-20) Platform: x86_64-w64-mingw32/x64 (64-bit) Running under: Windows >= 8 x64 (build 9200)

AndrewZaremba commented 5 years ago

I've come across a solution.

"_xlfn." must be added to the beginning of the =AGGREGATE() function in order to calculate correctly.

For example:

# Loop to calculate MAXIF by month. Uses =AGGREGATE() instead of traditional array methods.
for(i in 3:3)
{
  for (j in 370:381)
  {
    writeFormula(testBook, "MAX_TEMP", paste0("_xlfn.AGGREGATE(", 14, ",", 6, ",(", int2col(i), 3, ":", 
                                                int2col(i), 367, ")/(", int2col(2), j, "=", int2col(2), 3, ":", 
                                                int2col(2), 367, "),", 1, ")"), 
                 startCol = i, startRow = j)
  }
}
nathanckuhn commented 4 years ago

Can this be reopened? I have been noticing the same problem and I think it would be worth fixing.

I had almost the same scenario when trying to write this formula: 'IFS( AND(NOT(LEN(INDIRECT("CR"&ROW()))>0),NOT(LEN(INDIRECT("EZ"&ROW()))>0)), "title blank", LEN(INDIRECT("EZ"&ROW()))>0, INDIRECT("EZ"&ROW()), LEN(INDIRECT("CR"&ROW()))>0, INDIRECT("CR"&ROW()), TRUE, "ERROR: something went wrong" )'

My process: I've been using openxlsx to create an Excel document. I had previously been using a formula that worked. I followed "example 2" in the examples for writeFormula() for adding Excel functions (writing an Excel function as a string in a data frame, making that column class "function", and then using writeData to add the data frame to a Worksheet).

That process worked just fine with this function: 'IF(LEN(INDIRECT("EZ"&ROW()))>0, INDIRECT("EZ"&ROW()), INDIRECT("CR"&ROW()))' After creating and opening the Excel, the column with this function would show the data as expected.

However, when I "improved" the function by changing it to this: 'IFS( AND(NOT(LEN(INDIRECT("CR"&ROW()))>0),NOT(LEN(INDIRECT("EZ"&ROW()))>0)), "title blank", LEN(INDIRECT("EZ"&ROW()))>0, INDIRECT("EZ"&ROW()), LEN(INDIRECT("CR"&ROW()))>0, INDIRECT("CR"&ROW()), TRUE, "ERROR: something went wrong" )' ... then upon opening the Excel file, that whole column showed a "#NAME?" error. I could click on each cell, enter to edit or refresh, and then the data would show correctly, but I need it to show correctly from the beginning.

Following the solution @AndrewZaremba found above, and changing the formula to this: '_xlfn.IFS( AND(NOT(LEN(INDIRECT("CR"&ROW()))>0),NOT(LEN(INDIRECT("EZ"&ROW()))>0)), "title blank", LEN(INDIRECT("EZ"&ROW()))>0, INDIRECT("EZ"&ROW()), LEN(INDIRECT("CR"&ROW()))>0, INDIRECT("CR"&ROW()), TRUE, "ERROR: something went wrong" )' ... now it works correctly and displays correctly upon opening the Excel.

THE POINT: Can I request that this thread be open and this bug fixed?

It seems that "_xlfn." is used in older versions of Excel to flag functions that do not work because they were added after 2007 or 2009: https://support.microsoft.com/en-us/office/issue-an-xlfn-prefix-is-displayed-in-front-of-a-formula-882f1ef7-68fb-4fcd-8d54-9fbb77fd5025 and https://excelbuddy.com/_xlfn-prefix-in-excel/

I'm using Office16, so I'm surprised this would be an issue.

Does this mean that openxlsx is based on a pre-2007 version? That may cause other problems too.

One fix could be having openxlsx recognize when this group of functions are being used and just add "_xlfn." to the beginning for them Unless openxlsx is using a pre-2007 version and larger changes are made.

Thank you for the consideration!