ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
226 stars 75 forks source link

saving workbook twice causes issues with setColWidths #314

Closed JanMarvin closed 1 year ago

JanMarvin commented 2 years ago

Hello,

it looks like that I'm still observing a similar issue as described above:

When I try to store the same workbook in a second xlsx file using saveWorkbook() i get the warning 'NAs introduced by coercion' in R. When I open the file, it looks empty on the first glance but actually the data is in the file but the columns are hidden in excel and can be made visible by manually unhiding them in Excel.

#---------------- mini example

data(iris)
head(iris)

wb.test <- createWorkbook()

addWorksheet(wb.test, "Iris", zoom=80) 
writeDataTable(wb.test, "Iris", iris)
setColWidths(wb.test, "Iris", cols = c(1:ncol(iris)), widths = "auto")

saveWorkbook(wb.test, file = file.path(path,"test.xlsx"), overwrite=T, returnValue = T)
saveWorkbook(wb.test, file = file.path(path,"test2.xlsx"), overwrite=T, returnValue = T

Here the two files that are generated by the code above: test2.xlsx test.xlsx

If I omit the setColWidths() function, saving into two files does work without any issues.

I'm using RVersion 4.0.2. via RStudio and openxslx Version 4.2.5.

Do you have any recommendation, how to solve this issue?

Thanks a lot! Verena

Originally posted by @verena-e in https://github.com/ycphs/openxlsx/issues/106#issuecomment-1008869886

arp3ggio commented 2 years ago

I'm having the same issue. Any help would be appreciated.

Below I've fixed a missing parentheses in the mini example from @JanMarvin:

data(iris)
head(iris)

wb.test <- createWorkbook()

addWorksheet(wb.test, "Iris", zoom = 80)
writeDataTable(wb.test, "Iris", iris)
setColWidths(wb.test, "Iris", cols = c(1:ncol(iris)), widths = "auto")

saveWorkbook(wb.test,
  file = file.path("test.xlsx"),
  overwrite = TRUE,
  returnValue = TRUE
)
saveWorkbook(wb.test,
  file = file.path("test2.xlsx"),
  overwrite = TRUE,
  returnValue = TRUE
)
JanMarvin commented 2 years ago

Hi @arp3ggio , it's not my example, I've simply covered a comment into an issue. You could copy the file you've correctly exported via copy.file or by calling setColWidths again, prior to the export. Or dig into the code, why saveWorkbook or wb$saveWorkbook erase or modify the col attribute of the worksheet. Obviously it should not happen, but somewhere it does 😞

arp3ggio commented 2 years ago

Thanks for the follow up. After my comment and continuing to research, I did notice it wasn't your example.

Re: calling setColWidths again, I don't believe that works. Modified example:

data(iris)
head(iris)

wb.test <- createWorkbook()

addWorksheet(wb.test, "Iris", zoom = 80)
writeDataTable(wb.test, "Iris", iris)
setColWidths(wb.test, "Iris", cols = c(1:ncol(iris)), widths = "auto")

saveWorkbook(wb.test,
  file = file.path("test.xlsx"),
  overwrite = TRUE,
  returnValue = TRUE
)

setColWidths(wb.test, "Iris", cols = c(1:ncol(iris)), widths = "auto") # new line

saveWorkbook(wb.test,
  file = file.path("test2.xlsx"),
  overwrite = TRUE,
  returnValue = TRUE
)

Regarding file.copy, I'm not quite sure how to make it work in my situation. My master Excel wb / file is 18 sheets that are heavily formatted. My hope was to derive multiple sub Excel wbs / files that only contain a subset of the sheets from the master file (using removeWorksheet). However, after creating the first sub wb that was only 12 of 18 sheets found in the master, I ran into the error at issue. I'm not sure how many sub Excel wb / files I'll need, so I'm hoping I can avoid doing the extensive file formatting (addWorksheet, writeData, freezePane, addStyle multiple times, conditionalFormatting multiple times, setColWidths, groupColumns, etc.)

I'm going to keep poking at it and if I come across a workaround, I'll post it here. In the interim, I'm willing to try any additional ideas. Thank you

arp3ggio commented 2 years ago

I did take a look at the code of the saveWorkbook function:

function(wb, file, overwrite = FALSE, returnValue = FALSE) {
  op <- get_set_options()
  on.exit(options(op), add = TRUE)
  if (!"Workbook" %in% class(wb)) {
    stop("First argument must be a Workbook.")
  }
  if (!is.logical(overwrite)) {
    overwrite <- FALSE
  }
  if (!is.logical(returnValue)) {
    returnValue <- FALSE
  }
  if (file.exists(file) & !overwrite) {
    stop("File already exists!")
  }
  xlsx_file <- wb$saveWorkbook()
  result <- file.copy(from = xlsx_file, to = file, overwrite = overwrite)
  unlink(dirname(xlsx_file), force = TRUE, recursive = TRUE)
  if (returnValue == FALSE) {
    invisible(1)
  }
  else {
    return(result)
  }
}

But I became confused with the line that includes the function itself:
xlsx_file <- wb$saveWorkbook()

I did receive the same error when running: wb.test$saveWorkbook()

arp3ggio commented 2 years ago

I'm running openxlsx version 4.2.5.

github-actions[bot] commented 1 year ago

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 1 year ago

This issue was closed because it has been stalled for 7 days with no activity.

zgcgd commented 1 day ago

I know this is closed, but I just spent an hour or so on this issue, and wanted to post a solution for anyone who ends up on this page in the future.

May seem obvious to try, but... using removeColWidths() and then reapply setColWidths() works. Obviously, if you're using auto and your data is large, this may be slow, but.

This helper function may also help make it more concise. I was using a similar one to auto-set width on all columns of a large number of small sheets. I looked quite a bit, but couldn't find a way to get the number of cols of the sheet (vignettes all use ncol() on the original dataframe), or use a shortcut for all.

removeColWidths_auto <- function(wb, sheets = NULL) {
  #' Auto set col widths, default is all sheets in the workbook.
  if (is.null(sheets)) sheets <- names(wb) # Default to all sheets
  walk(
    sheets,
    \(s) removeColWidths(wb, s, cols = c(1:50)) 
  )
}

Anyway, hope it helps someone save time.