ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
220 stars 74 forks source link

corrupted XLSX file #413

Closed davidgohel closed 1 year ago

davidgohel commented 1 year ago

Describe the bug

Using a specific template, user can't write a valid xlsx file.

To Reproduce

Questionnaire FAJ_2022.xlsx

library(openxlsx)
test_xlsx <- loadWorkbook(file="~/Downloads/Questionnaire FAJ_2022.xlsx")
# writeData(test_xlsx, "zzz",
#           x = "blah", xy=c(2,7), colNames = FALSE)
saveWorkbook(test_xlsx, "out.xlsx")

Created on 2023-03-03 with reprex v2.0.2

Session info ``` r sessioninfo::session_info() #> ─ Session info ─────────────────────────────────────────────────────────────── #> setting value #> version R version 4.2.2 (2022-10-31) #> os macOS Big Sur ... 10.16 #> system x86_64, darwin17.0 #> ui X11 #> language (EN) #> collate en_US.UTF-8 #> ctype en_US.UTF-8 #> tz Europe/Paris #> date 2023-03-03 #> pandoc 2.19.2 @ /Applications/RStudio.app/Contents/Resources/app/quarto/bin/tools/ (via rmarkdown) #> #> ─ Packages ─────────────────────────────────────────────────────────────────── #> package * version date (UTC) lib source #> cli 3.6.0 2023-01-09 [1] CRAN (R 4.2.0) #> digest 0.6.31 2022-12-11 [1] CRAN (R 4.2.2) #> evaluate 0.20 2023-01-17 [1] CRAN (R 4.2.0) #> fastmap 1.1.1 2023-02-24 [1] CRAN (R 4.2.0) #> fs 1.6.1 2023-02-06 [1] CRAN (R 4.2.0) #> glue 1.6.2 2022-02-24 [1] CRAN (R 4.2.0) #> htmltools 0.5.4 2022-12-07 [1] CRAN (R 4.2.0) #> knitr 1.42 2023-01-25 [1] CRAN (R 4.2.0) #> lifecycle 1.0.3 2022-10-07 [1] CRAN (R 4.2.0) #> magrittr 2.0.3 2022-03-30 [1] CRAN (R 4.2.0) #> openxlsx * 4.2.5.2 2023-02-06 [1] CRAN (R 4.2.0) #> purrr 1.0.1 2023-01-10 [1] CRAN (R 4.2.0) #> R.cache 0.16.0 2022-07-21 [1] CRAN (R 4.2.0) #> R.methodsS3 1.8.2 2022-06-13 [1] CRAN (R 4.2.0) #> R.oo 1.25.0 2022-06-12 [1] CRAN (R 4.2.0) #> R.utils 2.12.2 2022-11-11 [1] CRAN (R 4.2.0) #> Rcpp 1.0.10 2023-01-22 [1] CRAN (R 4.2.0) #> reprex 2.0.2 2022-08-17 [1] CRAN (R 4.2.0) #> rlang 1.0.6 2022-09-24 [1] CRAN (R 4.2.0) #> rmarkdown 2.20 2023-01-19 [1] CRAN (R 4.2.0) #> rstudioapi 0.14 2022-08-22 [1] CRAN (R 4.2.0) #> sessioninfo 1.2.2 2021-12-06 [1] CRAN (R 4.2.0) #> stringi 1.7.12 2023-01-11 [1] CRAN (R 4.2.0) #> styler 1.9.0 2023-01-15 [1] CRAN (R 4.2.0) #> vctrs 0.5.2 2023-01-23 [1] CRAN (R 4.2.0) #> withr 2.5.0 2022-03-03 [1] CRAN (R 4.2.0) #> xfun 0.37 2023-01-31 [1] CRAN (R 4.2.0) #> yaml 2.3.7 2023-01-23 [1] CRAN (R 4.2.0) #> zip 2.2.2 2022-10-26 [1] CRAN (R 4.2.0) #> #> [1] /Users/davidgohel/Library/R/x86_64/4.2/library #> [2] /Library/Frameworks/R.framework/Versions/4.2/Resources/library #> #> ────────────────────────────────────────────────────────────────────────────── ```

Screenshots If applicable, add screenshots to help explain your problem.

Capture d’écran 2023-03-03 à 12 52 00

Additional context

Looking at the produced XML, invalidation may be because some tags are repeated and some diseapeared:

In original file xl/worksheets/sheet1.xml:

[...]
  </mergeCells>
  <pageMargins left="0.7" right="0.7" top="0.75" bottom="0.75" header="0.3" footer="0.3"/>
  <pageSetup paperSize="9" scale="54" orientation="portrait" r:id="rId2"/>
  <headerFooter>
    <oddHeader xml:space="preserve">&amp;CQuestionnaire FAJ</oddHeader>
  </headerFooter>
</worksheet>

In new file, some tags are repeated:

Capture d’écran 2023-03-03 à 12 55 53
JanMarvin commented 1 year ago

@davidgohel, thanks for the bug report. I'm afraid there is no active development here. You are welcome to try out openxlsx2 where importing and exporting works as intended. Let me know if you need help with the transition.

library(openxlsx2)
tmp <- temp_xlsx()
url <- "https://github.com/ycphs/openxlsx/files/10881597/Questionnaire.FAJ_2022.xlsx"
wb_load(url) %>% wb_save(tmp)
davidgohel commented 1 year ago

@JanMarvin thanks, we'll try next week, we already started to look at it. I will read the documentation before asking question; for the context, I am working for an organization that want to publish Excel files for a survey (they want to fill this excel files from a template with individual data, one file per person)

JanMarvin commented 1 year ago

Something like this should get you started, but I'm not going to hijack this issue further.

library(openxlsx2)
url <- "https://github.com/ycphs/openxlsx/files/10881597/Questionnaire.FAJ_2022.xlsx"
wb <- wb_load(url)

id <- openxlsx2:::random_string(length = 16)
chars1 <- matrix(openxlsx2:::random_string(4, length = 3), 4, 1)
chars2 <- matrix(openxlsx2:::random_string(4, length = 3), 4, 1)

numbers <- matrix(rnorm(5*4), 5, 4)

wb$add_data(dims = "B7", x = id, colNames = FALSE)
wb$add_data(dims = "B8:B11", x = chars1, colNames = FALSE)
wb$add_data(dims = "D8:D11", x = chars2, colNames = FALSE)
wb$add_data(dims = "B17:E21", x = numbers, colNames = FALSE)

wb$save("test_file.xlsx")
davidgohel commented 1 year ago

This is very nice and clear, thank you!

davidgohel commented 1 year ago

I am closing the issue, the user I was helping migrated with success on her own to openxlsx2.