ycphs / openxlsx

openxlsx - a fast way to read and write complex xslx files
https://ycphs.github.io/openxlsx/
Other
220 stars 74 forks source link

Several issues with `dataValidation(type = "list")` #386

Closed olivier7121 closed 11 months ago

olivier7121 commented 1 year ago

Under Windows 10 and Excel 2007, I cannot create drop-down lists (Data -> Data Validation -> List) in Excel files I am generating with the function dataValidation and parameter type = "list" of the library openxlsx.

I am using R 4.2.1 (the latest available to date on CRAN). I tried with both the latest stable CRAN release (4.2.5) and with the latest 'stable' development version (4.2.5.9000) available here of openxlsx. Results don't differ between the 2 versions.

BTW, in relation to this issue, a warning is still generated by this function.

I did my homework and read carefully that page and this page (or comment).

Under Ubuntu 20 and OpenOffice Calc 5.3 the function does work (but not the hyperlink), i.e. the drop-down list is displayed as expected.

The details of the results are available in the comments in the reproducible example copied below. See also this question on stackoverflow. Please comment/uncomment according to what you need to check.

# Latest stable CRAN version (4.2.5)
library("openxlsx")
#install.packages(c("Rcpp", "remotes"), dependencies = TRUE, lib = "G:/Programmation/R/R-4.2.1/library")
# Latest 'stable' development version (4.2.5.9000)
#remotes::install_github("ycphs/openxlsx", lib = "G:/Programmation/R/R-4.2.1/library")

OutputFolder <- file.path(".", "Output")
if(!dir.exists(OutputFolder))   dir.create(OutputFolder)

OutputFile <- file.path(OutputFolder, "Reprex_Openxlsx_dataValidation_list.xlsx")

Workbook4Export <- createWorkbook()

addWorksheet(wb = Workbook4Export, sheetName = "Tab_1", zoom = 80, gridLines = FALSE)

writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2016, startRow = 1, startCol = 3)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2017, startRow = 1, startCol = 4)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2018, startRow = 1, startCol = 5)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2019, startRow = 1, startCol = 6)

v <- 2016:2019

# R version 4.2.1

dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "'Tab_1'!C1:F1")    # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "2016,2017,2018,2019") # Warning + KO (+ corrupted/damaged file) on Windows 10 and Excel 2007; No warning + KO on Ubuntu 20 and OpenOffice Calc 5.3 (Err:509)
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "2016, 2017, 2018, 2019")  # Warning + KO (+ corrupted/damaged file) on Windows 10 and Excel 2007; No warning + KO on Ubuntu 20 and OpenOffice Calc 5.3 (Err:509)
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = '"2016,2017,2018,2019"')   # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = '"2016, 2017, 2018, 2019"')    # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = paste0('"', paste0(v, collapse = ","), '"'))   # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "\",2016, 2017, 2018, 2019\"") # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
#dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = '",2016, 2017, 2018, 2019"')   # Warning + KO on Windows 10 and Excel 2007; No warning + OK on Ubuntu 20 and OpenOffice Calc 5.3
writeFormula(wb = Workbook4Export, sheet = "Tab_1", startRow = 1, startCol = 2, x = '= HYPERLINK("[Reprex_Openxlsx_dataValidation_list.xlsx]Tab_1!" & CELL("address", INDEX(C1:F1, MATCH(A1, C1:F1, 0))), "Go to the selected column")')    # KO on Ubuntu 20 and OpenOffice Calc 5.3

saveWorkbook(wb = Workbook4Export, file = OutputFile, overwrite = TRUE)
JanMarvin commented 1 year ago

Hi @olivier7121 , thanks for the report. I have tested the first example and it works fine with the development branch of openxlsx. Could you please have a look if this solves your issue as well?

remotes::install_github("ycphs/openxlsx", lib = "G:/Programmation/R/R-4.2.1/library", ref = "development")

The following works for me:

library(openxlsx)

temp <- "Reprex_Openxlsx_dataValidation_list.xlsx"

Workbook4Export <- createWorkbook()

addWorksheet(wb = Workbook4Export, sheetName = "Tab_1", zoom = 80, gridLines = FALSE)

writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2016, startRow = 1, startCol = 3)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2017, startRow = 1, startCol = 4)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2018, startRow = 1, startCol = 5)
writeData(wb = Workbook4Export, sheet = "Tab_1", x = 2019, startRow = 1, startCol = 6)

v <- 2016:2019

# R version 4.2.1

dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "'Tab_1'!C1:F1")
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 2, col = 1, type = "list", value = '"2016,2017,2018,2019"')
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 3, col = 1, type = "list", value = '"2016, 2017, 2018, 2019"')
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 4, col = 1, type = "list", value = '"2016,2017,2018,2019"')
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 5, col = 1, type = "list", value = '"2016, 2017, 2018, 2019"')

writeFormula(wb = Workbook4Export, sheet = "Tab_1", startRow = 1, startCol = 2, array = TRUE,
             x = '= HYPERLINK("[Reprex_Openxlsx_dataValidation_list.xlsx]Tab_1!" & CELL("address", INDEX(C1:F1, MATCH(A1, C1:F1, 0))), "Go to the selected column")')

saveWorkbook(Workbook4Export, temp, TRUE)
openXL(temp)

But be aware that the formula will be broken unless array = TRUE is added. Otherewise Excel will add an @ sign because the formula is intended to be evaluated as "hidden" array formula. (See also https://github.com/JanMarvin/openxlsx2/issues/169)

PS: Unfortunately checks with anything but Excel do not really help. Had to learn that the hard way. Excel has a few quirks that are pretty unique and I'm not even sure if all of them are intended.

olivier7121 commented 1 year ago

Thanks for the quick reply, @JanMarvin.

I tried with the line you recommended: remotes::install_github("ycphs/openxlsx", lib = "G:/Programmation/R/R-4.2.1/library", ref = "development")

Now my R session uses the version 4.2.5.1 of openxlsx.

Then I tried each of the following lines: dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "'Tab_1'!C1:F1") dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = paste0('"', paste0(v, collapse = ","), '"')) dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = '"2016,2017,2018,2019"')

But none of them works for me.

Are you using Windows 10 and Excel 2007 too? (See my detailed comments after each line in my original reproducible example.)

BTW in my case I don't have any issue with the formula, it is not broken. But thanks for the tip.

JanMarvin commented 1 year ago

Hi @olivier7121 , all of the following work:

V <- sprintf('"%s"', paste(v, collapse = ","))
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 1, col = 1, type = "list", value = "'Tab_1'!C1:F1")
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 2, col = 1, type = "list", value = V)
dataValidation(wb = Workbook4Export, sheet = "Tab_1", rows = 3, col = 1, type = "list", value = '"2016,2017,2018,2019"')

Though data validation lists are a feature introduced with Office 2010 (it requires internally: "http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" (that's just some xml namespace to be evaluated by excel) which makes me guess Office 2010, could be 2013 as well).

[Edit:] I was curious if Office 2007 was supported at all, but according to the wikipedia page support ended in October 2017, nearly 5 years ago. Maybe it's about time to ditch that ticking time bomb of outdated and insecure software? I'm not an advocate for Office, but running such old software kinda frightens me.

olivier7121 commented 1 year ago

It doesn't work on my computer with Windows 10 and Excel 2007.

Your link is dead.

No, data validation lists already existed in Excel 2007 (and most likely already before 2007). I am using them (outside R of course) without any problem on my computer. Please see this page: https://support.microsoft.com/en-us/office/create-a-drop-down-list-7693307a-59ef-400a-b769-c5402dce407b See also: https://support.microsoft.com/en-us/office/add-or-remove-items-from-a-drop-down-list-0b26d3d1-3c4d-41f5-adb4-0addb82e8d2c

Maybe it's about time to ditch that ticking time bomb of outdated and insecure software? I'm not an advocate for Office, but running such old software kinda frightens me.

Yes, I will probably update it at some point. But there is no danger as I use it offline. What you mention is exaggerated.

JanMarvin commented 1 year ago

Well, you could try to change that in code. After all it's open source. Maybe using something like ".../2006/6/main" is sufficient.

I don't use not have access to this ancient office version (there's been 2010, 2013, 2017, guess 2020 and MS365 since then). And cannot help you with that. Works with MS365 (you could use that in a web browser free of charge).

JanMarvin commented 1 year ago

You'd have to make changes here:

https://github.com/ycphs/openxlsx/blob/e7b699eb8acd949616e15557e373e8aea130cb44/R/worksheet_class.R#L228-L229

And here:

https://github.com/ycphs/openxlsx/blob/e7b699eb8acd949616e15557e373e8aea130cb44/R/WorkbookClass.R#L2567-L2581

Basically: replace all x14 with x(?) and change the schema in the first section to .../2006/6/main (?). You could see what is required, when creating an xlsx with a data validation list from your Excel 2007. Afterwards you can open the xlsx file with a zip tool and should be able to find the required values are at the bottom of xl/worksheets/sheet1.xml.

Like I've said, I cannot do that, since I do not have access to Excel 2007 and I do not even know if the approach is identical to the current approach.

olivier7121 commented 1 year ago

Hi @JanMarvin.

Though data validation lists are a feature introduced with Office 2010 (it requires internally: "http://schemas.microsoft.com/office/spreadsheetml/2009/9/main" (that's just some xml namespace to be evaluated by excel) which makes me guess Office 2010, could be 2013 as well).

I didn't understand what this link in your message was for, now I understand a bit (not much) better.

Well, you could try to change that in code. After all it's open source. Maybe using something like ".../2006/6/main" is sufficient.

This message was also quite cryptic to me.

Works with MS365 (you could use that in a web browser free of charge).

Thanks for the tip. It indeed works. That is really strange that it doesn't work with Office 2007...

You could see what is required, when creating an xlsx with a data validation list from your Excel 2007. Afterwards you can open the xlsx file with a zip tool and should be able to find the required values are at the bottom of xl/worksheets/sheet1.xml.

The bottom of xl/worksheets/sheet1.xml of my Excel 2007 file looks like this:

-<extLst xmlns:x14="http://schemas.microsoft.com/office/spreadsheetml/2009/9/main">
-<ext uri="{CCE6A557-97BC-4b89-ADB6-D9C93CAAB3DF}">
-<x14:dataValidations count="1" xmlns:xm="http://schemas.microsoft.com/office/excel/2006/main">
-<x14:dataValidation showErrorMessage="1" showInputMessage="1" allowBlank="1" type="list">
-<x14:formula1>
<xm:f>"2016,2017,2018,2019"</xm:f>
</x14:formula1>
<xm:sqref>A1</xm:sqref>
</x14:dataValidation>
</x14:dataValidations>
</ext>
</extLst>

So at first sight I would say it's not so much different than what you posted in your previous message, is it?

JanMarvin commented 1 year ago

Hi @olivier7121 ,

haha, sorry for confusing you (it was a short night and I typed my response quickly after breakfast). The snipped you have added looks indeed quite similar to what I would expect a file would look like if written by us.

I have two questions for you

And I cannot promise anything. Working with the open xml file format is a bit like a puzzle. Usually you try something, open the file in Excel and see if it works. In this case its a lot harder, because it already works on my end. Therefore even if I spot something, I can not test it and you'd have to.

olivier7121 commented 1 year ago

Yes, I noticed from your posting times that your night had probably been quite short^^

Could you by any chance copy the content of the header of sheet1.xml or the entire file into a gist file over at https://gist.github.com?

Done: https://gist.github.com/olivier7121/aa8e2ff80b37e37630fe267439e3c221

Is there something specific in the debug message from Excel? Does it mention that the dataValidation part was broken?

Nothing, Excel (2007) is completely silent.

And I cannot promise anything. Working with the open xml file format is a bit like a puzzle. Usually you try something, open the file in Excel and see if it works. In this case its a lot harder, because it already works on my end. Therefore even if I spot something, I can not test it and you'd have to.

Don't worry, I understand. To be very honest I found this evening on an external drive a more recent version of Office (2013) that I completely forgot. I installed it and everything works fine now. But I still have Office 2017 installed on another machine so I will be able to perform tests.

JanMarvin commented 1 year ago

Hi @olivier7121 , sorry to bother you again. It looks like the sheet1.xml file you've uploaded is the one created by openxlsx. Though what I was asking for was a matching one created by Excel 2007 that I can use to inspect for differences between Excel 2007 and openxlsx. Could you provide this from a file you have created with Excel 2007 in a similar fashion:

olivier7121 commented 1 year ago

Hi @JanMarvin. Oh yes, you're right, my mistake, apologies for that.

Here you are: https://gist.github.com/olivier7121/d4804f52131c6963aa159a197fca069e

[EDIT]: Just updated the gist file. Now it should be OK.

JanMarvin commented 1 year ago

Thanks! Your dataValidatations is not in ext. Quite interesting. Will check how this differs from our current approach. Maybe just a visual thing, but afaik when I tested this in openxlsx2 it did not work at all.

JanMarvin commented 1 year ago

This ext section has caused me some headaches, so I'd be glad if we could simply get rid of it. Follow up question regarding the Excel behavior you're seeing. Is there a warning that the file has newer features or does it open silently and the list selection is simply not available or doesn't it open at all?

[edit] What does this exactly mean 😁

Warning + KO on Windows 10 and Excel 2007

olivier7121 commented 1 year ago

Follow up question regarding the Excel behavior you're seeing. Is there a warning that the file has newer features or does it open silently and the list selection is simply not available or doesn't it open at all?

No warning, it opens silently and the list selection is simply not available.

What does this exactly mean 😁

Warning + KO on Windows 10 and Excel 2007

Warning in R and no list selection available in Excel 2007 on Windows 10.

github-actions[bot] commented 11 months ago

This issue is stale because it has been open 365 days with no activity. Remove stale label or comment or this will be closed in 7 days.

github-actions[bot] commented 11 months ago

This issue was closed because it has been stalled for 7 days with no activity.