nacnudus / tidyxl

Read untidy Excel files in R https://nacnudus.github.io/tidyxl/
https://nacnudus.github.io/tidyxl/
Other
246 stars 20 forks source link

dataValidations not detected in Office 365 files #46

Closed davidski closed 5 years ago

davidski commented 5 years ago

xlsx_validation() doesn't seem to pick up data validation rules when the list of allowed values is on a different sheet. This (test_validation.xlsx) test file generates zero detected data validation rules, even though one is present.

Tested syntax

tidyxl::xlsx_validation(here::here("data/test_validation.xlsx"))

and

tidyxl::xlsx_validation(here::here("data/test_validation.xlsx"), sheets = c(1, 2))
nacnudus commented 5 years ago

Thanks for the report and the reproducible example. I can confirm tidyxl doesn't detect the rule. Strangely Excel 2016 for PC doesn't either, but LibreOffice Calc does.

The XML of the rule is unusual. What software was used to create the file?

davidski commented 5 years ago

The test file was generated with Excel for Mac (Office 365 - V16.16.4). I have an internal file using a similar style of validation rule created with Excel for Windows (Office 365) that generates the same problem. 😦

Thanks for the quick response and the handy package!

nacnudus commented 5 years ago

Ah, thanks for checking both Mac and Windows. Can I ask you one more favour, please could you send a file with a formula in one cell? The bug is to do with the way the formula (the range of the list) is expressed in XML, so I'd like to check whether it affects normal formulas in cells.

davidski commented 5 years ago

Sure thing! Here's a basic one.

test_validation_w_formula.xlsx

nacnudus commented 5 years ago

Similar bug to https://github.com/PHPOffice/PhpSpreadsheet/issues/388. Need to handle x14 and xm extensions to the spec.

nacnudus commented 5 years ago

@davidski Please try devtools::install_github(nacnudus/tidyxl), and reopen this issue if it doesn't work. Thanks for the helpful report and files.

I'm curious what you use the xlsx_validation() for in case I could support the workflow better somehow. It's a pretty niche function.

davidski commented 5 years ago

Thanks! An install off of master seems to be working!

The use case on this is a bit odd - I have some Excel workbooks that are used for structured data entry (uploading) into a commercial closed source tool. The workbooks have some validation applied to ensure the data is put into the correct format, but the rules themselves are not well documented. I am looking to use tidyxl to pull out the validation rules and reverse engineer them, allowing these uploads to be created with other tools (like Shiny, for instance).

Appreciate the fix on this!

nacnudus commented 5 years ago

That sounds niche enough! Thanks for the explanation, I hope it works out.