nacnudus / tidyxl

Read untidy Excel files in R https://nacnudus.github.io/tidyxl/

Sheet not found by its index in $data but found in $formats #15

Closed: stla closed this issue 7 years ago

stla commented 7 years ago

The weird file: https://www.dropbox.com/s/c1abqpk1ozf6rf1/GBSIextract.xlsx?dl=0

> x <- tidy_xlsx("GBSIextract.xlsx", sheets=1)
> # there's nothing in x$data:
> names(x$data)
character(0)
> names(x$formats)
[1] "local" "style"
> length(x$formats$local$numFmt)
[1] 27

The import works if we provide the sheet name:

> x <- tidy_xlsx("GBSIextract.xlsx", sheets="GBS-Ia")
> names(x$data)
[1] "GBS-Ia"
> dim(x$data$`GBS-Ia`)
[1] 81 20

stla commented 7 years ago

I have another strange case. I don't know whether it is related to this issue, but it may be.

Sheet4 is in the second position. This can be seen in Excel, or like this:

> readxl::excel_sheets("++NoteVariab.xlsx")
[1] "Sheet1"  "Sheet4"  "Sheet2"  "Sheet3"  "s a 17"  "s a 0 9"

However:

> txl <- tidyxl::tidy_xlsx("++NoteVariab.xlsx", sheets=2)
> names(txl$data)
[1] "Sheet2"

I will extract a piece of this file before sending it to you, because it is huge.
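
A possible workaround until this is fixed, sketched under assumptions: since passing a sheet name worked for the first file, the index could be translated to a name first, using the order that `readxl::excel_sheets()` reports. The helper `resolve_sheet_name()` below is hypothetical (it is not part of tidyxl or readxl), and whether the name lookup also behaves correctly for this second file has not been verified.

```r
# Hypothetical helper (not part of tidyxl): turn a positional sheet index
# into the sheet name at that position in the workbook's display order.
resolve_sheet_name <- function(sheet_names, sheet) {
  if (is.character(sheet)) {
    if (!sheet %in% sheet_names) stop("No sheet named '", sheet, "'")
    return(sheet)
  }
  if (sheet < 1 || sheet > length(sheet_names)) {
    stop("Sheet index ", sheet, " is out of range")
  }
  sheet_names[[sheet]]
}

# Sheet order as reported by readxl::excel_sheets() above
sheet_names <- c("Sheet1", "Sheet4", "Sheet2", "Sheet3", "s a 17", "s a 0 9")
resolve_sheet_name(sheet_names, 2)  # "Sheet4"

# With the real file, the resolved name could then be passed to tidy_xlsx().
# This part only runs if the file and both packages are available.
path <- "++NoteVariab.xlsx"
if (file.exists(path) &&
    requireNamespace("readxl", quietly = TRUE) &&
    requireNamespace("tidyxl", quietly = TRUE)) {
  target <- resolve_sheet_name(readxl::excel_sheets(path), 2)
  txl <- tidyxl::tidy_xlsx(path, sheets = target)
  names(txl$data)
}
```

Resolving the index outside of `tidy_xlsx()` keeps the lookup tied to the order shown in Excel, rather than to whatever internal numbering the index is currently matched against.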

nacnudus commented 7 years ago

@stla May I add GBSIextract.xlsx to this repository for testing? Otherwise I could hack together a test file.

stla commented 7 years ago

Hmmm... at least change the column headers, please. That should be enough to keep anyone from identifying the source of the data, which I would prefer.

nacnudus commented 7 years ago

I created a test file from scratch, to be safe.