awalker89 / openxlsx

R package for .xlsx file reading and writing.

some headers are not read correctly corrupting the db without error #384

Open ugobob99 opened 6 years ago

ugobob99 commented 6 years ago

Expected Behavior

The first row of a sheet is read as the column names.

Actual Behavior

With some files, the column names in the first row of a table are not read correctly: the second row is used as the column names instead, even though it actually contains data. No warning or error is raised.

So far this has happened with xlsx files exported from SAS; these files have a newline in the first row (the column headers). If I open the SAS-produced xlsx file in Excel and save it without any change, openxlsx then reads the first row (which still contains the newline character) and the whole sheet correctly.
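Since re-saving the file in Excel changes the outcome while the header text stays the same, one way to narrow this down is to compare how the header cells are stored in the raw XML of the two files. The following is a minimal sketch in base R, not a confirmed diagnosis: the helper names `read_sheet_xml` and `first_row_cell_types` are hypothetical, the `xl/worksheets/` location is the usual layout but an assumption here, and the regex parsing is only rough. It lists the `t` (cell type) attribute of each cell in the first row, for example `s` for shared strings or `inlineStr` for inline strings.

```r
# Read the XML of the first worksheet inside an .xlsx file (a zip archive).
read_sheet_xml <- function(path) {
  files <- unzip(path, list = TRUE)$Name
  # Assumption: worksheets live under xl/worksheets/, the usual layout.
  sheet <- grep("^xl/worksheets/.*\\.xml$", files, value = TRUE)[1]
  con <- unz(path, sheet)
  on.exit(close(con))
  # Collapse to one string so a newline inside a header cell cannot split a tag.
  paste(readLines(con, warn = FALSE), collapse = "")
}

# Return the t="..." attribute of every cell in the first <row> element.
# Cells that carry no t attribute are reported as NA.
first_row_cell_types <- function(xml) {
  row <- regmatches(xml, regexpr("<row[ >].*?</row>", xml, perl = TRUE))
  if (length(row) == 0) return(character(0))
  cells <- regmatches(row, gregexpr("<c [^>]*>", row, perl = TRUE))[[1]]
  types <- rep(NA_character_, length(cells))
  has_t <- grepl(' t="', cells, fixed = TRUE)
  types[has_t] <- sub('.* t="([^"]*)".*', "\\1", cells[has_t])
  types
}

# Usage: run on the SAS export and on the copy re-saved from Excel.
path <- "TEST_export_da_sas.xlsx"
if (file.exists(path)) {
  print(first_row_cell_types(read_sheet_xml(path)))
}
```

If the two files report different cell types for the header row, that difference is a candidate explanation worth including in the report; if they match, the cause lies elsewhere.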

Steps to Reproduce the Problem

(please attach an example xlsx file if possible)

  1. Run:

         library(openxlsx)
         d <- read.xlsx("TEST_export_da_sas.xlsx")
         head(d)
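To see whether openxlsx drops the header row entirely or only fails to treat it as names, the same file can also be read with `colNames = FALSE`, which returns every row as data. This is a hedged diagnostic sketch rather than a fix; `header_check` is a hypothetical helper name, and the only openxlsx call it relies on is `read.xlsx` with its `colNames` argument.

```r
# Compare the column names openxlsx infers with the first row it returns
# when told not to treat any row as a header.
header_check <- function(path) {
  with_header <- openxlsx::read.xlsx(path)                  # colNames = TRUE (default)
  raw_rows <- openxlsx::read.xlsx(path, colNames = FALSE)   # all rows as data
  list(
    inferred_names = names(with_header),
    first_raw_row = raw_rows[1, , drop = FALSE]
  )
}

path <- "TEST_export_da_sas.xlsx"
if (requireNamespace("openxlsx", quietly = TRUE) && file.exists(path)) {
  print(header_check(path))
}
```

If `first_raw_row` already shows values from the second spreadsheet row, the header row is not being returned at all, which matches the behavior described above.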

sessionInfo()

R version 3.3.2 (2016-10-31)
Platform: x86_64-w64-mingw32/x64 (64-bit)
Running under: Windows >= 8 x64 (build 9200)

locale:
[1] LC_COLLATE=Italian_Italy.1252 LC_CTYPE=Italian_Italy.1252 LC_MONETARY=Italian_Italy.1252 LC_NUMERIC=C
[5] LC_TIME=Italian_Italy.1252

attached base packages:
[1] stats graphics grDevices utils methods base

other attached packages:
[1] openxlsx_4.0.17 sas7bdat_0.5 numDeriv_2016.8-1 lhs_0.14 corpcor_1.6.9 mvtnorm_1.0-6 SuppDists_1.1-9.4
[8] MASS_7.3-45 ismev_1.41 mgcv_1.8-15 nlme_3.1-128 evd_2.3-2 gtools_3.5.0

loaded via a namespace (and not attached):
[1] Rcpp_0.12.13 lattice_0.20-34 grid_3.3.2 Matrix_1.2-7.1 tools_3.3.2 yaml_2.1.14

ugobob99 commented 6 years ago

Sorry, I forgot to attach the example file:

TEST_export_da_sas.xlsx

HuoJnx commented 4 years ago

I ran into the same problem, but my file was created by Pandas in Python rather than SAS. The file is read correctly after I do the same thing you did (open it in Excel and save it without changes). It is so strange.