tidyverse / readxl

Read excel files (.xls and .xlsx) into R 🖇
https://readxl.tidyverse.org
Other
730 stars 195 forks source link

XLS file doesn't open with v1.1.0.9000, does with v1.0.0; example included #509

Closed roualdes closed 5 years ago

roualdes commented 6 years ago

The following XLS file does not open with the latest version of readxl, but does open with v1.0.0.

URL <- "https://www.ers.usda.gov/webdocs/DataFiles/48731/DataDownload.xls"
dest <- "~/Desktop/data.xls"
download.file(URL, dest, mode="wb")
devtools::install_github("tidyverse/readxl") # v1.1.0.90000
df <- readxl::read_xls(dest, sheet="LOCAL") # doesn't work 

devtools::install_version("readxl", version = "1.0.0") # and then restart R
df <- readxl::read_xls(dest, sheet="LOCAL") # does work

I realize these are not fun issues to track down. Thanks for looking into this.

sessionInfo()
R version 3.5.1 (2018-07-02)
Platform: x86_64-apple-darwin15.6.0 (64-bit)
Running under: macOS Sierra 10.12.6

Matrix products: default
BLAS: /System/Library/Frameworks/Accelerate.framework/Versions/A/Frameworks/vecLib.framework/Versions/A/libBLAS.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/3.5/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.19      digest_0.6.16     crayon_1.3.4     
 [4] withr_2.1.2       cellranger_1.1.0  R6_2.2.2         
 [7] git2r_0.23.0      pillar_1.3.0      httr_1.3.1       
[10] rlang_0.2.2       curl_3.2          readxl_1.1.0.9000
[13] rstudioapi_0.7    devtools_1.13.6   tools_3.5.1      
[16] yaml_2.2.0        compiler_3.5.1    memoise_1.1.0    
[19] knitr_1.20        tibble_1.4.2   
jennybc commented 5 years ago

Appears to be fixed now (I've updated the embedded xls).

library(readxl)

URL <- "https://www.ers.usda.gov/webdocs/DataFiles/48731/DataDownload.xls"
dest <- "issue-509.xls"
download.file(URL, dest, mode = "wb")
read_excel(dest, sheet = "LOCAL")
#> # A tibble: 3,143 x 100
#>    FIPS  State County DIRSALES_FARMS07 DIRSALES_FARMS12 PCH_DIRSALES_FA…
#>    <chr> <chr> <chr>             <dbl>            <dbl>            <dbl>
#>  1 01001 AL    Autau…               25               51           104   
#>  2 01003 AL    Baldw…               80              103            28.7 
#>  3 01005 AL    Barbo…               18               13           -27.8 
#>  4 01007 AL    Bibb                 12               13             8.33
#>  5 01009 AL    Blount               84               88             4.76
#>  6 01011 AL    Bullo…               13               12            -7.69
#>  7 01013 AL    Butler               11               31           182.  
#>  8 01015 AL    Calho…               35               50            42.9 
#>  9 01017 AL    Chamb…               15               22            46.7 
#> 10 01019 AL    Chero…               23               14           -39.1 
#> # … with 3,133 more rows, and 94 more variables: PCT_LOCLFARM07 <dbl>,
#> #   PCT_LOCLFARM12 <dbl>, PCT_LOCLSALE07 <dbl>, PCT_LOCLSALE12 <dbl>,
#> #   DIRSALES07 <dbl>, DIRSALES12 <dbl>, PCH_DIRSALES_07_12 <dbl>,
#> #   PC_DIRSALES07 <dbl>, PC_DIRSALES12 <dbl>, PCH_PC_DIRSALES_07_12 <dbl>,
#> #   FMRKT09 <dbl>, FMRKT16 <dbl>, PCH_FMRKT_09_16 <dbl>, FMRKTPTH09 <dbl>,
#> #   FMRKTPTH16 <dbl>, PCH_FMRKTPTH_09_16 <dbl>, FMRKT_SNAP16 <dbl>,
#> #   PCT_FMRKT_SNAP16 <dbl>, FMRKT_WIC16 <dbl>, PCT_FMRKT_WIC16 <dbl>,
#> #   FMRKT_WICCASH16 <dbl>, PCT_FMRKT_WICCASH16 <dbl>, FMRKT_SFMNP16 <dbl>,
#> #   PCT_FMRKT_SFMNP16 <dbl>, FMRKT_CREDIT16 <dbl>,
#> #   PCT_FMRKT_CREDIT16 <dbl>, FMRKT_FRVEG16 <dbl>,
#> #   PCT_FMRKT_FRVEG16 <dbl>, FMRKT_ANMLPROD16 <dbl>,
#> #   PCT_FMRKT_ANMLPROD16 <dbl>, FMRKT_BAKED16 <dbl>,
#> #   PCT_FMRKT_BAKED16 <dbl>, FMRKT_OTHERFOOD16 <dbl>,
#> #   PCT_FMRKT_OTHERFOOD16 <dbl>, VEG_FARMS07 <dbl>, VEG_FARMS12 <dbl>,
#> #   PCH_VEG_FARMS_07_12 <dbl>, VEG_ACRES07 <dbl>, VEG_ACRES12 <dbl>,
#> #   PCH_VEG_ACRES_07_12 <dbl>, VEG_ACRESPTH07 <dbl>, VEG_ACRESPTH12 <dbl>,
#> #   PCH_VEG_ACRESPTH_07_12 <dbl>, FRESHVEG_FARMS07 <dbl>,
#> #   FRESHVEG_FARMS12 <dbl>, PCH_FRESHVEG_FARMS_07_12 <dbl>,
#> #   FRESHVEG_ACRES07 <dbl>, FRESHVEG_ACRES12 <dbl>,
#> #   PCH_FRESHVEG_ACRES_07_12 <dbl>, FRESHVEG_ACRESPTH07 <dbl>,
#> #   FRESHVEG_ACRESPTH12 <dbl>, PCH_FRESHVEG_ACRESPTH_07_12 <dbl>,
#> #   ORCHARD_FARMS07 <dbl>, ORCHARD_FARMS12 <dbl>,
#> #   PCH_ORCHARD_FARMS_07_12 <dbl>, ORCHARD_ACRES07 <dbl>,
#> #   ORCHARD_ACRES12 <dbl>, PCH_ORCHARD_ACRES_07_12 <dbl>,
#> #   ORCHARD_ACRESPTH07 <dbl>, ORCHARD_ACRESPTH12 <dbl>,
#> #   PCH_ORCHARD_ACRESPTH_07_12 <dbl>, BERRY_FARMS07 <dbl>,
#> #   BERRY_FARMS12 <dbl>, PCH_BERRY_FARMS_07_12 <dbl>, BERRY_ACRES07 <dbl>,
#> #   BERRY_ACRES12 <dbl>, PCH_BERRY_ACRES_07_12 <dbl>,
#> #   BERRY_ACRESPTH07 <dbl>, BERRY_ACRESPTH12 <dbl>,
#> #   PCH_BERRY_ACRESPTH_07_12 <dbl>, SLHOUSE07 <dbl>, SLHOUSE12 <dbl>,
#> #   PCH_SLHOUSE_07_12 <dbl>, GHVEG_FARMS07 <dbl>, GHVEG_FARMS12 <dbl>,
#> #   PCH_GHVEG_FARMS_07_12 <dbl>, GHVEG_SQFT07 <dbl>, GHVEG_SQFT12 <dbl>,
#> #   PCH_GHVEG_SQFT_07_12 <dbl>, GHVEG_SQFTPTH07 <dbl>,
#> #   GHVEG_SQFTPTH12 <dbl>, PCH_GHVEG_SQFTPTH_07_12 <dbl>, FOODHUB16 <dbl>,
#> #   CSA07 <dbl>, CSA12 <dbl>, PCH_CSA_07_12 <dbl>, AGRITRSM_OPS07 <dbl>,
#> #   AGRITRSM_OPS12 <dbl>, PCH_AGRITRSM_OPS_07_12 <dbl>,
#> #   AGRITRSM_RCT07 <dbl>, AGRITRSM_RCT12 <dbl>,
#> #   PCH_AGRITRSM_RCT_07_12 <dbl>, FARM_TO_SCHOOL09 <dbl>,
#> #   FARM_TO_SCHOOL13 <dbl>

Created on 2018-12-13 by the reprex package (v0.2.1.9000)

roualdes commented 5 years ago

Big up @jennybc for maintaining a great R package and Chico State's Fall 2018 MATH 385: Introduction to Data Science for discovering this issue. Thanks to you both!