tidyverse / readxl

Read excel files (.xls and .xlsx) into R 🖇
https://readxl.tidyverse.org

On Windows, readxl::read_excel throws an Invalid Unicode Point 56573 #759

Open hidekoji opened 2 weeks ago

hidekoji commented 2 weeks ago

I tried to import an Excel file that contains 1689493557000_xDCfD_yQCI1-dSk9PX1V2nzZDWgA as a document ID, but it failed with the error "invalid Unicode point 56573".

    # Required libraries
    library(httr)
    library(readxl)

    # Define the URL for the Excel file and local destination path
    url <- "https://www.dropbox.com/scl/fi/el8ftfnopwvsaruo8lc3w/load-issue-original-03.xlsx?rlkey=k7sajzb2ek9vmfdthu9oufu6v&dl=1"
    destfile <- tempfile(fileext = ".xlsx")  # Temporary file to save the Excel file

    # Download the Excel file
    GET(url, write_disk(destfile, overwrite = TRUE))
#> Response [https://uc4ba5df6123fe98dd496336cda2.dl.dropboxusercontent.com/cd/0/get/CcGdNLWKfEjwI0ZpCQHjKawvx8cH64OrkGPgiuAjWZmgi1bWmmr6y8BLghzUuaPiaevhUV_Y0EWQQ4v4WKGD3BWYFb4noNu5-21xdoRW9VRnPmZMbZWOXQM0T4_bcqw8fZCOjs4vaqByT2HE6cjOrQya/file?dl=1]
#>   Date: 2024-10-08 17:05
#>   Status: 200
#>   Content-Type: application/binary
#>   Size: 8.85 kB
#> <ON DISK>  C:\Users\hidek\AppData\Local\Temp\RtmpQbaLxo\file3b0cf75e5e.xlsx

    # Read the Excel file
    excel_data <- read_excel(destfile)
#> Error in read_fun(path = path, sheet_i = sheet, limits = limits, shim = shim, : invalid Unicode point 56573

    # Display the first few rows of the data
    head(excel_data)
#> Error in eval(expr, envir, enclos): object 'excel_data' not found

Created on 2024-10-08 with reprex v2.1.1
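
The number in the error message may be worth decoding. The following is a minimal base-R sketch that checks whether 56573 falls in the UTF-16 surrogate range, and whether the document ID contains a substring shaped like the `_xHHHH_` escape that the OOXML format uses for special characters in strings. That this substring is what triggers the error is an assumption on my part and is not confirmed anywhere in this report; the variable names are illustrative only.

```r
# Code point reported in the error message
code_point <- 56573L

# Hexadecimal form of the code point
hex <- sprintf("%04X", code_point)
hex
#> [1] "DCFD"

# Is it inside the UTF-16 surrogate range U+D800..U+DFFF?
# Surrogates are not valid as standalone Unicode characters.
is_surrogate <- code_point >= 0xD800 && code_point <= 0xDFFF
is_surrogate
#> [1] TRUE

# Document ID from the report
doc_id <- "1689493557000_xDCfD_yQCI1-dSk9PX1V2nzZDWgA"

# Look for a substring shaped like an OOXML _xHHHH_ escape
# (assumption: this is how the ID ends up being read as a code point)
m <- regmatches(doc_id, regexpr("_x[0-9A-Fa-f]{4}_", doc_id))
m
#> [1] "_xDCfD_"

# Parse the four hex digits and compare with the reported code point
embedded <- strtoi(substr(m, 3, 6), base = 16L)
embedded == code_point
#> [1] TRUE
```

If this reading is right, the ID is plain ASCII text that merely looks like an escape sequence, and the hex digits happen to name an unpaired low surrogate.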

hidekoji commented 2 weeks ago

    sessionInfo()
#> R version 4.4.0 (2024-04-24 ucrt)
#> Platform: x86_64-w64-mingw32/x64
#> Running under: Windows 11 x64 (build 22631)
#> 
#> Matrix products: default
#> 
#> 
#> locale:
#> [1] LC_COLLATE=Japanese_Japan.utf8  LC_CTYPE=Japanese_Japan.utf8   
#> [3] LC_MONETARY=Japanese_Japan.utf8 LC_NUMERIC=C                   
#> [5] LC_TIME=Japanese_Japan.utf8    
#> 
#> time zone: America/Los_Angeles
#> tzcode source: internal
#> 
#> attached base packages:
#> [1] stats     graphics  grDevices utils     datasets  methods   base     
#> 
#> loaded via a namespace (and not attached):
#>  [1] digest_0.6.35     fastmap_1.1.1     xfun_0.43         glue_1.7.0       
#>  [5] knitr_1.46        htmltools_0.5.8.1 rmarkdown_2.26    lifecycle_1.0.4  
#>  [9] cli_3.6.2         reprex_2.1.1      withr_3.0.0       compiler_4.4.0   
#> [13] rstudioapi_0.16.0 tools_4.4.0       evaluate_0.23     yaml_2.3.8       
#> [17] rlang_1.1.3       fs_1.6.4

Created on 2024-10-08 with reprex v2.1.1