Closed k5cents closed 1 year ago
Apologies, I see #65 now. I tried installing v2.0.4 per a comment there and get the same problem.
How many files are there in this ZIP file? I'm appending files to an existing zip file and I notice that it won't open once the number of files in the archive reaches 65535, which might be the actual issue.
@awd97 Nowhere near that many. Only 25, but each is about 1GB in size.
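For anyone wanting to check an archive against the limits mentioned above, here is a rough sketch. It assumes the classic (non-ZIP64) format limits of a 16-bit entry count and 32-bit sizes, and it says nothing about what the zip package itself does internally. `exceeds_classic_zip_limits()` is a made-up helper name, not part of any package.

```r
# Hypothetical helper: does an archive exceed the classic ZIP field limits?
# Classic ZIP stores the entry count in 16 bits and sizes/offsets in 32 bits;
# the all-ones values are reserved to signal that ZIP64 records are in use.
exceeds_classic_zip_limits <- function(n_entries, entry_sizes, archive_size) {
  max_entries <- 65535        # 0xFFFF
  max_bytes   <- 4294967295   # 0xFFFFFFFF
  n_entries >= max_entries ||
    any(entry_sizes >= max_bytes) ||
    archive_size >= max_bytes
}

# Possible usage, assuming the listing itself succeeds for your file:
# listing <- utils::unzip("archive.zip", list = TRUE)
# exceeds_classic_zip_limits(nrow(listing), listing$Length,
#                            file.size("archive.zip"))
```

By this check, 25 entries of about 1GB each in a ~3GB archive stay under the classic limits, so the entry count alone would not explain the failure in that case.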
I'm also having the same issue and inside my zip there is only one file:
library(zip)
file_download_data <- tempfile()
#Download dataset----
site.covid <- paste0(
"http://datosabiertos.salud.gob.mx/gobmx/salud",
"/datos_abiertos/datos_abiertos_covid19.zip"
)
download.file(site.covid, file_download_data, method = "curl")
zip::unzip(file_download_data)
This results in
Error in zip::unzip(file_download_data) :
zip error: `Cannot extract entry `220411COVID19MEXICO.csv` from archive `/tmp/RtmpKnsWCA/file133dc5188912a`` in file `zip.c:219`
Doing
system2("unzip", args = c("-o",file_download_data))
works fine.
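As a stopgap, the two calls above can be combined so that the system `unzip` is only used when `zip::unzip()` fails. This is just a sketch: `unzip_with_fallback()` is a hypothetical helper name, and it assumes an `unzip` binary is on the PATH.

```r
# Hypothetical helper: try zip::unzip() first, then fall back to the
# system "unzip" program (assumed to be installed and on the PATH).
unzip_with_fallback <- function(zipfile, exdir = ".") {
  tryCatch(
    {
      zip::unzip(zipfile, exdir = exdir)
      "zip::unzip"
    },
    error = function(e) {
      unzip_bin <- unname(Sys.which("unzip"))
      if (!nzchar(unzip_bin)) {
        stop("No system 'unzip' found; original error: ", conditionMessage(e))
      }
      # -o overwrites existing files, -d sets the output directory
      status <- system2(
        unzip_bin,
        args = c("-o", shQuote(zipfile), "-d", shQuote(exdir))
      )
      if (status != 0) {
        stop("System unzip failed with exit status ", status)
      }
      "system unzip"
    }
  )
}
```

The return value only names which method ended up doing the extraction, which can help when reporting which archives trigger the error.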
Attaching my session:
> sessionInfo()
R version 4.1.3 (2022-03-10)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Ubuntu 20.04.4 LTS
Matrix products: default
BLAS: /usr/lib/x86_64-linux-gnu/blas/libblas.so.3.9.0
LAPACK: /usr/lib/x86_64-linux-gnu/lapack/liblapack.so.3.9.0
locale:
[1] LC_CTYPE=en_US.UTF-8 LC_NUMERIC=C LC_TIME=es_MX.UTF-8
[4] LC_COLLATE=en_US.UTF-8 LC_MONETARY=es_MX.UTF-8 LC_MESSAGES=en_US.UTF-8
[7] LC_PAPER=es_MX.UTF-8 LC_NAME=C LC_ADDRESS=C
[10] LC_TELEPHONE=C LC_MEASUREMENT=es_MX.UTF-8 LC_IDENTIFICATION=C
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zip_2.2.0.9000 RMariaDB_1.2.1
loaded via a namespace (and not attached):
[1] bit_4.0.4 compiler_4.1.3 ellipsis_0.3.2 cli_3.2.0 hms_1.1.1
[6] DBI_1.1.2 tools_4.1.3 Rcpp_1.0.8.3 bit64_4.0.5 vctrs_0.4.0
[11] lifecycle_1.0.1 pkgconfig_2.0.3 rlang_1.0.2
I'm having a similar problem extracting zip files like https://climatedata-beta.environment.nsw.gov.au/download-collection/ae2c99ac-5ef1-44ef-abf9-10d63082f739 (about 3.93 GB to download, with 569 NetCDF .nc files expected inside):
- unzip in bash extracts it fine
- utils::unzip(list = TRUE) gives the full contents, but extracting only extracts about half the files, with the warning "zip file is corrupt"
- zip::unzip() gives "Cannot open zip file for reading in file `zip.c:140`"
- zip::zip_list() gives "Cannot open zip file"
This is on an M1 Mac:
> sessionInfo()
R version 4.1.2 (2021-11-01)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Monterey 12.6.1
Matrix products: default
BLAS: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRblas.0.dylib
LAPACK: /Library/Frameworks/R.framework/Versions/4.1-arm64/Resources/lib/libRlapack.dylib
locale:
[1] en_AU.UTF-8/en_AU.UTF-8/en_AU.UTF-8/C/en_AU.UTF-8/en_AU.UTF-8
attached base packages:
[1] stats graphics grDevices utils datasets methods base
other attached packages:
[1] zip_2.2.2
loaded via a namespace (and not attached):
[1] compiler_4.1.2
I should also mention that getOption("unzip") reports "/usr/bin/unzip", which I also get from which unzip. I'm not sure why I get different results with unzipping in bash versus unzipping with utils::unzip.
EDIT: my mistake! utils::unzip defaults to unzip = "internal", not getOption("unzip"). Using utils::unzip(file, unzip = getOption("unzip")) works for me!
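For reference, that workaround can be wrapped up roughly as follows. `unzip_external()` is a hypothetical wrapper name, and the sketch assumes getOption("unzip") points at a real external program on your machine (on some setups it may be empty or "internal").

```r
# Hypothetical wrapper: make utils::unzip() use the external program from
# getOption("unzip") instead of its default unzip = "internal" method.
unzip_external <- function(zipfile, exdir = ".", unzip = getOption("unzip")) {
  if (is.null(unzip) || !nzchar(unzip) || identical(unzip, "internal")) {
    stop("No external unzip program is configured in getOption(\"unzip\")")
  }
  utils::unzip(zipfile, exdir = exdir, unzip = unzip)
}

# Possible usage (the path is a placeholder):
# unzip_external("path/to/archive.zip", exdir = "extracted")
```

Stopping early when no external program is configured avoids silently going back to the internal method that produced the "zip file is corrupt" warning.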
Fixed by #79.
I am unable to get this large ZIP file to open so I can list the files inside. I can list with utils::unzip(list = TRUE) when using the /usr/bin/unzip internal method from Ubuntu 20.04. All the files in the ~3GB archive should be around 1GB for a total of ~24GB.
I also have a problem when extracting the files using either function, although using system2() to invoke the unzip command manually seems to work for me.