Open franknarf1 opened 6 years ago
Since it took me ages to get this to work (Windows-only, requires 7-Zip), here's the multi-file version:
library(data.table) # fread
library(magrittr)   # %>%

fread_targzs = function(fp, zp = "C:/Program Files/7-Zip/7z.exe", unzip_dir = dirname(fp), silent = FALSE){
  # only tested on windows (shell() exists only there)
  # fp should be the path to mycsvs.tar.gz
  # zp should be the path to 7z.exe
  # unzip_dir should hold only the CSVs from inside the targz; the default,
  #   dirname(fp), also contains the archive itself, so pass a dedicated
  #   existing directory or the CSV-only check below will stop()
  # thanks to Joachim Sauer: https://superuser.com/a/1283392/
  # see original single-csv function on https://github.com/franknarf1/r-tutorial/issues/29
  qzp = zp %>% normalizePath(mustWork = FALSE) %>% shQuote
  qfp = fp %>% normalizePath %>% shQuote
  quz = unzip_dir %>% normalizePath %>% shQuote

  # unzip: the first 7z call streams the tar to stdout (-so),
  # the second reads it from stdin (-si) and extracts it into unzip_dir
  # first sprintf fills in the 7z path twice, second fills in archive and target dir
  patt = "%1$s x -y -so %%s | %1$s x -y -si -ttar -o%%s"
  thecall = patt %>% sprintf(qzp) %>% sprintf(qfp, quz)
  if (!silent) cat("The targz unzip call:", thecall, sep="\n")
  shell(sprintf("\"%s\"", thecall))

  # list files, read separately
  # not looking recursively, since csvs should be only one level deep
  # still need to discuss conventions like this with data team
  fns = list.files(unzip_dir) %>% setNames(., .)
  if (!all(tools::file_ext(fns) == "csv")) stop("fp should contain only CSVs")
  # iterate over the named vector so the result is a list named by file
  lapply(fns, function(fn) fread(file.path(unzip_dir, fn)))
}
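For comparison, here is a rough cross-platform sketch of the same idea that swaps 7z for base R's `utils::untar()`. This is not the approach in the function above, just an alternative under some assumptions: `fread_targzs_untar` is a hypothetical name, the archive is assumed to hold only CSVs one level deep, and the demo file names are made up. It extracts into a fresh temp directory so nothing else can end up next to the CSVs.

```r
library(data.table)

# Hypothetical helper (not from the original post): extract a .tar.gz with
# base R's untar() into a fresh temp directory, then fread each CSV.
fread_targzs_untar = function(fp, unzip_dir = tempfile("targz_")) {
  dir.create(unzip_dir)
  utils::untar(fp, exdir = unzip_dir)
  fns = list.files(unzip_dir)
  if (!all(tools::file_ext(fns) == "csv")) stop("fp should contain only CSVs")
  names(fns) = fns
  # named input vector, so the result is a list named by file
  lapply(fns, function(fn) fread(file.path(unzip_dir, fn)))
}

# Build a small demo archive (file names are illustrative only)
src = tempfile("src_")
dir.create(src)
fwrite(data.table(id = 1:2, x = c("a", "b")), file.path(src, "one.csv"))
fwrite(data.table(id = 3:4, x = c("c", "d")), file.path(src, "two.csv"))
tgz = tempfile(fileext = ".tar.gz")
old = setwd(src)
utils::tar(tgz, files = c("one.csv", "two.csv"), compression = "gzip")
setwd(old)

# Read every CSV, then stack them with the file name as an id column
res = fread_targzs_untar(tgz)
combined = rbindlist(res, idcol = "file")
```

The 7z version above would be called the same way, e.g. `fread_targzs("C:/data/mycsvs.tar.gz", unzip_dir = "C:/data/mycsvs")` (paths hypothetical), and its result can be passed to `rbindlist()` too.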
E.g., to read a tar.gz archive of CSVs.
This would fit just ahead of "Tables > Input and output > Reading and writing other formats".