swish-climate-impact-assessment / awaptools

Australian Water Availability Project tools (awaptools): a set of functions to aid downloading and reformatting

optimize compressed storage of downloaded data into GTIFF #1

Open ivanhanigan opened 8 years ago

ivanhanigan commented 8 years ago

from @ahsparks

writeRaster(raster, filename = "out location", format = "GTiff", datatype = "INT2S",
            options = c("COMPRESS=LZW"))

to save space with GTiffs. Check all the options with dataType to see what works best. http://artax.karlin.mff.cuni.cz/r-help/library/raster/html/dataType.html
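
One rough way to check which data type works best is to write the same raster with several `datatype` values and compare the file sizes. The sketch below is only an illustration: the test raster, the file names and the three data types compared are assumptions, not part of awaptools. Note that integer types such as INT2S only suit data that is integer-valued and fits the type's range, so real AWAP values with decimals may need scaling first.

```r
library(raster)

# Hypothetical test raster: integer-valued data that fits a signed 16-bit range
r <- raster(nrows = 200, ncols = 200)
values(r) <- round(runif(ncell(r), min = 0, max = 500))

out_dir <- tempdir()

# Write the same raster as an LZW-compressed GTiff with each data type
# and record the resulting file size in bytes
sizes <- sapply(c("INT2S", "INT4S", "FLT4S"), function(dt) {
  f <- file.path(out_dir, paste0("test_", dt, ".tif"))
  writeRaster(r, filename = f, format = "GTiff", datatype = dt,
              options = c("COMPRESS=LZW"), overwrite = TRUE)
  file.size(f)
})

sizes
```

The smallest file that still reads back with the expected values is probably the type to use.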

ivanhanigan commented 8 years ago

@ahsparks I added your code as function compress_gtifs to the develop branch, and added you as author to package.

Started demo of the workflow in development https://github.com/swish-climate-impact-assessment/awaptools/blob/develop/README.md

Cheers!

adamhsparks commented 8 years ago

I completely missed this, @ivanhanigan.

Thanks! 👍

ivanhanigan commented 8 years ago

@adamhsparks I wonder what you think about

  1. not creating Gtifs subdir
  2. removing the .grid file after

This is what I wanted to do today, so I hacked the following. On reflection, though, I think it would be better to make your compress_gtifs function do the unzip, compress and clean up. https://github.com/swish-climate-impact-assessment/AWAP_GRIDS/commit/641140304eff38a215223d306acd0a5e8ccea89c

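# Install the develop branch of awaptools and load the required packages
library(devtools)
install_github("swish-climate-impact-assessment/awaptools", ref = "develop")
library(awaptools)
library(rgdal)

# Work inside the data directory (it must already exist)
workdir <- "data"
setwd(workdir)

# Download the monthly grids for the chosen period
startdate <- "2014-01-01"
enddate <- "2014-02-28"
load_monthly(start_date = startdate, end_date = enddate)

# Unzip each downloaded file, then write compressed GTiffs
filelist <- dir(pattern = "grid.Z$")
for (fname in filelist) {
  unzip_monthly(fname, aggregation_factor = 1)
}
compress_gtifs(indir = getwd())

# Clean up: remove the .grid files and move the GTiffs out of the GTif subdir
system("rm *.grid")
system("mv GTif/* ./")

adamhsparks commented 8 years ago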

@ivanhanigan is there any reason to add the extra step to generate the GTiff files rather than just going straight to that step during the download? You're not giving the user an option to keep the .grid file if the first step only downloads and the second step/function does the unzipping and writes the GTiff file onto disk.

If you do that, we could download the .Z files to a temporary directory (see ?tempdir), unzip them there, import them using the raster package and then write a GTiff file out to disk. Once the session is done, the .grid.Z and .grid files disappear, so there is no need to unlink them.
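
Something like the following is what I have in mind, as a minimal sketch only. `grid_to_gtiff` is a hypothetical helper, not an existing awaptools function, and the small ASCII grid written to `tempdir()` stands in for a downloaded and unzipped AWAP grid. The download and unzip steps themselves are left out.

```r
library(raster)

# Hypothetical helper (not part of awaptools): read one uncompressed grid
# file and write it to out_dir as an LZW-compressed GTiff
grid_to_gtiff <- function(grid_file, out_dir = getwd()) {
  out_name <- paste0(tools::file_path_sans_ext(basename(grid_file)), ".tif")
  out_file <- file.path(out_dir, out_name)
  r <- raster(grid_file)
  writeRaster(r, filename = out_file, format = "GTiff",
              options = c("COMPRESS=LZW"), overwrite = TRUE)
  out_file
}

# Stand-in for a downloaded and unzipped grid: a small ASCII grid
# written to the session's temporary directory
tmp_grid <- file.path(tempdir(), "example_grid.asc")
writeRaster(raster(matrix(1:100, 10, 10)), filename = tmp_grid,
            format = "ascii", overwrite = TRUE)

# In real use out_dir would be the user's data directory
out_dir <- file.path(tempdir(), "gtiff_out")
dir.create(out_dir, showWarnings = FALSE)

tif <- grid_to_gtiff(tmp_grid, out_dir)
list.files(out_dir)
```

Only the GTiff ends up in the output directory. The intermediate grid stays in `tempdir()`, which R removes when the session ends.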

ivanhanigan commented 8 years ago

@adamhsparks Thanks, yes - no need to keep grid file so I'll try to implement tempdir as you suggest. Cheers.