future.batchtools seems to provide a hassle-free way to submit r-future jobs on the HPC. Although learning low-level SLURM controls would certainly be helpful, I am switching the short-term development target to future.batchtools to streamline the development process. #7 is also affected.
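A minimal sketch of the intended switch, assuming a batchtools SLURM template file exists on the cluster (the file name batchtools.slurm.tmpl and the resource names are placeholders that must match whatever the template expects):

library(future)
library(future.batchtools)
library(future.apply)

# each future becomes one SLURM job rendered from the template file
plan(batchtools_slurm,
     template = "batchtools.slurm.tmpl",
     resources = list(ncpus = 1L, memory = "8GB", walltime = 3600L))

# downstream code stays the same as with any other future backend
res = future_lapply(1:4, function(i) sqrt(i))

With this backend each future_lapply chunk is submitted as its own SLURM job, so no hand-written sbatch scripting should be needed.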
library(terra)
library(future.apply)
library(scomps)
library(dplyr)
us_extent =
terra::ext(c(xmin = -129.5, xmax = -61.1, ymin = 19.5, ymax = 51.8))
extract_with_buffer.flat <- function(
  points, surf, radius, id, qsegs, func = mean, kernel = NULL, bandwidth = NULL
) {
  # kernel and bandwidth are accepted for interface compatibility but are
  # unused in this flat (unweighted) version
  # generate circular buffers around the points
  bufs = terra::buffer(points, width = radius, quadsegs = qsegs)
  # crop the raster to the buffers' extent to limit memory use
  bufs_extent = terra::ext(bufs)
  surf_cropped = terra::crop(surf, bufs_extent)
  name_surf_val = names(surf)
  # extract raster values falling inside each buffer
  surf_at_bufs = terra::extract(surf_cropped, bufs)
  # summarize per buffer with the supplied summary function
  surf_at_bufs_summary =
    surf_at_bufs |>
    group_by(ID) |>
    summarize(across(all_of(name_surf_val), ~func(.x, na.rm = TRUE))) |>
    ungroup()
  return(surf_at_bufs_summary)
}
pathappend = "/ddn/gs1/home/songi2/projects/Scalable_GIS/largedata/"
merraname = "MERRA2_400.tavg1_2d_aer_Nx.20220820.nc4"
pointname = "aqs-test-data.gpkg"
# read the hourly MERRA-2 aerosol file windowed to the CONUS extent,
# plus the AQS test point locations
merra = terra::rast(paste(pathappend, merraname, sep = ""), win = us_extent)
point = terra::vect(paste(pathappend, pointname, sep = ""))
# one group per variable: each variable carries 24 hourly layers for this day
timeindex = rep(seq_along(varnames(merra)), each = 24)
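# (sketch, not in the original run) tapp needs one index entry per layer;
# check the assumption that each variable carries exactly 24 hourly layers
stopifnot(length(timeindex) == terra::nlyr(merra))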
# collapse the hourly layers into daily means, one layer per variable
merra_daily = terra::tapp(merra, index = timeindex, fun = mean)
merra_daily
# tapp renames layers (e.g., "X1", "X2", ...), so restore the variable names
names(merra_daily) = varnames(merra)
merra_daily
targ_cols = c("BCCMASS",
"BCSMASS",
"DMSCMASS",
"DMSSMASS",
"DUCMASS",
"DUSMASS",
"DUCMASS25",
"DUSMASS25",
"OCCMASS",
"OCSMASS",
"SO2CMASS",
"SO2SMASS",
"SO4CMASS",
"SO4SMASS",
"SSCMASS",
"SSSMASS",
"SSCMASS25",
"SSSMASS25")
merra_daily_t = merra_daily[[targ_cols]]
extracted = extract_with_buffer.flat(point, merra_daily_t, id = "ID.Code",
                                     radius = 2e4L, qsegs = 90L)
write.csv(extracted, "res_merra_09152023.csv")
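For the record, the extraction above still runs serially even though future.apply is attached. A sketch of how it could be fanned out over chunks of points (point_chunks is a hypothetical list of row-index vectors, sketched further below; terra objects cannot be serialized to workers, so each worker re-reads from file paths):

# write the daily subset to the shared data directory so workers can re-read it
daily_path = paste(pathappend, "merra_daily_t.tif", sep = "")
terra::writeRaster(merra_daily_t, daily_path)

run_chunk <- function(idx, point_path, surf_path, radius, qsegs) {
  library(dplyr)  # extract_with_buffer.flat needs group_by/summarize on workers
  pts = terra::vect(point_path)[idx, ]
  surf = terra::rast(surf_path)
  extract_with_buffer.flat(pts, surf, radius = radius, qsegs = qsegs)
}

extracted_list = future.apply::future_lapply(
  point_chunks, run_chunk,
  point_path = paste(pathappend, pointname, sep = ""),
  surf_path = daily_path,
  radius = 2e4L, qsegs = 90L
)
extracted_par = do.call(rbind, extracted_list)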
Over SSH, sbatch terra_runs_Rcode_file.sh returned the error:
sbatch: error: Batch job submission failed: Invalid account or account/partition combination specified
This error usually means the job script requests an account or partition the user is not authorized to use. Will ask OSC about HPC job submission authorization.
As we communicated in the email chain, terra is unavailable on the HPC for the time being. The next submission date is TBD. In the meantime, I will test how a full sf-stars implementation performs on the HPC, even though the local test was not very promising.
From a couple of days of experience, I learned that parallelization on the HPC needs to follow a strategy. The basic notion is to run the computation code in containers with all the required packages. The level at which I parallelize should be determined by 1) data size, 2) memory pressure, and 3) the number of features/cells used per loop/list element; see the sketch below. The test run is complete, so I am closing this issue now. I will open a new issue if anything HPC-related comes up.
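For future reference, a minimal sketch of that sizing logic (every number below is an assumption for illustration; the point is that chunk size is derived from the per-feature memory footprint rather than from a fixed worker count):

n_points = 10000   # assumed number of point features
mem_per_ft = 2e6   # assumed bytes needed to process one feature
mem_budget = 8e9   # assumed per-job memory budget (8 GB)
chunk_size = max(1L, floor(mem_budget / mem_per_ft))
# yields the point_chunks list referenced in the earlier sketch
point_chunks = split(seq_len(n_points), ceiling(seq_len(n_points) / chunk_size))
length(point_chunks)  # number of list elements, i.e., jobs to submit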