I have tried a few things (unsuccessfully) to resolve this issue; these were all hunches about the source of the problem. I should reiterate that the script worked without any of this sort of magic on Windows 7.
> traceback()
17: rgdal::getRasterData(con, offset = offs, region.dim = reg, band = object@data@band)
16: .readRasterLayerValues(x, 1, x@nrows)
15: .local(x, ...)
14: getValues(x)
13: getValues(x)
12: .readCells(x, cells, 1)
11: .cellValues(object, cells, layer = layer, nl = nl)
10: .xyValues(x, coordinates(y), ..., df = df)
9: .local(x, y, ...)
8: raster::extract(r, s)
7: raster::extract(r, s)
6: data.frame(value = raster::extract(r, s), pID = s$pID, sid = s$sid)
5: (function (r)
{
res <- data.frame(value = raster::extract(r, s), pID = s$pID,
sid = s$sid)
return(res)
})(X, ...)
4: rapply(raster.list, how = "replace", f = function(r) {
res <- data.frame(value = raster::extract(r, s), pID = s$pID,
sid = s$sid)
return(res)
})
3: sampleRasterStackByMU(mu, mu.set, mu.col, raster.list, pts.per.acre,
estimateEffectiveSampleSize = correct.sample.size)
2: withCallingHandlers(expr, warning = function(w) invokeRestart("muffleWarning"))
1: suppressWarnings(sampleRasterStackByMU(mu, mu.set, mu.col, raster.list,
pts.per.acre, estimateEffectiveSampleSize = correct.sample.size)) at #10
> memory.size()
[1] 12758.71
> memory.limit()
[1] 16185
And from help(memory.size):
Environment variable R_MAX_MEM_SIZE provides another way to specify the initial limit.
It appears all 16GB of my RAM are available for use, as should be the default on a 64-bit installation.
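For reference, these Windows-only helpers (available in the R 3.x sessions discussed here) report and adjust the allocation limit from within a session:

memory.size()            # MB currently allocated by this R session
memory.size(max = TRUE)  # maximum MB obtained from the OS so far
memory.limit()           # current limit, in MB
# memory.limit(size = 16000)  # raise the limit for this session if needed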
Solution: raster::extract() is erroneously concluding (via canProcessInMemory()) that these large operations can be done fully in memory, causing a limit to be hit before the theoretical maximum (i.e., the amount of unallocated RAM).
Here are the default raster options in the current CRAN version of raster:
> rasterOptions()
format : raster
datatype : FLT4S
overwrite : FALSE
progress : none
timer : FALSE
chunksize : 1e+08
maxmemory : 1e+10
estimatemem : FALSE
tmpdir : C:\Users\ANDREW~1.BRO\AppData\Local\Temp\Rtmp0weas5/raster/
tmptime : 168
setfileext : TRUE
tolerance : 0.1
standardnames : TRUE
warn depracat.: TRUE
header : none
Setting maxmemory to 1E+09 resolves the issue, by setting the upper limit for an in-memory operation to 1GB as opposed to 10GB:
rasterOptions(maxmemory=1E+09)
It seems that even when quite a bit of RAM is available (in the realm of 10GB or more) Windows is unable to allocate anything over ~6GB on my machine. Similar tests on Dylan's machine broke at just over 7.5GB. Setting the max memory to 1GB forces the larger operations to be done out of memory.
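A minimal sketch of the mechanism at work; the grid dimensions are hypothetical, and the exact bookkeeping inside canProcessInMemory() (and hence the TRUE/FALSE cutoff) varies across raster versions:

library(raster)

# a large, empty grid used only to illustrate the check
r <- raster(nrow = 50000, ncol = 50000)

rasterOptions(maxmemory = 1e10)  # the raster 2.7-15 default
canProcessInMemory(r)

rasterOptions(maxmemory = 1e9)   # the workaround: pushes big operations out of memory
canProcessInMemory(r)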
See this open pull request on the rspatial/raster repository, which proposes changes that would resolve this issue: https://github.com/rspatial/raster/pull/11
I think all reports that rely on constant-density sampling, such as this one, should use a heuristic to estimate the best chunk size and max memory and set them as needed (a sketch follows below). It appears that future CRAN versions of raster will include some sort of fix for this.
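A minimal sketch of such a heuristic, assuming a Windows session; the function name .tuneRasterOptions, the 10% fraction, and the cells-based interpretation of maxmemory are assumptions, not part of soilReports:

# hedged sketch: cap raster's in-memory budget at a fraction of the session
# memory limit, and size chunks accordingly
.tuneRasterOptions <- function(frac = 0.1) {
  # memory.limit() is Windows-only and reports MB; convert to a cell budget
  # assuming 8 bytes per numeric cell (the units of maxmemory have differed
  # across raster versions, so treat this as illustrative)
  budget <- floor(memory.limit() * 1024^2 * frac / 8)
  raster::rasterOptions(maxmemory = budget, chunksize = ceiling(budget / 10))
  invisible(raster::rasterOptions())
}

.tuneRasterOptions()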
All MU summary/comparison reports are going to break with raster 2.7-15 when using regional 30m data. One option: set these raster options from within sampleRasterStackByMU.
The fact remains that sampling CONUS 30m rasters is no longer possible with the current version of raster and our Windows group security policy.
Would you be able to test whether this problem goes away with the development version of raster? It is available from R-Forge or GitHub: https://r-forge.r-project.org/R/?group_id=294 https://github.com/rspatial/raster
Hi Robert, thanks for the suggestion. Is there a binary we can use? Unfortunately we don't have access to RTools or a suitable compiler on our machines.
And of course, thank you for the continued development of the raster package. USDA-NRCS staff use it daily.
Hi Dylan, you can install from here:
install.packages("raster", repos="http://R-Forge.R-project.org")
but that only works on the current R (3.5.1). Robert
I reopened this issue.
I should not have touted the maxmemory 'fix' as a fix. Cutting it down that low really cripples some of the bigger operations to the point where they may never finish.
A larger extent (relative to the one where I found this issue) that I was working with this morning ran for approximately 40 minutes with no end in sight. I never saw it get bogged down, but I think the lowered memory threshold can't outweigh the cost of the increased reads/writes (which appear to be imposed by our Windows 10 configuration).
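The trade-off is easy to see in the number of processing blocks (the grid dimensions below are hypothetical, matching the CONUS raster described later in this thread):

library(raster)

r <- raster(nrow = 97293, ncol = 154195)

# smaller chunks mean many more passes over the file on disk
blockSize(r, chunksize = 1e8)$n
blockSize(r, chunksize = 1e7)$n  # ~10x as many blocks, so ~10x the read passes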
Thanks. As you say, we are stuck with a slightly older binary from R-Forge (raster_2.7-8). rasterOptions() reports:
...
chunksize : 1e+07
maxmemory : 1e+09
...
Testing with this version reveals that extract(r, s) with a 1.2GB raster and ~800k sample points takes longer than 2.5 hours. I am pretty sure that we were using this vintage of raster the last time I did this analysis, and it took only 12 minutes.
I suspect that this specific problem is related to the USDA policy of running 2 real-time scanning tasks, which are swamping all disk access.
I am really surprised that changing the maxmemory setting did not work out. Here is a Windows binary package of the current version. It would be great if you could try it: https://drive.google.com/drive/folders/1REkkVqwGrCdV3iHzQkCJETslsLXJjMVn?usp=sharing
Thanks Robert. Installed it and got this error:
Error: package or namespace load failed for ‘raster’ in inDL(x, as.logical(local), as.logical(now), ...):
unable to load shared object 'C:/Users/Dylan.Beaudette/Documents/R/win-library/3.4/raster/libs/x64/raster.dll'
Weird, how did you install?
install.packages('E:/temp/raster_2.8-3.zip', repos = NULL)
It installed without error, but throws an error when loading it with library()
Note that we are "stuck" at R 3.4.0.
Here are the details for the raster in question:
class : RasterLayer
dimensions : 97293, 154195, 15002094135 (nrow, ncol, ncell)
resolution : 30, 30 (x, y)
extent : -2361803, 2264047, 258854.3, 3177644 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0
data source : E:\gis_data\CONUS\CONUS-forms-DEB.tif
names : CONUS.forms.DEB
values : 0, 255 (min, max)
The problem was with the new release of raster. I tested with the code below and gave up after > 30 minutes.
library(raster)

# create the large test raster on disk (run once):
# r <- raster(nrow=97293, ncol=154195, ext=extent(-2361803, 2264047, 258854.3, 3177644), crs="+proj=aea +lat_1=29.5 +lat_2=45.5 +lat_0=23 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m")
# x <- init(r, fun='cell', filename="c:/temp/big.tif")

# load the file-backed raster and extract values at ~800k regularly spaced points
x <- raster("c:/temp/big.tif")
xy <- sampleRegular(raster(x), 800000, xy=TRUE)
v <- extract(x, xy)
It appears that this was caused by a change in raster via this pull request. I did not profile it, but rather than speeding things up, it seems to have had a dramatically opposite effect, at least in some situations. I have reverted to the previous code, and now I get this:
system.time(v <- extract(x, xy))
user system elapsed
23.70 11.97 35.92
I get the same speed with the previous CRAN release.
Thanks for testing Robert, this is a huge help! I tested on my Linux machine (R 3.4.1, raster 2.6-7) and it took about 19 minutes to complete. Your machine must be a lot faster than mine. Which version should we be on the lookout for?
I have submitted raster 2.8-4 to CRAN. Hopefully it will be available sometime next week. Thanks for your help and patience.
Thanks Robert for all of the help and testing. Looking forward to the new release.
The new version is on CRAN now.
Crud:
install.packages('c:/Temp/raster_2.8-4.zip', repos = NULL)
library(raster)
Error: package ‘raster’ was installed by an R version with different internals; it needs to be reinstalled for use with this R version
In addition: Warning message:
package ‘raster’ was built under R version 3.5.1
We are stuck with R 3.4.0. I wonder if CRAN will build the latest raster for r-oldrel?
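As a side note, one generic way to confirm which R version an installed binary was built under (an illustrative diagnostic, not from the original thread):

# the 'Built' field of DESCRIPTION records the R version used to build the binary
utils::packageDescription("raster", fields = "Built")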
@rhijmans would you be willing to make us a custom raster_2.8 for R-oldrelease, via win-builder?
https://win-builder.r-project.org/
That would help us considerably while we wait for IT to get the current version of R.
@brownag have we solved this issue? I no longer run into memory problems.
However, the raster sampling process takes 10x longer than it used to, probably related to the 2 real-time scanning processes that are always running. I can watch this via Task Manager.
Do we still need the following?
raster::rasterOptions(maxmemory=1E+09)
Adjusting rasterOptions() is not needed anymore. I have not run into memory problems, nor have I noticed any major differences in sampling... but I have not systematically tested the sampling speeds either.
I have been attempting to perform constant-density sampling using a set of polygons spanning a fairly broad latitudinal range in MLRA 18 (all polygons where the Flanly series is a major component).
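For context, the sampling step is roughly analogous to the following sketch (an illustration only, not the actual soilReports implementation; polys stands in for the polygon layer, assumed to be a SpatialPolygonsDataFrame in a projected, meter-based CRS):

library(sp)
library(raster)

# hedged sketch of constant-density point sampling
constantDensitySample <- function(polys, pts.per.acre = 1) {
  acres <- sum(raster::area(polys)) / 4046.86  # total area: m^2 -> acres
  n <- max(1, round(acres * pts.per.acre))
  # regular sampling within the polygons gives approximately constant density
  sp::spsample(polys, n = n, type = "regular")
}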
The rasters are successfully loaded into memory (where possible), and an error message occurs during the sampling/extraction process (see the traceback above).
Upon removing the 30m regional datasets (just using the 800m PRISM data), the overflow behavior is not observed.
This does not appear to be an inherent problem with the code itself, because it runs fine with a smaller input dataset; it may simply not scale well. However, this report previously ran fine (if a bit slow to sample) under Windows 7 with all of the larger 10m or 30m rasters included in the sampling stack.
I'll continue to try and trace this issue.