I have a raster from here: ftp://ftp.fgdl.org/pub/state/usgsdem.zip

After converting it to GeoTIFF, I loaded it into PostGIS with:

```
raster2pgsql -s 4269 -I -t auto -M ./usgsdem/florida_dem.tif | psql -d rpostgisLT
```

When trying to get it into R from PostGIS with `pgGetRast(conn, c("public", "florida_dem"))`, I get an error, plus a huge memory leak.

However, this works:
Did you try to load the raster file directly in R using `raster`? Then load it to the DB using `pgWriteRast`, and then back into R using `pgGetRast`? As you know, `readGDAL` loads the raster as a SpatialGridDataFrame, so the approach using `raster` (which `pgGetRast` is using) should be much lighter on memory…
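A quick way to see that difference, for what it's worth (a sketch using the same `hdr.adf` file as in the steps below):

```r
library(raster)

# raster() only reads the header; cell values stay on disk until
# actually needed, which is why this route is lighter on memory
# than readGDAL(), which pulls everything into memory at once
r <- raster("hdr.adf")
inMemory(r)   # FALSE: values not loaded yet
fromDisk(r)   # TRUE: the object is backed by the file on disk
```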
No, I didn't try it, because I don't see why I would load the raster into R first and then into the DB if it's possible to load it directly into the DB.
Just as a way to isolate the source of the problem (file, R, `raster`, …), and see at which step it arises.
I do see the memory leak, but I don't have the error. Steps I tried:

1. `raster2pgsql` directly on the files (without converting to GeoTIFF):

```
raster2pgsql -s 4269 -I -t auto -M hdr.adf | psql -d test
```

Then import in R:

```r
library("rpostgis")
(rast1 <- pgGetRast(conn, "hdr"))
```

```
class       : RasterLayer
dimensions  : 8053, 8237, 66332561  (nrow, ncol, ncell)
resolution  : 90, 90  (x, y)
extent      : 52621.16, 793951.2, 56869.21, 781639.2  (xmin, xmax, ymin, ymax)
coord. ref. : +proj=longlat +datum=NAD83 +no_defs +ellps=GRS80 +towgs84=0,0,0
data source : in memory
names       : layer
values      : 0, 114  (min, max)
```
In this case, there is indeed a huge consumption of memory by PostgreSQL (up to nearly 4 GB, but I did not monitor it constantly, so it could have been even more).
library("raster")
rast2 <- raster("hdr.adf")
pgWriteRast(conn, "rast_test", rast2) # Note that the function uses tiles of 103x101 pixels
(rast2 <- pgGetRast(conn, "rast_test"))
class : RasterLayer
dimensions : 8053, 8237, 66332561 (nrow, ncol, ncell)
resolution : 90, 90 (x, y)
extent : 52621.16, 793951.2, 56869.21, 781639.2 (xmin, xmax, ymin, ymax)
coord. ref. : +proj=aea +lat_1=24 +lat_2=31.5 +lat_0=24 +lon_0=-84 +x_0=400000 +y_0=0 +datum=WGS84 +units=m +no_defs +ellps=WGS84 +towgs84=0,0,0
data source : in memory
names : hdr
values : 1, 114 (min, max)
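As an aside, the tile size actually stored on the database side can be checked from `raster_columns`, the standard PostGIS catalog view; a minimal sketch against the two tables created above:

```r
library(DBI)

# blocksize_x/blocksize_y report the tile dimensions PostGIS
# recorded for each raster table
dbGetQuery(conn, "SELECT r_table_name, blocksize_x, blocksize_y
                  FROM raster_columns
                  WHERE r_table_name IN ('hdr', 'rast_test');")
```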
I still see high memory usage when importing back into R, but "only" up to maybe 2.5 GB. In the end, R uses 2.5 GB after loading one or two of the rasters. Note that, following these steps, the rasters seem identical in R, but they don't have the same projection (see the check below).
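A quick way to verify that programmatically, using the two objects from the steps above: `raster::compareRaster()` can compare everything except the CRS (with `stopiffalse = FALSE` it returns `FALSE` rather than throwing if something differs):

```r
library(raster)

# Compare extent, dimensions and cell values of the two
# round-tripped rasters while ignoring the CRS
compareRaster(rast1, rast2, crs = FALSE, values = TRUE,
              stopiffalse = FALSE, showwarning = TRUE)

# The projections themselves, which do differ
crs(rast1)
crs(rast2)
```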
I didn't get the error message either (when importing into R the rasters sent to the database both via R and via psql), but I do see slow execution and a heavy memory footprint in `rpostgis` as well. This isn't surprising, since the entire raster is being processed through memory, both ways. @balazsdukai, I'm guessing that's what your PostgreSQL error is about, since that union statement would be the biggest memory hog. Did you try just running that statement in Postgres?
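For reference, a minimal sketch of what "running that statement in Postgres" could look like from R, assuming the `florida_dem` table from the original import (this is not necessarily the exact statement `pgGetRast` builds); wrapping the union in `ST_MemSize()` keeps the result tiny, so it tests the server-side union alone rather than the transfer back to R:

```r
library(DBI)

# Force the expensive tile union on the server only; ST_MemSize()
# returns the byte size of the resulting raster instead of
# shipping the whole raster back to the client
dbGetQuery(conn, "SELECT ST_MemSize(ST_Union(rast)) AS bytes
                  FROM florida_dem;")
```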
Because of the memory limitation (and/or the need for better methods!), these raster functions are not great for large rasters; I think it's best to always use the `boundary` parameter of `pgGetRast`, especially with large rasters.
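For example (the bounding box values here are made up for illustration, and the `c(top, bottom, right, left)` ordering is my assumption to check against `?pgGetRast`):

```r
library(rpostgis)

# Fetch only a subset of the raster instead of the whole table;
# boundary values below are hypothetical
sub <- pgGetRast(conn, "hdr",
                 boundary = c(781639, 681639, 152621, 52621))
```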
@basille, the mismatch in projection, I think, is due to the use of `-s 4269` (NAD83) in your psql import statement, which isn't the original projection of the raster (it's in an equal-area projection). Balazs must have re-projected the GeoTIFF to 4269 prior to importing.
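A quick way to check a file's native projection before picking the `-s` flag (a sketch, using the `hdr.adf` file from the steps above):

```r
library(raster)

# Print the projection stored in the file itself; this is what the
# -s SRID should match (raster2pgsql defaults to SRID 0 otherwise)
crs(raster("hdr.adf"))
```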
@dnbucklin Hm, what do you mean by "just running that statement in Postgres"?
No activity on this issue for about two years, so I'm closing it; feel free to reopen if you want to add more information.