r-spatial / sf

Simple Features for R
https://r-spatial.github.io/sf/
Other
1.35k stars 299 forks source link

Ram usage with WARP is too big! while from python it works fine [Elevation Profile] #2354

Closed latot closed 8 months ago

latot commented 8 months ago

Hi!, well I don't know if this is from R, Python, maybe I did an error writing the code, but is too weird!

I was porting the elevation profile code from: https://kokoalberti.com/articles/creating-elevation-profiles-with-gdal-and-two-point-equidistant-projection/

To R, is not very hard, here a example:

from_point <- c(-77.0364, 38.8951)
to_point <- c(-78.0364, 39.8951)
dist <- 0.1
width <- 0.1
method <- "bilinear"
seg_id <- 1
#Must be in float64
ds <- "some raster.tiff"
proj_str <- paste0(
  "+proj=tpeqd +lon_1=",
  from_point[1],
  " +lat_1=",
  from_point[2],
  " +lon_2=",
  to_point[1],
  " +lat_2=",
  to_point[2]
)
points_tpeqd <-
  sf::st_sfc(
    sf::st_point(from_point),
    sf::st_point(to_point),
    crs = "+proj=latlon"
  ) %.>%
  sf::st_transform(., proj_str)
point_1 <- points_tpeqd[[1]]
point_2 <- points_tpeqd[[2]]
bbox <- c(point_1[1], -(width * 0.5), point_2[1], (width * 0.5))
num_samples <- max(2, as.integer(ceiling((point_2[1] - point_1[1]) / dist)))
dx <- (point_2[1] - point_1[1]) / num_samples
tmp_ds <- paste0("/vsimem/test_", as.character(seg_id), ".tif")
profile <- sf::gdal_utils(
  util = "warp",
  ds,
  tmp_ds,
  config_options = c(
    dstSRS = proj_str,
    outputBounds = as.character(bbox),
    height = as.character(1),
    width = as.character(num_samples),
    resampleAlg = method
  )
)

Most of the parameters comes from the doc, nothing weird about the methdology it self, so before continues, I run this code in python, it work nice, fine, almost no ram or cpu consumption, the code is something similar, but does the same:

from pyproj import CRS, Transformer
from osgeo import gdal
import pyproj.exceptions
import math
import multiprocessing
import alive_progress
import pathos.pools
import pandas
import shapely
import geopandas

gdal.UseExceptions()

from_point <- [-77.0364, 38.8951]
to_point <- [-78.0364, 39.8951]
dist <- 0.1
width <- 0.1
method <- "bilinear"
seg_id <- 1
ds = gdal.Open(ds)
# Set up coordinate transform
#print(from_point)
#print(to_point)
proj_str = "+proj=tpeqd +lon_1={} +lat_1={} +lon_2={} +lat_2={}".format(
  from_point[0], from_point[1],
  to_point[0], to_point[1])
tpeqd = CRS.from_proj4(proj_str)
transformer = Transformer.from_crs(CRS.from_proj4("+proj=latlon"), tpeqd)
point_1 = transformer.transform(from_point[0], from_point[1])
point_2 = transformer.transform(to_point[0], to_point[1])
bbox = (point_1[0], -(width*0.5), point_2[0], (width*0.5))
num_samples = max(2, int(math.ceil((point_2[0] - point_1[0]) / dist)))
dx = (point_2[0] - point_1[0])/num_samples
tmp_ds = "/vsimem/test_{}.tif".format(seg_id)
profile = gdal.Warp(tmp_ds, ds, dstSRS=proj_str, outputBounds=bbox, 
                    height=1, width=num_samples, resampleAlg=method)

Sadly I don't have a raster, but you can use any raster with Float64, here is what happens in R:

Error in sf::gdal_utils(util = "warp", ds, tmp_ds, config_options = c(dstSRS = proj_str,  : 
  gdal_utils warp: an error occured
Además: Warning message:
In CPL_gdalwarp(source, destination, options, oo, doo, config_options,  :
  GDAL Error 3: /vsimem/test_1.tif: Free disk space available is 32960012288 bytes, whereas 55237299552 are at least necessary. You can disable this check by defining the CHECK_DISK_FREE_SPACE configuration option to FALSE.

If we disable the CHECK_DIS_FREE_SPACE, then it starts eating the ram a lot! No idea what happens.

Thx!

Here is the info

sessionInfo()
R version 4.3.2 (2023-10-31)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: Gentoo Linux

Matrix products: default
BLAS:   /usr/lib64/libblas.so.3.11.0 
LAPACK: /usr/lib64/liblapack.so.3.11.0

locale:
 [1] LC_CTYPE=es_CL.utf8       LC_NUMERIC=C             
 [3] LC_TIME=es_CL.utf8        LC_COLLATE=es_CL.utf8    
 [5] LC_MONETARY=es_CL.utf8    LC_MESSAGES=es_CL.utf8   
 [7] LC_PAPER=es_CL.utf8       LC_NAME=C                
 [9] LC_ADDRESS=C              LC_TELEPHONE=C           
[11] LC_MEASUREMENT=es_CL.utf8 LC_IDENTIFICATION=C      

time zone: Chile/Continental
tzcode source: system (glibc)

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] wrapr_2.1.0

loaded via a namespace (and not attached):
 [1] utf8_1.2.4         R6_2.5.1           tidyselect_1.2.0   e1071_1.7-14      
 [5] magrittr_2.0.3     glue_1.7.0         tibble_3.2.1       KernSmooth_2.23-22
 [9] pkgconfig_2.0.3    generics_0.1.3     dplyr_1.1.4        lifecycle_1.0.4   
[13] classInt_0.4-10    sf_1.0-16          cli_3.6.2          fansi_1.0.6       
[17] grid_4.3.2         vctrs_0.6.5        DBI_1.2.1          proxy_0.4-27      
[21] class_7.3-22       compiler_4.3.2     tools_4.3.2        pillar_1.9.0      
[25] Rcpp_1.0.12        rlang_1.1.3        units_0.8-5 
edzer commented 8 months ago

If you want anyone to look into this, please provide a complete reprex, including library() statements, data, and avoid things like this

Error in sf::st_sfc(sf::st_point(from_point), sf::st_point(to_point),  : 
  could not find function "%.>%"

when it is not needed.

latot commented 8 months ago

Sorry, I thought would be easier, tomorrow I'll create full reprex, need some data to use and share.

latot commented 8 months ago

Hi, in the end, there was no problem with this, there is some weird things with some values, but not this.

I just got confused about params between python and command line ones.

Thx!