OSGeo / grass

GRASS GIS - free and open-source geospatial processing engine
https://grass.osgeo.org

[Bug] r.buildvrt: significant read performance issues with virtual raster maps #4345

Closed ninsbl closed 1 week ago

ninsbl commented 1 month ago

Describe the bug

I am experiencing significant performance issues with virtual rasters built with r.buildvrt over GDAL-linked (r.external) raster maps (source is in GeoTIFF format) on NFS. After more testing, it seems the NFS file system amplifies the issue, but there are significant performance issues on local file systems as well, and also with raster maps in native GRASS format...

Running r.univar on two GDAL-linked raster maps that cover my computational region takes less than a second. Using the same computational region, a single r.univar run on a virtual raster of the same two raster maps is orders of magnitude slower (30 seconds to minutes).

Below is a script to run performance tests on different file systems and with different formats. While VRT maps over raster maps in native GRASS format are sometimes faster than VRT maps over r.external-linked GeoTIFFs, performance is still much worse than running r.univar on the individual raster maps (= no VRT). So it seems the issue is reading GDAL-linked raster maps through GRASS VRTs.

In debug=2 mode I see waaaaay more calls to:

when running r.univar on a VRT compared to reading the same maps not going through VRT. That is probably the main root cause...

Hints on how to identify or find possible remedy in the code would be very welcome...

To reproduce

g.gisenv set="DEBUG=2"
regs='s,0,1000
n,500,1500'

NTFS_path=/c/data/vrt_test
NFS_path=/nfs/grass/vrt_test
mkdir -p "$NTFS_path"
mkdir -p "$NFS_path"

# Create test data
for reg in $regs
do
  r=$(echo $reg | cut -f1 -d",")
  s=$(echo $reg | cut -f2 -d",")
  n=$(echo $reg | cut -f3 -d",")

  g.region -g n=$n s=$s w=0 e=1000 res=1
  r.external.out -r
  r.mapcalc --o --v expression="${r}_${s}_grass_ntfs=float(x()*y())"
  r.external.out format=GTiff options="compress=LZW,PREDICTOR=3" directory=$NTFS_path
  r.mapcalc --o --v expression="${r}_${s}_gtiff_ntfs=float(x()*y())"
  r.external.out format=GTiff options="compress=LZW,PREDICTOR=3" directory=$NFS_path
  r.mapcalc --o --v expression="${r}_${s}_gtiff_nfs=float(x()*y())"
done

# Run performance tests
g.region -g n=1500 s=0 w=0 e=1000 res=1
for format_type in grass_ntfs gtiff_ntfs gtiff_nfs
do
  rm -f /c/data/no_vrt_${format_type}.stats
  rm -f /c/data/vrt_${format_type}.stats
  rmaps=$(g.list type=raster pattern="*_*_${format_type}" separator=comma)
  r.buildvrt --o --v input="$rmaps" output=vrt_${format_type}
  time r.univar map="$rmaps" &>> /c/data/no_vrt_${format_type}.stats
  time r.univar map=vrt_${format_type} &>> /c/data/vrt_${format_type}.stats
done

Expected behavior

VRT raster maps should be at least comparable in read performance to the individual raster maps they reference.

System description

version=8.3.1
date=2023
revision=exported
build_date=2023-10-26
build_platform=x86_64-pc-linux-gnu
build_off_t_size=8
libgis_revision=8.3.1
libgis_date=2023-10-26T09:06:16+00:00
proj=9.1.1
gdal=3.6.4
geos=3.11.1
sqlite=3.37.2

Additional context

GRASS GIS version: 8.5.dev behaves the same...

ninsbl commented 1 month ago

Should I be looking somewhere around here? https://github.com/OSGeo/grass/blob/main/lib/raster/vrt.c#L47

ninsbl commented 1 month ago

OK. The documentation says VRTs can also be built over raster data linked with r.external. I have now tried various GRASS GIS versions (7.8.8, 8.0.0, 8.2.1, all using Docker), and all show the same performance issue with GDAL-linked data, especially on file systems with latency. The individual linked files are read quite fast, but combined in a VRT things get really slow...

@metzm do you have any idea whether this could be fixed somehow, or is it a format limitation that we should rather document in the manual?

I would be willing to put some effort into this, but I lack C skills and would need some help to fix it, if it is possible at all...

tmszi commented 1 month ago

According to the r.buildvrt module man page: Reading the whole VRT is slower than reading the equivalent single raster map. Only reading small parts of the VRT provides a performance benefit.

ninsbl commented 1 month ago

Thanks, @tmszi, for looking into this. The performance difference is not related to reading parts vs. the entire VRT, but to VRTs with GDAL-linked data vs. VRTs with native GRASS data... And on a filesystem with some latency, GDAL-linked data is practically unusable in VRTs... Even the 1000×1500 pixels in the example take so long to read that one could create a temporary patched raster first and the process would still be faster. Something seems wrong here...

neteler commented 1 month ago

> The performance difference is not related to reading parts vs. entire VRT, but related to VRT with GDAL linked data vs. VRT with native GRASS data...

Does perhaps @rouault have a hint here?

rouault commented 1 month ago

> Does perhaps @rouault have a hint here?

Not really, I'm not familiar with what r.univar does. It would be best to first try to reproduce this using only GDAL command line utilities, like gdal_translate.
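A minimal sketch of what such a GDAL-only test could look like, assuming the GeoTIFF tiles written by r.external.out sit in one directory with a `.tif` extension. The helper names (`run`, `build_gdal_vrt`, `full_read`), the `tiles.vrt` output name, and the scratch file name are hypothetical; only `gdalbuildvrt` and `gdal_translate` are actual GDAL utilities. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# build_gdal_vrt DIR: build a GDAL VRT over all GeoTIFF tiles in DIR.
# DIR/tiles.vrt is a hypothetical output name.
build_gdal_vrt() {
  run gdalbuildvrt -overwrite "$1/tiles.vrt" "$1"/*.tif
}

# full_read SRC: force a full read of SRC (a tile or a VRT) by converting
# it to a scratch GeoTIFF on local storage.
full_read() {
  run gdal_translate -q -of GTiff "$1" "${TMPDIR:-/tmp}/full_read_scratch.tif"
}
```

In bash, prefixing the calls with `time` (for example `time full_read /nfs/grass/vrt_test/tiles.vrt` versus `time full_read` on each individual tile) would show whether the slowdown already appears without GRASS in the picture. Note that the timings include writing the scratch file.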

ninsbl commented 1 month ago

Thanks, @rouault ! Will do that. The problem is not specific to r.univar though. I just used it as an example. Any reading of GDAL linked data through GRASS GIS VRTs seems affected... So, my guess is the issue is with _Rast_get_vrtrow() https://github.com/OSGeo/grass/blob/2356520814d2ab272c308af9e89c3af466c13a13/lib/raster/vrt.c#L171

rouault commented 1 month ago

> So, my guess is the issue is with _Rast_get_vrtrow()

Oooooh, I now read in https://grass.osgeo.org/grass84/manuals/r.buildvrt.html that "A GRASS virtual raster can be regarded as a simplified version of GDAL's virtual raster format". So I'm mostly incompetent to comment on GRASS VRT specificities. What GRASS VRT likely lacks is a pool of opened VRT sources like GDAL has, which saves opening and closing them when doing repeated pixel requests in neighbouring windows of interest. Just guessing in the dark... Perhaps try to use a GDAL VRT of GRASS rasters...?

ninsbl commented 1 month ago

Thanks again, @rouault! Sounds like a viable alternative/workaround! I will try that!

ninsbl commented 1 month ago

Attaching two strace visualisations.

One for reading VRTs with data in native GRASS GIS format: strace_vrt_grass

And one for reading VRTs with data in linked-GDAL format: strace_vrt_gdal

In case that helps track down the issue...

metzm commented 1 month ago

In this particular case, there might be a mix of different reasons causing poor performance. The reasons here seem to be NFS + GDAL-linked raster maps + GRASS vrt, which in their combination might amplify performance degradation.

The two main reasons might be

  1. usage of GDALRasterIO() via Rast_gdal_raster_IO() in https://github.com/OSGeo/grass/blob/main/lib/raster/get_row.c#L205 which could be optimized by letting GDAL do the subsetting to the current region
  2. the constant opening and closing of the individual rasters in a GRASS vrt raster. This is needed to avoid having too many files open, see also "Management of open file limits" in e.g. the manual of r.series. As @rouault suggested, GRASS VRT lacks the functionality of having a pool of opened VRT sources like GDAL does.
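For context on the second point, the per-process limit on open files can be inspected from the shell that starts GRASS. This is only a small illustration using the standard `ulimit` builtin; the variable names are arbitrary.

```shell
# Soft limit: number of file descriptors the current shell and its child
# processes (including GRASS modules) may hold open at once.
soft_limit=$(ulimit -Sn)
# Hard limit: the ceiling up to which the soft limit can be raised.
hard_limit=$(ulimit -Hn)
echo "open files: soft=$soft_limit hard=$hard_limit"
```

A VRT with more source rasters than the soft limit could not keep all of them open at the same time, which is why the sources are opened and closed on demand.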

These two reasons combined with NFS could easily cause the observed performance degradation. In this case I suggest creating a GDAL VRT and linking that into GRASS. However, the fastest method should be to have GRASS native rasters (maybe in a mapset on an NFS mount) and optionally build a GRASS vrt with the native GRASS rasters. As so often, it's a compromise between data duplication and IO optimization.
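The suggested workaround could be sketched roughly as follows, assuming the GeoTIFF tiles live in one directory with a `.tif` extension and the commands run inside a GRASS session. The helper names `run` and `link_gdal_vrt` and the map name in the usage note are hypothetical; `gdalbuildvrt` and `r.external` are the actual tools. Setting `DRY_RUN=1` prints the commands instead of executing them.

```shell
# Print the command when DRY_RUN=1, otherwise execute it.
run() {
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# link_gdal_vrt DIR MAPNAME: build DIR/MAPNAME.vrt over DIR/*.tif with GDAL,
# then register that VRT in the current mapset as one GDAL-linked raster.
link_gdal_vrt() {
  run gdalbuildvrt -overwrite "$1/$2.vrt" "$1"/*.tif &&
  run r.external --overwrite input="$1/$2.vrt" output="$2"
}
```

For example, `link_gdal_vrt /nfs/grass/vrt_test tiles_gdal_vrt` followed by `r.univar map=tiles_gdal_vrt` would read the tiles through GDAL's own VRT handling rather than through a GRASS vrt.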

ninsbl commented 1 month ago

Thanks, @metzm, for your insights! Then I would suggest we close this issue once the known issue for this corner case is documented in the manual.

Using GDAL VRTs for GDAL-linked data actually works quite well, facilitated by: https://grass.osgeo.org/grass84/manuals/addons/r.buildvrt.gdal.html