Open ar-siddiqui opened 2 months ago
Loading the same layer in QGIS takes 30 seconds
yes, because QGIS zooms out on the layer, and this VRT has no external overviews. There is nothing QGIS can do about that. This is pretty much a "defect" on the data producer side. They should run "gdaladdo" on their VRT
Loading the same layer in QGIS takes 30 seconds
yes, because QGIS zooms out on the layer, and this VRT has no external overviews. There is nothing QGIS can do about that. This is pretty much a "defect" on the data producer side. They should run "gdaladdo" on their VRT
Thanks for the reply. Is there a reason for QGIS to zoom out on the layer and read whole raster, when you are loading it after zooming in on a small area?
And is that efficient?
It timed how long it takes to open the VRT sources using a QgsRasterLayer an a gdal.Dataset. Even with "loadDefaultStyle=False", the creation of the QgsRasterLayer object for a single VRT source takes much more time compared to the QgsRaterLayer object. It looks like the creation of QgsRasterLayer already starts a time-consuming operation.
import datetime
import os.path
import re
from pathlib import Path
from qgis.core import QgsRasterLayer
from osgeo import gdal
path_vrt = 'USGS_Seamless_DEM_13.vrt'
sources = []
rx = re.compile(r'<SourceFilename relativeToVRT="0">(?P<source>.*)</SourceFilename>')
with open(path_vrt) as f:
for line in f.readlines():
if match := rx.search(line):
sources.append( match.group('source'))
n = len(sources)
for i, src in enumerate(sources):
print(f'Test {i+1}/{n}: {os.path.basename(src)} ', end='')
t1 = datetime.datetime.now()
lyr = QgsRasterLayer(src, options=QgsRasterLayer.LayerOptions(loadDefaultStyle=False))
t2 = datetime.datetime.now()
ds = gdal.Open(src)
assert isinstance(ds, gdal.Dataset)
t3 = datetime.datetime.now()
print(f'QgsRasterLayer {(t2-t1).total_seconds()}s\tgdal.Dataset {(t3-t2).total_seconds()}s')
Test 1/1432: USGS_13_n06e162.tif QgsRasterLayer 10.141198s gdal.Dataset 0.0s Test 2/1432: USGS_13_n06e163.tif QgsRasterLayer 9.349241s gdal.Dataset 0.00052s Test 3/1432: USGS_13_n07e134.tif QgsRasterLayer 10.528733s gdal.Dataset 0.000998s Test 4/1432: USGS_13_n07e151.tif QgsRasterLayer 10.345912s gdal.Dataset 0.0s Test 5/1432: USGS_13_n07e152.tif QgsRasterLayer 10.822737s gdal.Dataset 0.002001s Test 6/1432: USGS_13_n07e158.tif QgsRasterLayer 10.244739s gdal.Dataset 0.002s Test 7/1432: USGS_13_n08e134.tif QgsRasterLayer 10.787935s gdal.Dataset 0.000995s Test 8/1432: USGS_13_n08e151.tif QgsRasterLayer 10.344938s gdal.Dataset 0.002752s Test 9/1432: USGS_13_n08e152.tif QgsRasterLayer 9.913716s gdal.Dataset 0.0s Test 10/1432: USGS_13_n08e158.tif QgsRasterLayer 9.050872s gdal.Dataset 0.0s Test 11/1432: USGS_13_n09e134.tif QgsRasterLayer 8.897478s gdal.Dataset 0.001545s Test 12/1432: USGS_13_n10e138.tif QgsRasterLayer 9.423416s gdal.Dataset 0.000994s
@rouault I breakpointed the loading times down to a GDALOpenEx
in qgsgdalproviderbase.cpp , Line 314 (QgsGdalProviderBase::gdalOpen):
and I can reproduce it in python:
from osgeo import gdal
gdal.UseExceptions()
# long loading VRT
uri1 = r'/vsicurl/https://rockyweb.usgs.gov/vdelivery/Datasets/Staged/Elevation/13/TIFF/USGS_Seamless_DEM_13.vrt'
# same VRT, but downloaded
uri2 = r'USGS_Seamless_DEM_13.vrt' # local VRT
# 1st VRT source
uri3 = r'/vsicurl/https://prd-tnm.s3.amazonaws.com/StagedProducts/Elevation/13/TIFF/current/n39w105/USGS_13_n39w105.tif'
for i, uri in enumerate([uri1, uri2, uri3]):
t0 = datetime.datetime.now()
ds = gdal.OpenEx(uri, 0, [], [], [] )
print(f'uri{i+1}: {datetime.datetime.now() - t0}')
Output:
uri1: 0:05:22.600787 uri2: 0:00:00.046641 uri3: 0:00:00.944662
What is the bug or the crash?
I have a VRT with 70k sources,
gdalinfo
on it takes 3 4 seconds, when loading the same VRT on QGIS it can take up to 20 minutes (this is the time to load layer in the project, the time to fetch data is separate which is dependent on how zoomed out you are). Once the VRT is loaded, and working zoomed in, it takes a reasonable amount of time to fetch data. It is only at the load of a layer that the QGIS takes unreasonable time.Steps to reproduce the issue
Unfortunately, my VRT has sources in a private S3 bucket so it can't be shared for reproducibility but here is a public VRT that shows similar behavior.
/vsicurl/https://rockyweb.usgs.gov/vdelivery/Datasets/Staged/Elevation/13/TIFF/USGS_Seamless_DEM_13.vrt
Time to perform gdalinfo:
Loading the same layer in QGIS takes 30 seconds. Then there are additional second or two to fetch actual data, which is reasonable. (Please work at high zoom level with this data or QGIS will try to fetch data for large areas)
Versions
QGIS 3.38.2 GDAL 3.9.2
Supported QGIS version
New profile
Additional context
No response