noaa-ocs-hydrography / BlueTopo

BlueTopo is a compilation of the best available bathymetric data of U.S. waters. This package simplifies using that data.
https://www.nauticalcharts.noaa.gov/data/bluetopo.html
Creative Commons Zero v1.0 Universal

Speed up generation of UTM VRTs #19

Open aaime opened 11 months ago

aaime commented 11 months ago

The generation of the top level UTM VRTs is taking hours (at least on my hardware). I'm wondering if there's anything that can be done to speed it up.

On one side, I see there is a large histogram in every UTM file, taking up some 20% of the file size and mostly filled with zeroes. I'd guess that in order to compute it, GDAL has to read all actual source pixels (the other VRTs do not seem to have it). Is the histogram actually important? Can it be skipped?
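
To check how much of a UTM VRT the histogram takes up, something like the following might do. It is a minimal sketch that assumes the histogram is stored as a Histograms child of each VRTRasterBand element, which is how GDAL normally writes it; the function names are made up for illustration. It only inspects or edits the file after the fact, so it says nothing about the time spent computing the histogram.

```python
import xml.etree.ElementTree as ET


def histogram_share(vrt_xml: str) -> float:
    """Approximate fraction of the VRT text taken up by <Histograms> elements."""
    root = ET.fromstring(vrt_xml)
    hist_chars = 0
    for band in root.iter("VRTRasterBand"):
        for hist in band.findall("Histograms"):
            # Re-serialize the element to estimate its size in characters.
            hist_chars += len(ET.tostring(hist, encoding="unicode"))
    return hist_chars / len(vrt_xml)


def strip_histograms(vrt_xml: str) -> str:
    """Return the VRT text with every <Histograms> element removed."""
    root = ET.fromstring(vrt_xml)
    for band in root.iter("VRTRasterBand"):
        # findall() returns a list, so removing while looping is safe.
        for hist in band.findall("Histograms"):
            band.remove(hist)
    return ET.tostring(root, encoding="unicode")
```

Reading a VRT with open(path).read() and passing the text to histogram_share gives a rough check of the 20% estimate.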

The unified RAT of course takes time, but it's kind of required, and the algorithm to build it appears to be efficient already (single pass with a lookup containing the already encountered rows, if I read correctly). Although, I don't see the RATs in the intermediate VRT files, which could maybe help to avoid opening all the source files once more (not sure, just guessing).

The last bit is the overviews, which also feel quite slow: I can watch the script progress as the ovr file grows. I've done a bit of debugging:

1) Using CPL_DEBUG=on, I've verified that the process is indeed opening every single source TIF file, e.g.:

GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55T4Z7_20230512.tiff, this=0x55948f69ac60) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55G4ZJ_20220926.tiff, this=0x55944545c010)
GDAL: 23 block reads on 20 block band 1 of /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55G4ZJ_20220926.tiff.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55S4Z7_20230512.tiff, this=0x55944545c010) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55H4ZJ_20220926.tiff, this=0x55948059a080)
GDAL: 24 block reads on 20 block band 1 of /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55H4ZJ_20220926.tiff.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55V4Z7_20230512.tiff, this=0x55948059a080) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55J4ZH_20220926.tiff, this=0x55948c06bfc0)
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BF2K92JB_20220926.tiff, this=0x55948c06bfc0) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55K4ZH_20220926.tiff, this=0x55948f319260)
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55W4Z7_20220926.tiff, this=0x55948f319260) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55J4ZJ_20220926.tiff, this=0x55947188e480)
GDAL: 23 block reads on 20 block band 1 of /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55J4ZJ_20220926.tiff.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55X4Z7_20220926.tiff, this=0x55947188e480) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55K4ZJ_20220926.tiff, this=0x55945150f390)
GDAL: 24 block reads on 20 block band 1 of /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55K4ZJ_20220926.tiff.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55Z4Z7_20220926.tiff, this=0x55945150f390) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55M4ZH_20230512.tiff, this=0x559491fd6230)
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BF2KB2JB_20220926.tiff, this=0x559491fd6230) succeeds as GTiff.
GDAL: GDALClose(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55M4ZJ_20220926.tiff, this=0x559488632f20)
GDAL: 24 block reads on 20 block band 1 of /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH55M4ZJ_20220926.tiff.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_0923/bluetopo_tiles/BlueTopo/UTM20/BlueTopo_BH5624Z7_20220926.tiff, this=0x559488632f20) succeeds as GTiff.

2) Using pyrasite-shell I've verified that while that happens, the process is doing this:

  File "/home/aaime/miniconda3/envs/bluetopo_env/bin/build_vrt", line 8, in <module>
    sys.exit(build_vrt_command())
  File "/home/aaime/miniconda3/envs/bluetopo_env/lib/python3.11/site-packages/nbs/bluetopo/cli.py", line 29, in build_vrt_command
    vrt(root = args.dir, target = args.target)
  File "/home/aaime/miniconda3/envs/bluetopo_env/lib/python3.11/site-packages/nbs/bluetopo/build_vrt.py", line 644, in main
    build_vrt(vrt_list, utm_vrt, [32,64])
  File "/home/aaime/miniconda3/envs/bluetopo_env/lib/python3.11/site-packages/nbs/bluetopo/build_vrt.py", line 220, in build_vrt
    vrt.BuildOverviews("NEAREST", levels)
  File "/home/aaime/miniconda3/envs/bluetopo_env/lib/python3.11/site-packages/osgeo/gdal.py", line 3031, in BuildOverviews
    return _gdal.Dataset_BuildOverviews(self, *args, **kwargs)

So it looks like the culprit is indeed overview creation. However, the other VRTs also have overviews, so I would have expected the "add overviews" process to just build on the overviews of the single-block VRT files. That does not seem to be happening, which is a pity. For reference, a "normal" gdaladdo builds the first overview from the actual pixels, but the second overview is built from the first overview's pixels, and so on, speeding up the process dramatically.
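
The difference between the two strategies can be put into rough numbers. The sketch below is a simplified cost model, not GDAL's actual algorithm: it only counts how many input pixels each overview level would need to read, ignoring block caching and I/O patterns. The function name and the start_factor parameter (the decimation already available in the inputs, e.g. from overviews of the per-tile VRTs) are assumptions for illustration.

```python
def pixels_read(width, height, levels, cascade=True, start_factor=1):
    """Count input pixels read to build the given overview levels.

    cascade=False: every level is computed from the same input raster.
    cascade=True:  each level is computed from the previous, coarser level.
    start_factor:  decimation factor of the input that the first level reads
                   (1 means full-resolution source pixels).
    """
    base = width * height
    available = start_factor  # decimation factor of the raster read next
    total = 0
    for factor in sorted(levels):
        total += base // (available * available)
        if cascade:
            # The level just built becomes the input for the next one.
            available = factor
    return total


# Levels [32, 64] as in the build_vrt call from the stack trace above,
# on a hypothetical 6400 x 6400 raster.
from_pixels = pixels_read(6400, 6400, [32, 64], cascade=False)
cascaded = pixels_read(6400, 6400, [32, 64], cascade=True)
from_existing = pixels_read(6400, 6400, [32, 64], cascade=True, start_factor=16)
```

In this model the cascade roughly halves the work for two levels, while starting from overviews that already exist in the inputs (the hypothetical factor 16 here) cuts it by orders of magnitude, which matches the expectation described above.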

pgeleg commented 11 months ago

Thanks for this. Will look more into it.

There is a lot of performance left on the table as it stands. There is no multithreading at all at the moment, which both fetch_tiles and build_vrt can benefit from.

I did a quick crude implementation of multithreading for fetch_tiles which would be I/O bound. Fetching all 3602 currently available tiles, my personal machine went from 45 minutes to 18 minutes with that implementation.

I think there would be significant performance gains for build_vrt as well. A lot of CPU is left on the table.
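
For reference, a crude thread-pool approach to an I/O-bound fetch could look like the sketch below. It is not the implementation mentioned above: fetch_all and the download callable are hypothetical names, and download stands in for whatever function retrieves a single tile.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_all(urls, download, max_workers=8):
    """Run download(url) for every URL on a pool of worker threads.

    Returns (results, errors): two dicts keyed by URL, so that one failed
    tile does not abort the whole run.
    """
    results = {}
    errors = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # Map each future back to its URL for reporting.
        futures = {pool.submit(download, url): url for url in urls}
        for future in as_completed(futures):
            url = futures[future]
            try:
                results[url] = future.result()
            except Exception as exc:
                errors[url] = exc
    return results, errors
```

Threads suit this case because the workers spend most of their time waiting on the network. The CPU-heavy parts of build_vrt would more likely need separate processes, or GDAL's own threading options.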

aaime commented 11 months ago

Quick note, you might want to have a look at the VRT_VIRTUAL_OVERVIEWS flag, see also the OverviewList element at https://gdal.org/drivers/raster/vrt.html
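
As an illustration of the OverviewList element, the sketch below adds one to a VRT by editing the XML directly. It assumes the element is a direct child of VRTDataset holding space-separated factors, with an optional resampling attribute, as described on the page linked above; check that page for the GDAL version that supports it. The function name is made up, and whether this helps with the UTM VRTs is untested.

```python
import xml.etree.ElementTree as ET


def add_overview_list(vrt_xml: str, factors, resampling="nearest") -> str:
    """Return VRT text declaring virtual overviews at the given factors."""
    root = ET.fromstring(vrt_xml)
    if root.tag != "VRTDataset":
        raise ValueError("not a VRT document")
    # Drop any existing declaration so the element is not duplicated.
    for old in root.findall("OverviewList"):
        root.remove(old)
    node = ET.SubElement(root, "OverviewList", {"resampling": resampling})
    node.text = " ".join(str(f) for f in factors)
    return ET.tostring(root, encoding="unicode")
```

Virtual overviews only make sense when the sources of the VRT already have overviews compatible with the declared factors, so the factors would need to match what the per-tile files provide.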

GlenRice-NOAA commented 11 months ago

Thanks for the thought! We are distracted by other priorities at the moment, but we hope to get back to the VRT generation next month.

aaime commented 10 months ago

Related to this topic, there is the actual usage of the overviews in the VRTs. Using QGIS, I seem to notice inefficient usage of the intermediate VRT overview files.

To test, I'm using QGIS started from the command line, with export CPL_DEBUG=on right before starting it. This makes it output debug information about which files are being opened.

This is the output when opening the UTM17 VRT file and viewing it at full extent. It uses the one overview found in the VRT file (which is good, expected behavior):

GDAL: GDALOpen(/opt/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BlueTopo_Fetched_UTM17.vrt, this=0x7f8ac55a0f50) succeeds as VRT.
GDAL: Computing area of interest: -84.3191, 22.7715, -77.681, 33.5922
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/opt/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BlueTopo_Fetched_UTM17.vrt.ovr, this=0x7f8ac4562ef0) succeeds as GTiff.
GTiff: ScanDirectories()
GTiff: Opened 2407x4690 overview.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDAL_CACHEMAX = 1599 MB

The overview is perhaps a bit big, and it would not hurt to have a few more, but still, it's opening a single file, not too big, and performance is good for a few zoom-ins, for as long as that one overview is sufficient.

Eventually one reaches the point when that one is not enough anymore, and in a single zoom-in action, the output is the following:

GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26426N/BC26426N_complete.vrt, this=0x7f8af94b2110) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26426N/BC26426N_complete.vrt.ovr, this=0x7f8afb704990) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526M/BC26526M_complete.vrt, this=0x7f8afb7227c0) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526M/BC26526M_complete.vrt.ovr, this=0x7f8afb7db5f0) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526N/BC26526N_complete.vrt, this=0x7f8afb90dd90) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526N/BC26526N_complete.vrt.ovr, this=0x7f8afb90e320) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526N/BC26526N_complete.vrt, this=0x7f8b0c023d10) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526N/BC26526N_complete.vrt.ovr, this=0x7f8b0c9fb540) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626N/BC26626N_complete.vrt, this=0x7f8b0ca04b70) succeeds as VRT.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BC26626N_20221122.tiff, this=0x7f8b0e149f80) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 3766x4231 overview.
GTiff: Opened 1883x2115 overview.
GTiff: Opened 941x1057 overview.
GTiff: Opened 470x528 overview.
GTiff: Opened 235x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626N/BC26626N_complete.vrt, this=0x7f8ac8025f50) succeeds as VRT.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BC26626N_20221122.tiff, this=0x7f8ac8024150) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 3766x4231 overview.
GTiff: Opened 1883x2115 overview.
GTiff: Opened 941x1057 overview.
GTiff: Opened 470x528 overview.
GTiff: Opened 235x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626M/BC26626M_complete.vrt, this=0x7f8ac8048e90) succeeds as VRT.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BC26626M_20221122.tiff, this=0x7f8ac8107ac0) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 3803x4228 overview.
GTiff: Opened 1901x2114 overview.
GTiff: Opened 950x1057 overview.
GTiff: Opened 475x528 overview.
GTiff: Opened 237x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526L/BC26526L_complete.vrt, this=0x7f8ae87dadd0) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526L/BC26526L_complete.vrt.ovr, this=0x7f8ae8f832a0) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26426L/BC26426L_complete.vrt, this=0x7f8ae8fa1d10) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26426L/BC26426L_complete.vrt.ovr, this=0x7f8ae9055470) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526L/BC26526L_complete.vrt, this=0x7f8ac1f77c80) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26526L/BC26526L_complete.vrt.ovr, this=0x7f8ac1f7a570) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_complete.vrt, this=0x7f8ac28e6b80) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_complete.vrt.ovr, this=0x7f8ac2849410) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_8m.vrt, this=0x7f8ac284a620) succeeds as VRT.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KC_20221125.tiff, this=0x7f8ac284f410) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1913x2109 overview.
GTiff: Opened 956x1054 overview.
GTiff: Opened 478x527 overview.
GTiff: Opened 239x263 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KD_20221125.tiff, this=0x7f8ac2997840) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1908x2109 overview.
GTiff: Opened 954x1054 overview.
GTiff: Opened 477x527 overview.
GTiff: Opened 238x263 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KB_20221125.tiff, this=0x7f8a182b5f60) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1917x2109 overview.
GTiff: Opened 958x1054 overview.
GTiff: Opened 479x527 overview.
GTiff: Opened 239x263 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_complete.vrt, this=0x7f8af0000cf0) succeeds as VRT.
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_complete.vrt.ovr, this=0x7f8af08024c0) succeeds as GTiff.
GTiff: ScanDirectories()
GDAL: GDALDefaultOverviews::OverviewScan()
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BC26626L_20221122.tiff, this=0x7f8af07fb290) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1968x4196 overview.
GTiff: Opened 984x2098 overview.
GTiff: Opened 492x1049 overview.
GTiff: Opened 246x524 overview.
GTiff: Opened 123x262 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo_VRT/BC26626L/BC26626L_8m.vrt, this=0x7f8af08297f0) succeeds as VRT.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KC_20221125.tiff, this=0x7f8af081e230) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1913x2109 overview.
GTiff: Opened 956x1054 overview.
GTiff: Opened 478x527 overview.
GTiff: Opened 239x263 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HJ2KC_20221125.tiff, this=0x7f8af1477a50) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1918x2113 overview.
GTiff: Opened 959x1056 overview.
GTiff: Opened 479x528 overview.
GTiff: Opened 239x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KD_20221125.tiff, this=0x7f8af089cf20) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1908x2109 overview.
GTiff: Opened 954x1054 overview.
GTiff: Opened 477x527 overview.
GTiff: Opened 238x263 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HJ2KD_20221125.tiff, this=0x7f8af21986b0) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1913x2114 overview.
GTiff: Opened 956x1057 overview.
GTiff: Opened 478x528 overview.
GTiff: Opened 239x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HJ2KB_20221125.tiff, this=0x7f8af21a5bc0) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1922x2113 overview.
GTiff: Opened 961x1056 overview.
GTiff: Opened 480x528 overview.
GTiff: Opened 240x264 overview.
GDAL: GDALOpen(/home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17/BlueTopo_BF2HH2KB_20221125.tiff, this=0x7f8af0edbf40) succeeds as GTiff.
GTiff: GDAL_READDIR_LIMIT_ON_OPEN reached on /home/aaime/devel/gisData/noaa/bluetopo/bluetopo_1027/bluetopo_tiles/BlueTopo/UTM17
GTiff: ScanDirectories()
GTiff: Opened 1917x2109 overview.
GTiff: Opened 958x1054 overview.
GTiff: Opened 479x527 overview.
GTiff: Opened 239x263 overview.

So it's opening the "complete" files (expected), the 8m files for directories that have them, and the original files as well. Ideally, I'd expect only the "complete" overview files to be opened, falling back to lower levels of the VRT pyramid only later.
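
Counting the opens by file type makes logs like the one above easier to compare between runs. The sketch below parses the GDALOpen lines of a CPL_DEBUG log and groups them by file name suffix. The category names and the suffix rules are assumptions based on the file names visible in this log.

```python
import re
from collections import Counter

# Matches lines such as:
# GDAL: GDALOpen(/path/file.vrt, this=0x7f8af94b2110) succeeds as VRT.
OPEN_RE = re.compile(
    r"GDALOpen\((?P<path>[^,]+), this=0x[0-9a-f]+\) succeeds as (?P<driver>\w+)\."
)


def classify(path: str) -> str:
    """Assign an opened file to a rough category based on its suffix."""
    if path.endswith("_complete.vrt.ovr"):
        return "complete overview"
    if path.endswith("_complete.vrt"):
        return "complete VRT"
    if path.endswith(".vrt.ovr"):
        return "other overview"
    if path.endswith(".vrt"):
        return "other VRT"  # e.g. the *_8m.vrt files or the UTM VRT
    if path.endswith((".tiff", ".tif")):
        return "source tile"
    return "other"


def count_opens(log_text: str) -> Counter:
    """Count successful GDALOpen calls per category in a CPL_DEBUG log."""
    counts = Counter()
    for match in OPEN_RE.finditer(log_text):
        counts[classify(match.group("path"))] += 1
    return counts
```

Capturing the debug output of a single zoom-in to a file and passing its contents to count_opens shows how many source tiles are opened relative to "complete" overviews.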

This made me wonder about the structure of the overviews, and checking a few examples, I've come up with this table (follow the link for the google spreadsheet, worth double checking against errors and misinterpretations of mine):

image

Is it just me, or could the overviews be computed in a different way, to ensure a better balance between data duplication and the number of files that need to be opened to satisfy a given request? Thinking out loud, perhaps this, or some variation of it (e.g., having two external overview levels in "complete" could also be useful):

image

I'm also wondering what role the "OverviewList" element, present only in some VRT files, plays in read efficiency.