Open stuckyb opened 2 years ago
As a quick investigation of an alternative merge strategy, I implemented first merging sub-groups of 4 tiles and then merging those subgroups. The 4-tile merges completed in less than a second, but the final merge still took prohibitively long.
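To see how the time splits between the sub-group merges and the final merge, the two stages can be timed separately. Below is a minimal sketch, not the code in tileset.py: `merge_in_groups` and its parameters are hypothetical names, and `merge_fn` is a stand-in for whatever merge call is used (for example `merge.merge_arrays`).

```python
import time
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")


def merge_in_groups(
    items: Sequence[T],
    merge_fn: Callable[[List[T]], T],
    group_size: int = 4,
) -> Tuple[T, float, float]:
    """Merge items in groups of group_size, then merge the group results.

    Returns (result, seconds spent on group merges, seconds spent on the
    final merge). merge_fn is an assumption: any callable that takes a
    list of items and returns one merged item.
    """
    if len(items) == 0:
        raise ValueError("Nothing to merge.")

    # Stage 1: merge each consecutive group of up to group_size items.
    t0 = time.perf_counter()
    partial = [
        merge_fn(list(items[i:i + group_size]))
        for i in range(0, len(items), group_size)
    ]
    t1 = time.perf_counter()

    # Stage 2: merge the intermediate results into one.
    result = merge_fn(partial)
    t2 = time.perf_counter()

    return result, t1 - t0, t2 - t1


if __name__ == "__main__":
    # Stand-in data: lists are "merged" by concatenation.
    fake_tiles = [[i] for i in range(9)]
    merged, group_s, final_s = merge_in_groups(
        fake_tiles, lambda parts: sum(parts, [])
    )
    print(merged, f"groups: {group_s:.6f} s", f"final: {final_s:.6f} s")
```

With real tiles, `merge_fn` would be the actual merge function, and comparing the two timings shows which stage dominates.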
Tested on a local PC (Windows 10, Python 3.9.15). Requests needing more than 14 tiles worked fine.
Used a small set of SRTM tile files (432 files) in the local_data/srtm folder. (There are 71385 files in the local_data/srtm folder on Ceres.)
The piece of code used to measure time, in tileset.py:
start_time = time.time()
fpaths = self.getTilePaths(subset_geom)
tiles = []
for fpath in fpaths:
    tiles.append(open_rasterio(fpath, masked=True))
if len(tiles) > 0 and not(isinstance(tiles[0], xarray.DataArray)):
    raise TypeError(
        f'Expected xarray.DataArray; instead got {type(tiles[0])}.'
    )
inter_merged = []
i = 0
while i < len(tiles):
    j = i + 4
    if j > len(tiles):
        j = len(tiles)
    inter_merged.append(merge.merge_arrays(tiles[i:j]))
    i += 4
mosaic = merge.merge_arrays(inter_merged)
t = time.time() - start_time
n = len(tiles)
Results of several clips of the region around New Mexico (clip corner coordinates, elapsed time t, number of tiles n):
(-105,33),(-105,33)  t = 1.1 seconds, n = 4
(-106,33),(-105,35)  t = 4.3 seconds, n = 12
(-107,33),(-105,35)  t = 5.3 seconds, n = 16
(-108,33),(-105,35)  t = 8.7 seconds, n = 20
(-109,33),(-105,37)  t = 15.8 seconds, n = 36
(-108,31),(-103,36)  t = 25.5 seconds, n = 49
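One quick way to read these numbers is to divide each elapsed time by its tile count. The sketch below does only that arithmetic on the figures reported above; `seconds_per_tile` is a hypothetical helper name.

```python
from typing import List, Tuple

# (n tiles, t seconds) pairs copied from the results above.
RESULTS: List[Tuple[int, float]] = [
    (4, 1.1),
    (12, 4.3),
    (16, 5.3),
    (20, 8.7),
    (36, 15.8),
    (49, 25.5),
]


def seconds_per_tile(results: List[Tuple[int, float]]) -> List[Tuple[int, float]]:
    """Return (n, t / n) for each (n, t) measurement."""
    return [(n, t / n) for n, t in results]


if __name__ == "__main__":
    for n, spt in seconds_per_tile(RESULTS):
        print(f"n = {n:2d}: {spt:.2f} s/tile")
```

The per-tile time rises from about 0.3 s at n = 4 to about 0.5 s at n = 49, so on this small sample the total time appears to grow somewhat faster than linearly with the number of tiles.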
Installed a Conda Python 3.9 environment on Windows WSL (Ubuntu 20.04 LTS) and ran the same test; the results were similar to those above.
Installed a Conda Python 3.9 environment on ArchLinux 2023.01.01 on a very old desktop (4 GB of memory) and ran the same test (4, 12, 16, 20, 36 tiles); the results were similar to those above. The 49-tile request got stuck.
Tried to build a Singularity image on Windows WSL Ubuntu with each of the following commands:
sudo singularity build geocdl.sif geocdl.def
singularity build --fakeroot geocdl.sif geocdl.def
Both attempts failed with the same error:
INFO: Starting build...
Getting image source signatures
Copying blob 677076032cca skipped: already exists
Copying config 58db3edaf2 done
Writing manifest to image destination
Storing signatures
FATAL: While performing build: conveyor failed to get: unsupported image-specific operation on artifact with type "application/vnd.docker.container.image.v1+json"
When testing with SRTM DEM tiles, requests that require ~14 or fewer tiles return quickly, but requests that require many more than that just churn at the mosaic-building stage. The process does not appear to be either CPU- or memory-bound, so it's not clear to me what is going on. Regardless, there is clearly optimization work needed.
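One possible way to narrow down where the time goes is to run the mosaic-building call under the standard-library profiler. This is only a sketch under assumptions: `profile_call` is a hypothetical helper, and the function passed to it would be whatever call builds the mosaic (for example the merge of the opened tiles). cProfile's default timer is a wall-clock timer, so time spent waiting should show up in the report as well as time spent computing.

```python
import cProfile
import io
import pstats
from typing import Any, Callable, Tuple


def profile_call(
    fn: Callable[..., Any], *args: Any, top: int = 10, **kwargs: Any
) -> Tuple[Any, str]:
    """Run fn(*args, **kwargs) under cProfile.

    Returns (fn's return value, a text report of the `top` entries
    sorted by cumulative time).
    """
    profiler = cProfile.Profile()
    result = profiler.runcall(fn, *args, **kwargs)

    # Write the statistics to a string instead of stdout.
    buf = io.StringIO()
    stats = pstats.Stats(profiler, stream=buf)
    stats.sort_stats("cumulative").print_stats(top)
    return result, buf.getvalue()


if __name__ == "__main__":
    # Stand-in workload; replace with the real mosaic-building call.
    def busy_sum(n: int) -> int:
        return sum(range(n))

    value, report = profile_call(busy_sum, 1_000_000)
    print(value)
    print(report)
```

The entries with the largest cumulative time in the report are a reasonable place to start looking for the bottleneck.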