alexamici closed this issue 3 years ago.
One issue with VRT, and GDAL in general, is that the GeoTransform doesn't appear to accept the UTC time-like coordinates that we use for azimuth_time (or maybe I didn't look for it long enough).
Also, the names of the dimensions need to be encoded outside of the VRT.
Both issues can be worked around in a post-processing step:
```python
def open_burst_dataset(...):
    burst_ds = rioxarray.open_rasterio(vrt_path)
    # assign_coords / rename_dims / swap_dims return new objects,
    # so the results must be re-assigned
    burst_ds = burst_ds.assign_coords(azimuth_time=("y", azimuth_time))
    burst_ds = burst_ds.rename_dims(x="slant_range_time")
    burst_ds = burst_ds.swap_dims(y="azimuth_time")
    return burst_ds
```
With GDAL, creating an in-memory VRT for a burst should be as simple as:
```python
# note: srcWin is (xoff, yoff, xsize, ysize), not corner coordinates
gdal.Translate(
    "/vsimem/unique_name.vrt",
    input_tif,
    format="VRT",
    srcWin=[sample_min, line_min, sample_max - sample_min, line_max - line_min],
)
```
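Since `srcWin` takes offsets and sizes (`[xoff, yoff, xsize, ysize]`) rather than min/max corners, a small helper (hypothetical name) can do the conversion from a min/max burst window; a minimal sketch:

```python
def burst_src_win(sample_min, line_min, sample_max, line_max):
    """Convert a (min, max) burst window into GDAL's srcWin form.

    gdal.Translate's srcWin expects [xoff, yoff, xsize, ysize],
    so the max corners must be turned into sizes.
    """
    return [sample_min, line_min, sample_max - sample_min, line_max - line_min]
```

For example, a burst spanning lines 1500-3000 over 25000 samples gives `burst_src_win(0, 1500, 25000, 3000) == [0, 1500, 25000, 1500]`.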
In this way, even if you close the dataset, you will still have the virtual file in memory, but I do not know how well GDAL and rasterio play together.
After reviewing the details with @aurghs, I think the limitations of the VRT approach identified above are a killer.
Much simpler to just:
```python
def open_burst_dataset(...):
    # compute burst properties
    swath_ds = rioxarray.open_rasterio(swath_path)
    burst_ds = swath_ds.isel(
        y=slice(linesPerBurst * burst_count, linesPerBurst * (burst_count + 1))
    )
    # attach_coords
    # swap_dims
    return burst_ds
```
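The slice arithmetic in the `isel(y=slice(...))` call is easy to verify in isolation; a minimal sketch with plain Python lists (the value of `lines_per_burst` is a made-up example, not taken from a real product):

```python
lines_per_burst = 1500  # hypothetical value read from the annotations

def burst_line_slice(burst_count):
    # burst N covers lines [N * lines_per_burst, (N + 1) * lines_per_burst)
    return slice(lines_per_burst * burst_count, lines_per_burst * (burst_count + 1))

# a toy "swath" holding the line index of each row, 9 bursts long
swath_lines = list(range(9 * lines_per_burst))

burst_2 = swath_lines[burst_line_slice(2)]
# burst 2 starts at line 3000 and contains exactly lines_per_burst rows
```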
This is lazy and plays well with dask.
Low-level access to the TIFF data is as efficient as with GDAL; that said, performance may depend on the TIFF file layout (compression, tiling, etc.).
The lowest level of exposing the data is to create burst datasets with the data contained in the measurement TIFF files (real and imaginary parts scaled to `int16` as a "distribution" digital number, not the original DN). The steps needed are:

1. compute the burst window `(line_min, sample_min, line_max, sample_max)`
2. …
3. build the `xr.Dataset`
Step 3 can be done quickly by creating a VRT on disk and opening it with rioxarray.
Longer term we want to support products on read-only file systems or inside zip files, but there is no way (that I found) to pass the VRT XML to GDAL without saving it to a file first.
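For reference, the VRT written to disk is just a small XML file that maps a source window of the measurement TIFF onto the burst raster; a minimal sketch, where the file name, band type, sizes, and offsets are made-up examples:

```xml
<VRTDataset rasterXSize="25000" rasterYSize="1500">
  <VRTRasterBand dataType="CInt16" band="1">
    <SimpleSource>
      <SourceFilename relativeToVRT="1">measurement.tiff</SourceFilename>
      <SourceBand>1</SourceBand>
      <!-- the burst window inside the swath TIFF -->
      <SrcRect xOff="0" yOff="1500" xSize="25000" ySize="1500"/>
      <!-- where it lands in the burst raster -->
      <DstRect xOff="0" yOff="0" xSize="25000" ySize="1500"/>
    </SimpleSource>
  </VRTRasterBand>
</VRTDataset>
```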