AD4GD / pilot-2

Enlarged extent of enriched output raster dataset #5

Open vwvbrand opened 3 days ago

vwvbrand commented 3 days ago

OSM features covered by the bounding box of the input raster dataset can extend beyond the area with meaningful values, into the NoData area of the original raster dataset. As a result, the temporary rasterised files with OSM features have a larger extent, and the final enriched LULC dataset inherits it ('hairy tiles'). In this part of the bounding box only OSM features are written to the enriched LULC dataset; all other pixels are marked as NoData.

Encountered with the UKCEH dataset for the bounding box: 347225.0000,452300.0000,343800.0000,540325.0000 [EPSG:27700]
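
To check whether a given tile is affected, one option is to count the pixels that are NoData in the input but carry a value in the enriched output. The following is only a minimal sketch on NumPy arrays: the function name `count_leaked_pixels`, the toy arrays and the class code 9 are made up for illustration, and it assumes both rasters share the same grid and dimensions.

```python
import numpy as np


def count_leaked_pixels(in_data, out_data, in_nodata, out_nodata):
    """Count pixels that are NoData in the input but hold a value in the output."""
    in_data = np.asarray(in_data)
    out_data = np.asarray(out_data)
    if in_data.shape != out_data.shape:
        raise ValueError("input and output arrays must have the same shape")
    # True where the input has no data but the output was filled (e.g. by OSM features)
    leaked = (in_data == in_nodata) & (out_data != out_nodata)
    return int(leaked.sum())


# toy 3x3 example: 0 is NoData, 1 and 2 are LULC classes,
# 9 is a hypothetical class code written from OSM features
in_data = np.array([[1, 1, 0],
                    [2, 1, 0],
                    [0, 0, 0]])
out_data = np.array([[1, 9, 9],
                     [2, 1, 0],
                     [0, 9, 0]])

leaked_count = count_leaked_pixels(in_data, out_data, 0, 0)
print(leaked_count)
```

A non-zero count means that values have been written into the NoData area of the original raster.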

[Image: Capture]

vwvbrand commented 3 days ago

It can be fixed by:

  1. building a mask of all pixels that hold the NoData value in the input dataset
  2. setting all pixels of the output raster inside that mask to the NoData value

The code should look similar to this (needs to be aligned with current Notebook):

from osgeo import gdal

# open input LULC dataset
input_path = 'input_path'
in_ds = gdal.Open(input_path, gdal.GA_ReadOnly)
in_ds_band = in_ds.GetRasterBand(1)
in_data = in_ds_band.ReadAsArray()
# open output LULC dataset
output_path = 'output_path'
out_ds = gdal.Open(output_path, gdal.GA_Update)  # allow modifications
out_ds_band = out_ds.GetRasterBand(1)
out_data = out_ds_band.ReadAsArray()

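# get NoData value from input dataset (GetNoDataValue() returns None if none is set)
nodata_value = in_ds_band.GetNoDataValue()
if nodata_value is None:
    raise ValueError('input raster has no NoData value defined')
# mask pixels that are NoData in the input and apply the mask to the output
# (assumes both rasters share the same grid and dimensions)
mask = (in_data == nodata_value)
out_data[mask] = nodata_value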
# write modified data back to output raster
out_ds_band.WriteArray(out_data)
# set the same nodata value for output raster
out_ds_band.SetNoDataValue(nodata_value)

# flush cache to write data to disk
out_ds_band.FlushCache()
out_ds.FlushCache()
# close datasets
in_ds = None
out_ds = None
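
The masking step on its own could also be factored into a small helper that works on the arrays, which makes it easier to test outside the Notebook. This is a sketch under assumptions, not the Notebook's code: the name `apply_input_nodata` is hypothetical, both arrays are assumed to have identical dimensions, and the NaN branch is added only because `==` never matches NaN.

```python
import numpy as np


def apply_input_nodata(in_data, out_data, nodata_value):
    """Return a copy of out_data with NoData wherever in_data is NoData."""
    in_data = np.asarray(in_data)
    out_data = np.asarray(out_data)
    if nodata_value is None:
        raise ValueError("no NoData value given")
    if in_data.shape != out_data.shape:
        raise ValueError("input and output arrays must have the same shape")
    if np.isnan(nodata_value):
        # NaN never compares equal to itself, so it needs a dedicated test
        mask = np.isnan(in_data)
    else:
        mask = in_data == nodata_value
    result = out_data.copy()
    # assignment keeps the dtype of the output array
    result[mask] = nodata_value
    return result


# integer raster, NoData = 0 (passed as float, as a NoData getter might return it)
fixed_int = apply_input_nodata([[1, 0], [0, 2]], [[1, 9], [9, 2]], 0.0)
print(fixed_int)

# float raster, NoData = NaN
fixed_nan = apply_input_nodata([[1.0, np.nan]], [[1.0, 9.0]], float("nan"))
print(fixed_nan)
```

With the snippet above, the arrays would come from `ReadAsArray()` and the result would be passed to `out_ds_band.WriteArray()`.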