yeesian / ArchGDAL.jl

A high level API for GDAL - Geospatial Data Abstraction Library
https://yeesian.github.io/ArchGDAL.jl/stable/
Other
137 stars 25 forks source link

filter of an overview gives wrong answer #409

Open felixcremer opened 4 months ago

felixcremer commented 4 months ago

When I use the filter function on an overview of a band I still get some of the filtered values. These are not there , when I am first collecting the array and then do the filtering operation. This only happens with an overview that I get from AG.getoverview and not from a band directly. An example file is attached. I find it strange, that there seems to be the same amount of values which for me might mean, that the filtering and the accessing is using different code paths. I also see strange behaviour when plotting the data. The overview plotted directly seems to be randomly shifted.

julia> rastop = AG.readraster("pyramidmiddle.tif")
GDAL Dataset (Driver: GTiff/GeoTIFF)
File(s): 
  pyramidmiddle.tif

Dataset (width x height): 938 x 938 (pixels)
Number of raster bands: 1
  [GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)

julia> b =AG.getband(rastop, 1)
[GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)
    blocksize: 938×128, nodata: -32768.0, units: 1.0px + 0.0
    overviews: (0) 469x469 (1) 235x235 

julia> o = AG.getoverview(b,1)
[GA_ReadOnly] Band 1 (Gray): 235 x 235 (Int16)
    blocksize: 128×128, nodata: -32768.0, units: 1.0px + 0.0
    overviews: 

julia> filter(!(==(-9999)), o)
2824-element Vector{Int16}:
  -136
  -159
  -175
  -180
  -178
  -137
  -146
     ⋮
 -9999
 -9999
 -9999
 -9999
 -9999
 -9999

julia> oc = collect(o);

julia> filter(!(==(-9999)), oc)
2824-element Vector{Int16}:
 -136
 -159
 -175
 -180
 -178
 -137
 -146
    ⋮
 -167
 -161
 -156
 -160
 -151
 -157

pyramidmiddle.zip

rafaqz commented 4 months ago

Looks like a DiskArrays.jl problem to me (otherwise you couldn't use filter at all)

The problem is actually that map doesn't reorder itself so the bitmatrix is wrong:

julia> o[map(!(==(-9999)), o)] # This is what `filter` does underneath
2824-element Vector{Int16}:
  -136
  -159
  -175
  -180
  -178
  -137
  -146
     ⋮
 -9999
 -9999
 -9999
 -9999
 -9999
 -9999
 -9999

Where broadcast does work:

julia> o[broadcast(!(==(-9999)), o)]
2824-element Vector{Int16}:
 -136
 -159
 -175
 -180
 -178
 -137
 -146
    ⋮
 -164
 -167
 -161
 -156
 -160
 -151
 -157
rafaqz commented 4 months ago

So this is a DiskArrays.jl issue

rafaqz commented 4 months ago

Haha this is my fault we are missing collect_similar in the DiskGenerator implementation.

felixcremer commented 4 months ago

Very good. I am still confused, why this only happens on the overviews and not on the band. Because I just looked at the DiskGenerator that is constructed for both and they look very similar.

julia> go # Generator of the overview
DiskArrays.DiskGenerator{ArchGDAL.IRasterBand{Int16}, ComposedFunction{typeof(!), Base.Fix2{typeof(==), Int64}}}(!Base.Fix2{typeof(==), Int64}(==, -9999), [GA_ReadOnly] Band 1 (Gray): 235 x 235 (Int16)
    blocksize: 128×128, nodata: -32768.0, units: 1.0px + 0.0
    overviews: )

julia> gb # Generator of the band
DiskArrays.DiskGenerator{ArchGDAL.IRasterBand{Int16}, ComposedFunction{typeof(!), Base.Fix2{typeof(==), Int64}}}(!Base.Fix2{typeof(==), Int64}(==, -9999), [GA_ReadOnly] Band 1 (Gray): 938 x 938 (Int16)
    blocksize: 938×128, nodata: -32768.0, units: 1.0px + 0.0
    overviews: (0) 469x469 (1) 235x235 )
rafaqz commented 4 months ago

On the band the block looks like the whole axis (938?), so block iterate is the same as normal iterate

rafaqz commented 4 months ago

I think map should always collect first so we don't have this problem, and we should add filter for diskarrays that doesn't use map and is just not properly ordered, because we rarely care about order with filter?

felixcremer commented 4 months ago

Always collecting on map for a DiskArray sounds like very dangerous. I opened https://github.com/meggart/DiskArrays.jl/issues/144 so that we can discuss this further there.