rafaqz / Rasters.jl

Raster manipulation for the Julia language
MIT License
214 stars 36 forks source link

[low priority] CategoricalArrays support #761

Open asinghvi17 opened 1 month ago

asinghvi17 commented 1 month ago

Right now, some in memory operations work with a Raster that wraps a CategoricalArray.

Here's an mwe:

using Rasters, CategoricalArrays

grain_order = ["clay", "silt", "sand"]
grain_char = rand(grain_order, 6, 6)
grain_fact = CategoricalArray(grain_char, levels = grain_order)

# wrap the categorical array in a Raster object
grain = Raster(grain_fact, (X(LinRange(-1.5, 1.5, 6)), Y(LinRange(-1.5, 1.5, 6))))

This works fine, I'll update the issue to check if things like zonal work as intended.

Writing does not work:

using ArchGDAL
write(grain, "grain.tif")

ERROR: MethodError: Cannot `convert` an object of type Type{CategoricalValue{String, UInt32}} to an object of type ArchGDAL.GDALDataType

Closest candidates are:
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt32})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt16})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  convert(::Type{ArchGDAL.GDALDataType}, ::Type{UInt8})
   @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/utils.jl:142
  ...

Stacktrace:
  [1] unsafe_create(filename::String; driver::ArchGDAL.Driver, width::Int64, height::Int64, nbands::Int64, dtype::DataType, options::Vector{…})
    @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/dataset.jl:407
  [2] unsafe_create
    @ ~/.julia/packages/ArchGDAL/xZUPv/src/dataset.jl:398 [inlined]
  [3] create(f::RastersArchGDALExt.var"#37#40"{…}, args::String; kwargs::@Kwargs{…})
    @ ArchGDAL ~/.julia/packages/ArchGDAL/xZUPv/src/context.jl:266
  [4] _create_with_driver(f::RastersArchGDALExt.var"#16#18"{…}, filename::String, dims::Tuple{…}, T::Type, missingval::Nothing; options::Dict{…}, driver::String, _block_template::Raster{…}, chunks::Rasters.NoKW, kw::@Kwargs{})
    @ RastersArchGDALExt ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:375
  [5] _create_with_driver
    @ ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:351 [inlined]
  [6] write(filename::String, ::Rasters.GDALsource, A::Raster{…}; force::Bool, verbose::Bool, missingval::Rasters.NoKW, kw::@Kwargs{})
    @ RastersArchGDALExt ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:59
  [7] write
    @ ~/.julia/dev/Rasters/ext/RastersArchGDALExt/gdal_source.jl:50 [inlined]
  [8] write(filename::String, A::Raster{…}; source::Rasters.GDALsource, kw::@Kwargs{})
    @ Rasters ~/.julia/dev/Rasters/src/write.jl:78
  [9] write(filename::String, A::Raster{…})
    @ Rasters ~/.julia/dev/Rasters/src/write.jl:74
 [10] top-level scope
    @ REPL[52]:1
Some type information was truncated. Use `show(err)` to see complete types.

so maybe there need to be some more specific dispatches there.

asinghvi17 commented 1 month ago

Also, holy shit, polygonize just works :O

Screenshot 2024-09-23 at 7 03 09 PM
asinghvi17 commented 1 month ago

zonal works but polygonize does not roundtrip through it, not sure what's going on there.

julia> using StatsBase; Rasters.zonal(countmap, grain; of = GO.polygonize(grain))
3-element Vector{Union{Missing, Dict{CategoricalValue{String, UInt32}, Int64}}}:
 Dict{CategoricalValue{String, UInt32}, Int64}("sand" => 1, "clay" => 1)
 Dict{CategoricalValue{String, UInt32}, Int64}("sand" => 1)
 Dict{CategoricalValue{String, UInt32}, Int64}()
asinghvi17 commented 1 month ago

Aaaah I didn't realize DimensionalArrays has a CategoricalArrays extension! This makes much more sense now :D

rafaqz commented 1 month ago

Yeah maybe @tiemvanderdeure added that

rafaqz commented 1 month ago

What do you want write to do. Write strings? Or an Integer?

Ok seems a UInt8 or UInt16 if there are more than 256 are what to do.

The dbf table trick is kinda weird but we could write one in a RastersCategoricalArraysExt extension, and have a categorical keyword to force reading it?

Maybe netcdf can handle categories without that

tiemvanderdeure commented 1 month ago

Yes I added a small extensions that defines the basic operation of CategoricalArrays on a DimArray, which is why you can call categorical and a Raster and it will behave. I never got round to making a similar extension for Rasters though.

tiemvanderdeure commented 1 month ago

Another one to add to the list is to make missingval make sense for categorical rasters - we probably need some dispatch of _convert_missingval.