rafaqz / Rasters.jl

Raster manipulation for the Julia language
MIT License
212 stars 36 forks source link

Read raster into memory automatically for copy and deepcopy? #224

Open vlandau opened 2 years ago

vlandau commented 2 years ago

This code doesn't work

using Rasters
url_base = "https://raw.githubusercontent.com/Circuitscape/datasets/main/"

download(string(url_base, "data/nlcd_2016_frederick_md.tif"),
         "local/nlcd_2016_frederick_md.tif")

landscape = Raster("local/nlcd_2016_frederick_md.tif", missingval = -9999)

the_copy = deepcopy(landscape)
the_copy[1,1,1] = 0x05

Ouput:

julia> the_copy[1,1,1] = 0x05
ERROR: LoadError: StackOverflowError:
Stacktrace:
  [1] Array
    @ ./boot.jl:452 [inlined]
  [2] Array
    @ ./boot.jl:459 [inlined]
  [3] similar
    @ ./array.jl:359 [inlined]
  [4] writeblock!(::DiskArrays.BroadcastDiskArray{Float64, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Nothing, typeof(convert), Tuple{Base.RefValue{Type{Float64}}, Rasters.FileArray{Rasters.GDALfile, UInt8, 3, Nothing, DiskArrays.GridChunks{3}, DiskArrays.Unchunked}}}}, ::Array{Int64, 3}, ::UnitRange{Int64}, ::UnitRange{Int64}, ::UnitRange{Int64})
    @ DiskArrays ~/.julia/packages/DiskArrays/NTqYs/src/DiskArrays.jl:46
  [5] writeblock!(::DiskArrays.BroadcastDiskArray{Float64, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Nothing, typeof(convert), Tuple{Base.RefValue{Type{Float64}}, Rasters.FileArray{Rasters.GDALfile, UInt8, 3, Nothing, DiskArrays.GridChunks{3}, DiskArrays.Unchunked}}}}, ::Array{Int64, 3}, ::UnitRange{Int64}, ::UnitRange{Int64}, ::UnitRange{Int64}) (repeats 8097 times)
    @ DiskArrays ~/.julia/packages/DiskArrays/NTqYs/src/DiskArrays.jl:48
  [6] setindex_disk!(::DiskArrays.BroadcastDiskArray{Float64, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Nothing, typeof(convert), Tuple{Base.RefValue{Type{Float64}}, Rasters.FileArray{Rasters.GDALfile, UInt8, 3, Nothing, DiskArrays.GridChunks{3}, DiskArrays.Unchunked}}}}, ::Array{Int64, 3}, ::Int64, ::Vararg{Int64, N} where N)
    @ DiskArrays ~/.julia/packages/DiskArrays/NTqYs/src/DiskArrays.jl:70
  [7] setindex!
    @ ~/.julia/packages/DiskArrays/NTqYs/src/DiskArrays.jl:202 [inlined]
  [8] setindex!(::DiskArrays.BroadcastDiskArray{Float64, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Nothing, typeof(convert), Tuple{Base.RefValue{Type{Float64}}, Rasters.FileArray{Rasters.GDALfile, UInt8, 3, Nothing, DiskArrays.GridChunks{3}, DiskArrays.Unchunked}}}}, ::Int64, ::Int64, ::Int64, ::Int64)
    @ DiskArrays ~/.julia/packages/DiskArrays/NTqYs/src/DiskArrays.jl:206
  [9] setindex!(::Raster{Float64, 3, Tuple{X{Projected{Float64, LinRange{Float64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Float64}, DimensionalData.Dimensions.LookupArrays.Intervals{DimensionalData.Dimensions.LookupArrays.Start}, DimensionalData.Dimensions.LookupArrays.Metadata{Rasters.GDALfile, Dict{Any, Any}}, WellKnownText{GeoFormatTypes.CRS, String}, Nothing, X{Colon}}}, Y{Projected{Float64, LinRange{Float64}, DimensionalData.Dimensions.LookupArrays.ReverseOrdered, DimensionalData.Dimensions.LookupArrays.Regular{Float64}, DimensionalData.Dimensions.LookupArrays.Intervals{DimensionalData.Dimensions.LookupArrays.Start}, DimensionalData.Dimensions.LookupArrays.Metadata{Rasters.GDALfile, Dict{Any, Any}}, WellKnownText{GeoFormatTypes.CRS, String}, Nothing, Y{Colon}}}, Band{DimensionalData.Dimensions.LookupArrays.Categorical{Int64, UnitRange{Int64}, DimensionalData.Dimensions.LookupArrays.ForwardOrdered, DimensionalData.Dimensions.LookupArrays.NoMetadata}}}, Tuple{}, DiskArrays.BroadcastDiskArray{Float64, 3, Base.Broadcast.Broadcasted{DiskArrays.ChunkStyle{3}, Nothing, typeof(convert), Tuple{Base.RefValue{Type{Float64}}, Rasters.FileArray{Rasters.GDALfile, UInt8, 3, Nothing, DiskArrays.GridChunks{3}, DiskArrays.Unchunked}}}}, Symbol, DimensionalData.Dimensions.LookupArrays.Metadata{Rasters.GDALfile, Dict{Symbol, Any}}, Int64}, ::Int64, ::Int64, ::Int64, ::Int64)
    @ Rasters ~/.julia/packages/Rasters/W22nT/src/array.jl:250

It looks like the problem is due to the Raster being entirely disk-backed as of the call of copy.

The code below does work

using Rasters

url_base = "https://raw.githubusercontent.com/Circuitscape/datasets/main/"

download(string(url_base, "data/nlcd_2016_frederick_md.tif"),
         "local/nlcd_2016_frederick_md.tif")

landscape = Raster("local/nlcd_2016_frederick_md.tif")[:,:,:] # <- Force raster to be read into memory

the_copy = deepcopy(landscape)
the_copy[1,1,1] = 0x05

To be clear, I recognize at this time this may be expected behavior, but it may be user-friendly to add new methods to copy and deepcopy that first read the raster into memory (if it isn't already), and then return a fully memory-backed copy, so the user can skip that step?

Of course, this is a major edge case, and therefore low priority. I can use the workaround above in the mean time!

Thanks for all your work to provide GIS functionality in Julia @rafaqz!

EDIT: Simplified code for clarity

rafaqz commented 2 years ago

Thanks, that makes sense.

So it's making a perfectly fine copy of the object, but the copy actually points at the same file. And that's kind of weird for how people think about cpoy/deepcopy. I guess copy and deepcopy should be the same here?

So maybe copy/deepcopy:

(btw you can use read to move to memory as you do with indexing)

We will have to go through and manually copy/deepcopy the dims/metadata etc fields if we override copy/deepcopy, and rebuild the object.

vlandau commented 2 years ago

Ah yes, looks like copy works just fine. I'm still slightly unsure about when to use copy vs deepcopy in general...

I think your proposal above makes sense.

rafaqz commented 2 years ago

I'm wondering if this needs a warning if the file is disk-based?

"Warning: this raster is disk based. If you mean to copy the file, use cp(src, dst). For this operation, raster read to memory with read.

Also, we could actually define cp:

cp(src::Raster, dst::AbstractString; kw...)` = `Raster(copy(filename(src), dst; kw...)`

And have a similar warning if it isn't disk-based.

vlandau commented 2 years ago

I think that makes sense -- could do a warning or even an error