rafaqz / Rasters.jl

Raster manipulation for the Julia language
MIT License
213 stars 36 forks source link

Ways to improve spherical `cellarea` #779

Open asinghvi17 opened 3 weeks ago

asinghvi17 commented 3 weeks ago

768 already does a lot! But there are still a few low hanging fruit.

For reference,

# setup
using Rasters, Proj, Chairmarks #ArchGDAL, Chairmarks
using Rasters.Lookups
dX = 0.0006
dY = -0.0003
lon = X(Projected(166.0:dX:167.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dX), crs=EPSG(4326)))
lat = Y(Projected(-78.0:dY:-79.0; sampling=Intervals(Start()), order=ForwardOrdered(), span=Regular(dY), crs=EPSG(4326)))
ras = Raster(rand(lon, lat))
ras2 = Rasters.reproject(ras; crs = Rasters.EPSG(3857))

# benchmarks
julia> @be Rasters.cellarea(Planar(), ras2) seconds=3
Benchmark: 634 samples with 1 evaluation
 min    3.289 ms (18 allocs: 42.518 MiB)
 median 3.646 ms (18 allocs: 42.518 MiB)
 mean   4.695 ms (18 allocs: 42.518 MiB, 7.53% gc time)
 max    185.097 ms (18 allocs: 42.518 MiB, 96.16% gc time)

julia> @be Rasters.cellarea(Spherical(), ras2) seconds=30
Benchmark: 3 samples with 1 evaluation
        12.187 s (161175590 allocs: 6.004 GiB, 5.18% gc time)
        12.258 s (161175590 allocs: 6.004 GiB, 5.32% gc time)
        12.640 s (161175590 allocs: 6.004 GiB, 7.09% gc time)
rafaqz commented 3 weeks ago

With tiled batches we can just keep filling the same small ector of points that we pass to Proj to tranform. We can swap over resample at the same time. Its just historical that GDAL was already available as a hard dep so I used it. But its not anymore, and we should switch to Proj.

(also MWE or at least the size of ras2 would help your benchmark)

asinghvi17 commented 3 weeks ago

Added MWE code. General tiled iteration would also be nice to have, maybe from DD - this way you can iterate over the most optimal pattern for access as well, which could potentially play well with Dagger. I'm sure DD already does this but it would be nice to have it more exposed.

rafaqz commented 3 weeks ago

Actually we just let DiskArrays handle that currently

tiemvanderdeure commented 3 weeks ago

The reason I wrote it so that every point is transformed 4 times is that cell boundaries don't necessarily match. We can write a more optimized version but we need to keep something as stupid as the current implementation around for those cases.

rafaqz commented 3 weeks ago

Ah right we need a version for Irregular and Regular

tiemvanderdeure commented 3 weeks ago

Ah right we need a version for Irregular and Regular

Basically, though technically it could be possible to have overlap or gaps even if a dimension is Regular, right?

rafaqz commented 3 weeks ago

No regular is by definition gapless, there is no way to represent gaps. Actually only Explicit lookups have gaps and not Irregular, Irregular cant represent gaps either

tiemvanderdeure commented 3 weeks ago

Switching ArchGDAL out with Proj and using tuples instead of vectors already is a bit of an improvement:

julia> @be Rasters.cellarea(Spherical(), ras2) seconds=30
Benchmark: 4 samples with 1 evaluation
        8.519 s (49 allocs: 42.481 MiB)
        8.876 s (61 allocs: 42.481 MiB, 0.26% gc time)
        9.016 s (49 allocs: 42.481 MiB)
        9.396 s (56 allocs: 42.481 MiB, 0.20% gc time)

Profview shows that roughly 50% of the time is spent on the transformation and 50% on converting lonlat to cartesian coordinates.

tiemvanderdeure commented 3 weeks ago

I also found out that proj can convert directly to cartesian coordinates, though it seems to be much slower than converting to lat-lon and then to cartesian in Julia. https://proj.org/en/9.3/operations/conversions/cart.html

asinghvi17 commented 3 weeks ago

Proj will likely be less accurate than Julia for that transformation, FYI.

tiemvanderdeure commented 3 weeks ago

It was also slower, so the way it was before was probably fine.

I think switching from GDAL to Proj is easy enough. It's silly to transform every point 4 times but can't think of an elegant end generic solution to do the tiled iteration right now. I guess both of you have implemented that kind of thing a hundred times?

tiemvanderdeure commented 3 weeks ago

Also, this is not the kind of function that would end up inside an inner loop. I'm not sure how necessary it is to spend a lot of effort on making it fast.

rafaqz commented 3 weeks ago

Its probably already the fastest that exists by an order of magnitude, so I'm happy to call it off wherever. But generally switching from GDAL to Proj for these things is better for a bunch of reasons, not just speed.

asinghvi17 commented 3 weeks ago

If we're breaking, then we may as well break everything, and switch both reproject and cellarea to Proj instead of ArchGDAL? Since none of them need gdalwarp. Maybe also mapped crs things.

asinghvi17 commented 3 weeks ago

A generic tiled iterator would be a bit tough. I think you'd indicate an approximate size, and the iterator could partition into tiles that fit the chunk pattern most optimally. But that's a bigger change...I think even switching to Proj will give us enough speedup for my purposes.

asinghvi17 commented 3 weeks ago

~Ah crap, converting cellarea to Proj means we have to implement some form of GFT conversion in Proj as well...~

Edit: Proj has both GFT conversions and an inbuilt geographic check. So we could just switch.

asinghvi17 commented 3 weeks ago

https://github.com/meggart/DiskArrayEngine.jl/tree/main kind of implements tiled iteration, but it's also huge. I don't think it offers multithreading by default either, and it pulls in a crazy number of dependencies.

We could talk to Fabian about maybe splitting parts of it out into a DiskArrayEngineBase.jl package? But it's a bit complicated. Probably easier to write a lightweight wrapper on our own and maybe take some concepts / struct from DiskArrayEngine...

lazarusA commented 3 weeks ago

I believe is threaded and distributed by default, see the runner.

asinghvi17 commented 2 weeks ago

I just realized that Proj allows you to return data in radians (which is what it uses internally). We could potentially switch to that and avoid cosd and sind. Might speed up projection speed by a hair as well.

The switch is +units=r in a Proj-string, but if it's safer we can always use a const RADIAN_WGS84 = GFT.ESRIWellKnownText(...).

rafaqz commented 2 weeks ago

Good idea, minimise conversions where possible