Closed asinghvi17 closed 11 months ago
As it is now, here's the performance of geod_path
on my machine:
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 100);
0.000034 seconds (3 allocations: 2.281 KiB)
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 1000);
0.000240 seconds (3 allocations: 16.406 KiB)
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 10000);
0.001413 seconds (5 allocations: 156.875 KiB)
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 100000);
0.021385 seconds (5 allocations: 1.526 MiB)
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 1000000);
0.188391 seconds (1.11 k allocations: 15.310 MiB, 14.53% gc time, 11.07% compilation time)
julia> @time Proj.geod_path(geod, 40.64, -73.78, 1.36, 103.99, 1000000);
0.149690 seconds (5 allocations: 15.259 MiB)
Is it worth trying to optimize in order to not allocate so many Ref
s, by creating another dispatch for AbstractVectors?
As for optimisation, I've been removing excessive Ref generation elsewhere, in ArchGDAL it can be the main bottleneck currently in some operations (need to push that PR).
So +1 on removing them from the start.
As for optimisation, I've been removing excessive Ref generation elsewhere
@rafaqz can you explain this, why is it slow and how to avoid it? I see here in the PR immutable structs are made mutable, and pointer_from_objref
is used instead of keeping it mutable and using Ref. According to this thread pointer_from_objref
should be rarely used, and Ref is the way to go. The GC preserve is done by ccall so that isn't needed. It seems that using pointer_from_objref
is also more vulnerable to GC issues, from the docstring, with possible issues like this.
Should I override the functions here, or create a new set of functions? If so, how should we name them?
So far the existing high-level API, like in coord.jl, is much more Julian and distinct from the libproj function names. If we can come up with a nice high-level API then perhaps that'd be best I think. Then Proj.proj_*
stays just the low level auto generated interface.
Keep in mind by the way that libproj.jl is auto generated, so if you want to change it that has to be done via gen/.
What I mean is essentially what @jw3126 is doing in libgeos for isequals performance - define Ref
outside the hot paths and reuse them to minimise allocation. Not completely avoiding Ref
. That's what I thought @asinghvi17 meant by dispatch on Vectors - defining the Ref
s further out so they can be reused.
In ArchGDAL there are some hot paths where 3 refs are defined just to get one point. That allocation ends up totally dominating the profile when you get the points from a polygon. We need to be careful that all operations on the points of composite objects define the Refs
themselves and pass them in. Doing this makes rasterizing ArchGDAL polygons 20x faster. Just loading the polygons is still slower than the actual rasterisation, but not 20x slower.
I didn't mean to use pointers instead. Ref
is fine as long as we carefully minimise the allocations.
But in some cases it might be worth using pointers? pointers and mutable state in structs are two things I prefer not to think about, so it would need a clear reason and benchmarks.
Thanks, yeah that makes sense to me.
Looking at this again, it would be good if instead of lon and lat values as arguments, functions accepted GeoInterface.jl compatible points. That means (lon, lat)
would work, but other point types will too.
That would probably be a pretty easy overload, but I don't think Proj depends on Geointerface yet (not sure why). I'd also like to add some utilities to generically transform geometries in that case...
I just wrote it ;)
Sorry, my finger slipped on mobile. This is still a draft but I hope to polish it up today.
btw, here are some cool applications of this: https://xkdr.github.io/Karmana.jl/dev/examples/geodesic/ https://xkdr.github.io/Karmana.jl/dev/examples/annular_ring/#:~:text=an%20encore%2C%20let%27s-,animate,-what%20the%20annular_ring
(these are from the documentation of a utility package for my current org, it probably won't be registered but the code is open source.)
Do we need more features here or should we go ahead and merge once CI passes?
I've made the portion of code that I actually use as efficient as possible, but have no idea what other people use. As those usecases shake out we could gradually make them more efficient.
Nightly is failing because of a LibCURL version mismatch
@asinghvi17 thanks for taking this on! I'm not really familiar with the geodesics API, but overall this looks good to me. I pushed some minor nonfunctional commits.
We don't have to cover the whole API at once and can add what we need. With this I think we can close #66. For me this is good to merge.
@asinghvi17 @visr should we merge this?
If @visr approved back in May, we can merge this. We can always hold of tagging a release if something comes up.
This PR wraps the Proj
geod_*
API in a Julian API. The idea is that the user should be able to pass objects directly instead of by passing pointers as you have to do now. It also simplifies some constructors.With the new API (example copied from the docs of
geod_position
),and the last part will be further simplified into a
geod_path
function.Should I override the functions here, or create a new set of functions? If so, how should we name them?