Closed SimonDanisch closed 2 years ago
I'm running into a pretty big performance regression in the geo interface implementation:
using GeoJSON, GeoInterface, GeometryBasics
using Downloads: download
const trait_type_dict = Dict(
PolygonTrait() => Polygon,
MultiPolygonTrait() => MultiPolygon,
LineStringTrait() => LineString,
PointTrait() => Point,
)
# Convert all geoJSON objects to the appropriate GeometryBasics type based on trait
function geoJSONtraitParse(feature::GeoJSON.Feature)
geometry = GeoInterface.geometry(feature)
trait = GeoInterface.trait(geometry)
if !haskey(trait_type_dict, trait)
@warn "GeoMakie.geoJSONtraitParse: Unknown geometry type $(GeoInterface.trait(geometry))"
return nothing
end
GEOM = trait_type_dict[trait]
@show GEOM
return GeoInterface.convert(GEOM, geometry)
end
function geoJSONtraitParse(featureCollection::GeoJSON.FeatureCollection)
return [geoJSONtraitParse(f) for f in featureCollection]
end
countries_file = download("https://datahub.io/core/geo-countries/r/countries.geojson")
countries = GeoJSON.read(read(countries_file, String))
feature = first(countries)
geometry = GeoInterface.geometry(feature)
trait = GeoInterface.trait(geometry)
xx = @time GeoInterface.convert(GEOM, geometry)
This takes ~5s to compute for a simple 1500 point polygon. All of the countries take so long, that it seems like it's just frozen. I think we should resolve that first before merging. Also, if I see correctly, the abstract GeoInterface types are gone in favor of having only traits. This is pretty inconvenient, since now we can't just overload the Makie conversion pipeline to work with all packages using GeoInterface, which seems pretty bad considering the purpose of GeoInterface? That would bring us back to depend on any package implementing GeoInterface, to actually support plotting their types.
Does anyone have an idea how to bring back "real" support for any geo package? CC: @jeremiahpslewis @visr @evetion
For Plots.jl we added a macro to make plotting support relatively easy, in the GeoInterfacePlots.jl subpackage.
We just run the @enable_geo_plots MyAbstractGeomType
for the type you want to have plot recipes.
But yes I can see how it will be more annoying for Makie, this is the downside of using traits. But I'm not sure how we could enforce inheritance from a GeoInterface.jl abstract type, seeing that we want e.g. GeometryBasics types to work in the rest of the interface.
I can look at the performance regression.
Well, we could still have an optional abstract type to depend on? And packages that can afford to inherit from that, can live an easier life? :D
Yeah, I would also kind of like that too. I proposed keeping a default hierarchy at some stage but we ended up not for simplicity. But that doesn't seem to be working out in this case.
Probably also the packages that benefit most from plot recipes are the ones that can inherit from GeoInterface.jl types (GeometryBasics/Meshes etc wont need them).
@evetion @visr what do you think about adding at least AbstractGeometry
and possibly the whole tree back to GeoInterface.jl? It should simplify some of our code as well.
@rafaqz Here's an MWE which doesn't depend on GeoMakie. Feel like this should be filed as a GeoJSON performance issue? Or GeometryBasics?
Semi-related...would it make sense to include performance tests in GeoJSON.jl
to catch relevant regressions?
using GeoJSON
using Downloads
using GeoInterface
using GeometryBasics
countries = GeoJSON.read(read(Downloads.download("https://datahub.io/core/geo-countries/r/countries.geojson"), String))
@time GeoInterface.convert(MultiPolygon, GeoInterface.geometry(countries[3]));
julia> countries = GeoJSON.read(read(Downloads.download("https://datahub.io/core/geo-countries/r/countries.geojson"), String))
FeatureCollection with 255 Features
julia> @time GeoInterface.convert(MultiPolygon, GeoInterface.geometry(countries[3]));
7.940734 seconds (2.71 M allocations: 157.558 MiB, 1.77% gc time, 12.60% compilation time)
[cf35fbd7] GeoInterface v1.0.1
[61d90e0f] GeoJSON v0.6.1
[5c1252a2] GeometryBasics v0.4.4
It seems the implementation of GeoInterface convert in GeometryBasics is very slow, due to the nested conversion (MultiPolygon goes to Polygon, then LineString, then Point conversion), at each stage also requesting the subgeometry from GeoJSON. So that's a lot of conversion and I guess some type instabilities also occur in the meantime.
On the GeoJSON side things seem fast enough (here given with compile times included).
julia> @time GeoInterface.geometry(countries[3])
0.000019 seconds (12 allocations: 1.250 KiB)
MultiPolygon
julia> @time GeoInterface.coordinates(GeoInterface.geometry(countries[3]))
0.047426 seconds (208.72 k allocations: 11.183 MiB, 86.95% compilation time)
3-element Vector{JSON3.Array}:
JSON3.Array[JSON3.Array[[11.73751945100014, -16.692577982999836], [11.738506688000115, -16.70582207599999], [11.71579354500011, -16.685006365999968], [11.701973057000117, -16.65567360999991], [11.669394143000147, -16.557209578999945], [11.67136
julia> @time collect(GeoInterface.getgeom(GeoInterface.geometry(countries[3])))
0.030355 seconds (215.86 k allocations: 11.546 MiB, 99.29% compilation time)
3-element Vector{GeoJSON.Polygon{NamedTuple{(:type, :coordinates), Tuple{String, JSON3.Array{JSON3.Array, Base.CodeUnits{UInt8, String}, SubArray{UInt64, 1, Vector{UInt64}, Tuple{UnitRange{Int64}}, true}}}}}}:
Polygon
Polygon
Polygon
It's actually a bit of both, it seems, the JSON3.Array is also not helping.
Changing this in GeometryBasics nets you:
function GeoInterface.convert(::Type{LineString}, type::LineStringTrait, geom)
dim = Int(ncoord(geom))
- return LineString([Point{dim, Float64}(GeoInterface.coordinates(p)) for p in getgeom(geom)])
+ return LineString(Point{dim,Float64}.(copy.(GeoInterface.coordinates(geom))))
end
julia> @time GeoInterface.convert(MultiPolygon, MultiPolygonTrait(), GeoInterface.geometry(countries[3]));
0.016911 seconds (19.31 k allocations: 1004.312 KiB)
So we might want to see what adding copy
does to the GeoInterface implementation over at GeoJSON.
Great :) I'll merge and tag this for now, and then we can figure out the improvements in peace ;)
Beet me to it. But yes, I knew that lazy loading JSON3 had to have a downside.
So GeoJSON.jl needs some specialised convert methods, and we need to see where materialising the JSON3 string makes more sense than keeping it lazy.
Fixes https://github.com/MakieOrg/GeoMakie.jl/pull/123