JuliaGeo / GeoJSON.jl

Utilities for working with GeoJSON data in Julia
https://juliageo.org/GeoJSON.jl/stable/
MIT License
68 stars 10 forks source link

Use StaticArrays based concrete types #23

Closed visr closed 2 years ago

visr commented 4 years ago

Just opening this as a draft PR to show what I was working on before making https://github.com/visr/GeoJSONTables.jl. Not sure yet if the GeoJSONTables approach can be made to cover all of GeoJSON use cases. But if it can, perhaps we should close this PR and switch GeoJSON.jl over to the GeoJSONTables approach. Good to discuss :)

At first I wanted to just use GeometryBasics as is, though it is also convenient to own your own geometry structs such that you can align them perfectly to the GeoJSON spec. So then I based the structs here on those in RoamesGeometry.jl and customized a bit. Not quite happy with it yet, but it does do reading and writing, and fixes #21.

From Natural Earth data I got ne_10m_coastline and converted it to a 12 MB FeatureCollection with 4133 Features of LineString type geomery. A nice set to do some quick benchmarking:

@btime GeoJSON.read(read("ne_10m_coastline.json"))
# 1.263 s (8384107 allocations: 488.49 MiB)
@btime JSON3.read(read("ne_10m_coastline.json"))
# 57.925 ms (23 allocations: 11.89 MiB)
@btime GeoJSONTables.read(read("ne_10m_coastline.json"))
# 65.961 ms (34 allocations: 11.89 MiB)

JSON3 is very fast, due to its semi-lazy parsing. This branch kind of ruins that performace by allocating all the objects we care about along the way. Should say I did not look much into optimizing it. GeoJSONTables adds only very little overhead on top of JSON3, as expected, since we keep the geometry and properties objects in the semi-lazy parsed state.

For comparison, the results with JSON.jl and the last GeoJSON.jl release, v0.4:

@btime JSON.parsefile("ne_10m_coastline.json")
# 267.712 ms (2594230 allocations: 105.56 MiB)
@btime GeoJSON.parsefile("ne_10m_coastline.json")
# 465.260 ms (3239226 allocations: 150.35 MiB)

So the old JSON parser was definitely slower, but GeoJSON.parsefile was still twice as fast as this branch.

visr commented 4 years ago

For confused readers, I'll keep this open for now such that people can find it easily, but I don't think we should merge it. It may be a step forward in some ways, but IMO GeoJSONTables is a better step forwards.

We can think about integrating that code into this package, but it would probably be best to do that after it stabilizes, and has integrated a renewed GeoInterface and been used to explore the GeoTables idea. That way we can avoid breaking this package too often.

visr commented 2 years ago

Resolved by #42.