mattijn / topojson

Encode spatial data as topology in Python! 🌍 https://mattijn.github.io/topojson
BSD 3-Clause "New" or "Revised" License

Feature: Use geojson feature/geometry id if exists #219

Closed Mr-Ixolate closed 4 months ago

Mr-Ixolate commented 5 months ago

I've been using geojson to extract some spatial data from another format, which already has a unique id for each shape I am extracting. I use that id as the feature id and combine the features into a feature collection.

However, when I pass the collection to Topology and call .to_json(), the id I provided is replaced in the output with a generic one such as "0" or "feature_00".

Ideally it would retain the ids.

Example script

from geojson import Feature, Polygon, FeatureCollection
import topojson as tp

feature = Feature(
    id = "myID",
    geometry = Polygon(
        coordinates=[[[0, 0],[0, 5],[5, 5],[5, 0],[0, 0]]],
        validate=True)
)

fc = FeatureCollection(features=[feature])
print(feature,end="\n\n")

topo = tp.Topology(data=fc, prequantize=True)
print(topo.to_json())
>>>{"geometry": {"coordinates": [[[0, 0], [0, 5], [5, 5], [5, 0], [0, 0]]], "type": "Polygon"}, "id": "myID", "properties": {}, "type": "Feature"}

>>>{"type":"Topology","objects":{"data":{"geometries":[{"properties":{},"type":"Polygon","arcs":[[0]],"id":"feature_0"}],"type":"GeometryCollection"}},"bbox":[0.0,0.0,5.0,5.0],"transform":{"scale":[5.000050000500005e-05,5.000050000500005e-05],"translate":[0.0,0.0]},"arcs":[[[0,0],[0,99999],[99999,0],[0,-99999],[-99999,0]]]}

Code

Haven't dived too deep yet, but the id in the topojson feature looks like it comes from the key in the data dict. https://github.com/mattijn/topojson/blob/5af019cb409859a01eda2964b901c996a4d6eb8f/topojson/core/extract.py#L463-L465

There is probably a better way of doing this but I changed the line to this and it seemed to work.

        data[
            feature.get("id") if feature.get("id") else "feature_{}".format(str(idx).zfill(zfill_value))
        ] = feature_dict
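A minimal, standalone sketch of the same idea, pulled out into a helper so the key selection is easier to read. The name `feature_key` and the way `zfill_value` is computed here are assumptions for illustration, not topojson's actual code. Unlike the one-liner above, it compares against `None`, so an id of `0` is kept rather than replaced.

```python
def feature_key(feature, idx, zfill_value):
    """Return the feature's own id if it has one, else a generated key.

    Hypothetical helper for illustration; not part of topojson.
    """
    feature_id = feature.get("id")
    # Compare against None so that a legitimate id of 0 is not discarded.
    if feature_id is not None:
        return feature_id
    return "feature_{}".format(str(idx).zfill(zfill_value))


# Plain dicts stand in for GeoJSON features here.
features = [
    {"type": "Feature", "id": "myID", "geometry": None, "properties": {}},
    {"type": "Feature", "geometry": None, "properties": {}},
]

# Assumption: pad generated keys to the width of the feature count.
zfill_value = len(str(len(features)))
keys = [feature_key(f, idx, zfill_value) for idx, f in enumerate(features)]
print(keys)
```

With the two features above, the first keeps "myID" and the second falls back to "feature_1".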

Will have a look at forking in a bit.

mattijn commented 5 months ago

Thanks for raising the issue! It is a good request, and this already happens when the input is a GeoDataFrame; see around here: https://github.com/mattijn/topojson/blob/main/topojson/core/extract.py#L533.

Just make sure that you only do this if the index of the full collection is unique. If you want to open a PR, which would be much appreciated, a simple if/else condition should be enough.

Hope that helps! And thanks again for raising the issue!

Mr-Ixolate commented 5 months ago

What would be the expectation if duplicate ids were found?

  1. Raise an error, possibly with the duplicate ids and their indexes listed.
  2. Fall back to the default id naming convention (feature_{idx}).
  3. Check each id and append idx to the duplicate ids, or something similar.
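To make the options concrete, here is a rough sketch of the duplicate detection that option 1 would need for its error message, plus one possible reading of option 3. Both function names are hypothetical and the features are plain dicts; nothing here is taken from topojson itself.

```python
from collections import defaultdict


def find_duplicate_ids(features):
    """Map each id that occurs more than once to the indexes where it occurs."""
    positions = defaultdict(list)
    for idx, feature in enumerate(features):
        feature_id = feature.get("id")
        if feature_id is not None:
            positions[feature_id].append(idx)
    return {fid: idxs for fid, idxs in positions.items() if len(idxs) > 1}


def suffix_duplicates(features):
    """Option 3 (one interpretation): append the index to ids that repeat."""
    duplicates = find_duplicate_ids(features)
    keys = []
    for idx, feature in enumerate(features):
        feature_id = feature.get("id")
        if feature_id is None:
            # No id at all: use the default naming convention.
            keys.append("feature_{}".format(idx))
        elif feature_id in duplicates:
            keys.append("{}_{}".format(feature_id, idx))
        else:
            keys.append(feature_id)
    return keys


features = [{"id": "a"}, {"id": "b"}, {"id": "a"}]
print(find_duplicate_ids(features))
print(suffix_duplicates(features))
```

One caveat with option 3: a suffixed key such as "a_2" could still collide with an id that already exists in the collection, so it does not guarantee uniqueness on its own.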

Edit: shouldn't have tried to type this on mobile

mattijn commented 5 months ago

Maybe best using a new parameter in the Topology() class?

Like: ignore_index: bool, default False

And then do your suggested option 1 if False (default) and if True your option 2?
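A sketch of how such a flag could switch between the two behaviours. The function `build_keys` is a hypothetical stand-in, not the code that was released, and the handling of features without an id (falling back to a generated key) is an assumption.

```python
def build_keys(features, ignore_index=False):
    """Choose a key per feature, gated by a hypothetical ignore_index flag."""
    generated = ["feature_{}".format(idx) for idx in range(len(features))]
    if ignore_index:
        # Option 2: discard any ids and use the default naming convention.
        return generated

    keys = []
    seen = {}  # key -> index where it first appeared
    for idx, feature in enumerate(features):
        feature_id = feature.get("id")
        # Assumption: a feature without an id gets the generated key.
        key = generated[idx] if feature_id is None else feature_id
        if key in seen:
            # Option 1: fail loudly and report where the duplicate occurs.
            raise ValueError(
                "duplicate id {!r} at indexes {} and {}".format(key, seen[key], idx)
            )
        seen[key] = idx
        keys.append(key)
    return keys


unique = [{"id": "a"}, {"id": "b"}]
clashing = [{"id": "a"}, {"id": "a"}]

print(build_keys(unique))
print(build_keys(clashing, ignore_index=True))
```

Calling `build_keys(clashing)` with the default `ignore_index=False` raises a `ValueError` that names the duplicate id and both indexes.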

mattijn commented 4 months ago

Released as of v1.9