OvertureMaps / data

Overture Maps Data
https://docs.overturemaps.org

Excess properties in theme=transportation/type=connector #174

Open dezhin opened 3 months ago

dezhin commented 3 months ago

It seems the connector Parquet files have excess properties, apparently derived from segments (such as speed_limits and road_surface), that aren't declared in the schema. The schema says:

Apart from their point geometry and the core properties required for all Overture features, connectors do not have any other properties.

D .mode line
D SELECT * FROM read_parquet('/mnt/alpha/overture-2024-06-13-beta.1/source/theme=transportation/type=connector/part-00012-64477a71-ec09-43ea-945d-0837bdaed32b-c000.zstd.parquet', filename=true) LIMIT 1;
                    id = 08f694ec06350c4a043beff19c47f8d3
              geometry = \x00\x00\x00\x00\x01@^?\x93i\xD4\xAF\xF0@-(\xE9\xBD\x9E[\xA0
                  bbox = {'xmin': 120.99337, 'xmax': 120.993385, 'ymin': 14.579906, 'ymax': 14.579908}
               version = 0
           update_time = 2024-06-20T18:04:23Z
               sources = [{'property': , 'dataset': OpenStreetMap, 'record_id': NULL, 'confidence': NULL}]
               subtype = 
                 names = 
                 class = 
         connector_ids = 
   access_restrictions = 
           level_rules = 
prohibited_transitions = 
          road_surface = 
            road_flags = 
          speed_limits = 
           width_rules = 
                  road = 
              filename = /mnt/alpha/overture-2024-06-13-beta.1/source/theme=transportation/type=connector/part-00012-64477a71-ec09-43ea-945d-0837bdaed32b-c000.zstd.parquet
                 theme = transportation
                  type = connector

dezhin commented 3 months ago

And why are xmin / xmax and ymin / ymax different for points? Is this intentional, to prevent floating-point errors?

jwass commented 3 months ago

Hi @dezhin. You're right. This is an issue on our end: the connector parquet files shouldn't have those fields, which get accidentally pulled in from the segments. Since the values for those fields are all null for connectors (or should be), they can be safely ignored. We'll fix this in the next release, but it's probably not worth pushing out a patch release.
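
Until the fix lands, one way to ignore the extra fields is to exclude them at query time. The following is a minimal sketch, not an official workaround: it only builds a DuckDB query string using `SELECT * EXCLUDE (...)`, the column list is copied from the output pasted above, and the function name and the example path are hypothetical.

```python
# Columns that showed up in the connector output above but are not part of
# the connector schema (list copied from the issue, not from the schema itself).
EXCESS_CONNECTOR_COLUMNS = [
    "subtype", "names", "class", "connector_ids", "access_restrictions",
    "level_rules", "prohibited_transitions", "road_surface", "road_flags",
    "speed_limits", "width_rules", "road",
]


def connector_query(parquet_path: str) -> str:
    """Build a DuckDB query that reads connectors without the excess columns."""
    # Double-quote identifiers so names like "class" are treated as column names.
    excluded = ", ".join(f'"{name}"' for name in EXCESS_CONNECTOR_COLUMNS)
    # Escape single quotes so the path is a valid SQL string literal.
    path = parquet_path.replace("'", "''")
    return f"SELECT * EXCLUDE ({excluded}) FROM read_parquet('{path}')"


# Hypothetical local path; substitute wherever the release was downloaded.
print(connector_query("theme=transportation/type=connector/*.parquet"))
```

Note that `EXCLUDE` fails if a listed column is missing, so the list would need trimming once a release drops these fields.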

jwass commented 3 months ago

And why are xmin / xmax and ymin / ymax different for points? Is this intentional, to prevent floating-point errors?

This happens because the native point coordinates are doubles, but xmin, xmax, etc. are float32s. A double usually has no exact float32 equivalent, so the min values have to be rounded "down" to a float32 at or below the coordinate's double value, which ensures they are less than or equal to the underlying coordinate. Likewise, the max values have to be rounded "up" so that they are greater than or equal to it. The result is this odd effect where min and max differ for points.
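
The rounding described above can be illustrated with a small sketch. This is not Overture's actual implementation, and the real pipeline may pad the bounds more widely than this; it only shows the tightest float32 pair that brackets a double. The helper names and the sample coordinate are made up for illustration.

```python
import struct


def round_to_f32(value: float) -> float:
    """Round a Python float (a double) to the nearest float32 value."""
    return struct.unpack("<f", struct.pack("<f", value))[0]


def step_f32(x: float, up: bool) -> float:
    """Return the float32 neighbour just above or below x.

    x must already be exactly representable as a float32.
    """
    if x == 0.0:
        tiny = struct.unpack("<f", struct.pack("<I", 1))[0]  # smallest subnormal
        return tiny if up else -tiny
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    # For positive values a larger bit pattern is a larger number;
    # for negative values it is a more negative number.
    bits += 1 if (x > 0) == up else -1
    return struct.unpack("<f", struct.pack("<I", bits))[0]


def f32_bounds(coord: float) -> tuple[float, float]:
    """Tightest (min, max) float32 pair with min <= coord <= max."""
    nearest = round_to_f32(coord)
    if nearest == coord:
        return nearest, nearest  # exactly representable: min equals max
    if nearest < coord:
        return nearest, step_f32(nearest, up=True)
    return step_f32(nearest, up=False), nearest


lon = 120.99337123  # made-up longitude, not taken from the data
lo, hi = f32_bounds(lon)
print(lo, hi, lo <= lon <= hi)
```

Only coordinates that happen to be exactly representable as float32 (for example 0.5) get identical min and max values under this scheme.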

dezhin commented 3 months ago

Thanks for the explanation and the 2024-06-13-beta.1 release.

brad-richardson commented 2 months ago

This has been resolved for connectors and will be reflected in the next release. Thanks for the flag!

@jwass do you want to leave this open to track extra fields generally?

dezhin commented 2 months ago

Feel free to close, we'll double check when the next release arrives.