department-for-transport-public / D-TRO

Digital Traffic Regulation Orders (D-TRO)
MIT License

Geometry encoding in samples vs spec #7

Closed Karol-Kolenda closed 1 month ago

Karol-Kolenda commented 1 month ago

It appears there are discrepancies between the Beta-01-DfT-D-TRO-Data Model-User Guide-3.2.2_v1.0.pdf and the newly added samples, such as JSON-example-new-dtro-3.2.2.json.

The newly added samples use geometries encoded in GeoJSON format, whereas the User Guide (page 29) specifies that geometries should be encoded as WKT. Additionally, the samples seem to use the WGS84 coordinate system, rather than the British National Grid (EPSG:27700, as mentioned on page 28). I might be mistaken, but the lat/long coordinates provided point to Minnesota, US. Even if these are just sample coordinates that resemble lat/long values, they fall outside of EPSG:27700 since one of the coordinates is negative.

Furthermore, the older example (DTRO-v3.2.0-slim.json) takes a hybrid approach, mixing GeoJSON with WKT. In this case, the coordinates are encoded in WKT format but without the proper geometry type, for example, "coordinates": "((30 10, 40 40, 20 40, 10 20, 30 10))".

This is unusual, as it doesn't conform to valid WKT syntax, which should look like "POLYGON((30 10, 40 40, 20 40, 10 20, 30 10))". Nevertheless, this format can still be workable.
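To illustrate why the hybrid form is still workable, here is a minimal sketch. The helper name `hybrid_to_wkt` is hypothetical, and it assumes the geometry type is available separately from the bare coordinate string, which may not hold for every record.

```python
def hybrid_to_wkt(geometry_type: str, coordinates: str) -> str:
    """Prefix a bare WKT coordinate string with its geometry type keyword.

    Assumption: geometry_type is supplied alongside the coordinates
    (for example as a sibling field); this helper is purely illustrative.
    """
    return f"{geometry_type.strip().upper()}{coordinates.strip()}"


wkt = hybrid_to_wkt("polygon", "((30 10, 40 40, 20 40, 10 20, 30 10))")
print(wkt)  # POLYGON((30 10, 40 40, 20 40, 10 20, 30 10))
```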

In conclusion, it would be very helpful to clarify the geometry encoding standards. We currently have three different approaches: WKT as outlined in the guidelines, GeoJSON in the newest examples, and a hybrid approach in the older examples.

Thank you in advance for any clarification.

JHB9876 commented 1 month ago

On behalf of DfT: Many thanks for this comment. We are reviewing and will come back on this comment shortly.

JHB9876 commented 1 month ago

On behalf of DfT: Many thanks for raising this with us. We are aware of the problem with the consistency of the data model, JSON schema and the example files in the Data Specification release v.3.2.2. We have been addressing this, and have a further planned Data Specification release 3.2.3 in the week commencing 23 September. The initial Minimum Viable Product (MVP) service for Private Beta will use WKT encoding for spatial geometry, which the planned Data Specification will reflect.

There is a broader discussion concerning choices of coordinate referencing systems and the choice of standardised formats for the encoding and supply of spatial data in the context of D-TRO, which we continue to examine.

stm-john-cooper commented 1 month ago

Thanks JHB for the full comment. I would like to endorse the request for discussion on this particular issue and the wider implications concerning coordinate referencing systems.

gbkls commented 1 month ago

Following this up.

Now that the integration environment is up, we downloaded what appears to be the only record. It has a coordinate record as below:

[{"geometry":{"crs":"osgb36Epsg27700","coordinates":{"type":"Polygon","coordinates":[[[-104.05,48.99],[-97.22,48.98],[-96.58,45.94],[-104.03,45.94],[-104.05,48.99]]]},"geometryType":"polygon"},

It's not clear what the coordinates here are; they are not OSGR or lat/long. They could be some sort of offset from an OSGR reference point, but the data does not appear to contain one.
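A rough way to check this is sketched below. The British National Grid extent is the Great Britain bounding box quoted later in this thread. The WGS84 extent is only an approximate assumption for illustration, and the snippet parses just the geometry object shown above.

```python
import json

# Geometry object copied from the integration record above.
geometry = json.loads(
    '{"crs":"osgb36Epsg27700","coordinates":{"type":"Polygon","coordinates":'
    "[[[-104.05,48.99],[-97.22,48.98],[-96.58,45.94],[-104.03,45.94],[-104.05,48.99]]]},"
    '"geometryType":"polygon"}'
)
ring = geometry["coordinates"]["coordinates"][0]

# EPSG:27700 extent in metres, from the bounding box quoted later in this thread.
BNG_X, BNG_Y = (0, 700_000), (0, 1_300_000)
# Assumption: a rough WGS84 longitude/latitude envelope for Great Britain.
GB_LON, GB_LAT = (-9.0, 2.0), (49.0, 61.0)


def inside(points, x_range, y_range):
    """True if every (x, y) pair lies within the given ranges."""
    return all(
        x_range[0] <= x <= x_range[1] and y_range[0] <= y <= y_range[1]
        for x, y in points
    )


in_bng = inside(ring, BNG_X, BNG_Y)         # read as eastings/northings
in_gb_lonlat = inside(ring, GB_LON, GB_LAT)  # read as longitude/latitude
print(in_bng, in_gb_lonlat)  # False False
```

Under both readings the ring falls outside Great Britain, so the declared `osgb36Epsg27700` CRS does not match the values supplied.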

Appreciate clarification

stm-john-cooper commented 1 month ago

Thanks @gbkls for this and your other comments. We will have full replies soon.

JHB9876 commented 1 month ago

On behalf of DfT: Thank you for raising the issue around geometry encoding inconsistencies. As part of the latest release, we have now addressed these as follows:

Please refer to Table 2 (WKT Encoding) in the latest User Guide for further detail, including codified examples.

Validation Rules: Two specific types of validation relating to the WKT encoding are also included:

  1. Ensuring the form of the geometry is valid:

    • A POINT should contain one pair of coordinates.
    • A LINESTRING should contain at least two pairs of coordinates.
    • A POLYGON should contain at least FOUR pairs of coordinates, and the first and last pairs should be the same to ensure it is a closed polygon.
  2. Ensuring the supplied coordinates lie within the bounding box for the whole of Great Britain, defined by the polygon: "SRID=27700;POLYGON((0 0, 700000 0, 700000 1300000, 0 1300000, 0 0))"
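The two rules above could be checked along the following lines. This is a minimal sketch using only the Python standard library, not the validation code used by the D-TRO service. The function name `validate_wkt` is hypothetical, the `SRID=27700;` prefix is treated as optional, and only single-ring polygons are handled.

```python
import re

# Great Britain bounding box quoted in rule 2 (EPSG:27700, metres).
MIN_X, MIN_Y, MAX_X, MAX_Y = 0, 0, 700_000, 1_300_000

# Assumption: the "SRID=27700;" prefix is optional in this sketch.
_PATTERN = re.compile(
    r"^(?:SRID=27700;)?\s*(POINT|LINESTRING|POLYGON)\s*\((.*)\)$",
    re.IGNORECASE | re.DOTALL,
)


def validate_wkt(text: str) -> list[str]:
    """Return a list of problems found; an empty list means the checks passed."""
    match = _PATTERN.match(text.strip())
    if not match:
        return ["expected POINT, LINESTRING or POLYGON in EPSG:27700"]
    kind, body = match.group(1).upper(), match.group(2).strip()

    if kind == "POLYGON":
        # Assumption: single exterior ring only; holes are out of scope here.
        if not (body.startswith("(") and body.endswith(")")) or "(" in body[1:]:
            return ["only single-ring polygons are handled in this sketch"]
        body = body[1:-1]

    try:
        pairs = [tuple(float(v) for v in pair.split()) for pair in body.split(",")]
    except ValueError:
        return ["coordinates must be numeric"]
    if any(len(pair) != 2 for pair in pairs):
        return ["each coordinate must be an 'x y' pair"]

    errors = []
    # Rule 1: the form of the geometry.
    if kind == "POINT" and len(pairs) != 1:
        errors.append("POINT needs exactly one coordinate pair")
    if kind == "LINESTRING" and len(pairs) < 2:
        errors.append("LINESTRING needs at least two coordinate pairs")
    if kind == "POLYGON":
        if len(pairs) < 4:
            errors.append("POLYGON needs at least four coordinate pairs")
        if pairs[0] != pairs[-1]:
            errors.append("POLYGON first and last pairs must match")
    # Rule 2: every coordinate lies within the Great Britain bounding box.
    if not all(MIN_X <= x <= MAX_X and MIN_Y <= y <= MAX_Y for x, y in pairs):
        errors.append("coordinates fall outside the Great Britain bounding box")
    return errors


print(validate_wkt("SRID=27700;POINT(530000 180000)"))  # []
```

Returning a list of messages rather than raising on the first failure lets a caller report every problem with a geometry in one pass.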

As mentioned above, we will continue the broader discussion concerning the choice of standardised formats.

Karol-Kolenda commented 1 month ago

Thank you for the feedback – it's perfect. I like that you keep WKT, as it means faster parsing of JSON (GeoJSON will not have to be parsed at this level; the whole geometry is treated as a string value).
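A small sketch of the structural difference, using hypothetical payloads (the `geometry` key is illustrative and not taken from the D-TRO schema). It does not measure speed; it only shows what the JSON parser produces in each case.

```python
import json

# Hypothetical payloads: the "geometry" key is illustrative only.
wkt_payload = (
    '{"geometry": "SRID=27700;POLYGON((0 0, 700000 0, 700000 1300000, 0 1300000, 0 0))"}'
)
geojson_payload = (
    '{"geometry": {"type": "Polygon", "coordinates": '
    "[[[0, 0], [700000, 0], [700000, 1300000], [0, 1300000], [0, 0]]]}}"
)

wkt_geometry = json.loads(wkt_payload)["geometry"]
geojson_geometry = json.loads(geojson_payload)["geometry"]

# WKT arrives as one opaque string; GeoJSON is expanded into nested objects.
print(type(wkt_geometry).__name__)      # str
print(type(geojson_geometry).__name__)  # dict
```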

Karol-Kolenda commented 1 month ago

👍