ckan / ckanext-dcat

CKAN ♥ DCAT
https://docs.ckan.org/projects/ckanext-dcat

Spatial coverage not accepting WKT #320

Open jematson opened 2 weeks ago

jematson commented 2 weeks ago

We are running CKAN 2.10 in Docker containers with ckanext-dcat installed, the profile set as ckanext.dcat.rdf.profiles = euro_dcat_ap_3, and ckanext.dcat.output_spatial_format left at the default, wkt. We have been testing the RDF output of the spatial coverage field by looking at the .jsonld and .xml endpoints. The _parse_geodata function in the base profile (ckanext-dcat/ckanext/dcat/profiles/base.py) indicates that both WKT and GeoJSON formats should be accepted, but only GeoJSON input seems to work.

Entering a GeoJSON string such as {"type": "Point", "coordinates": [102.0, 0.5]} produces WKT output at the .xml endpoint:

<dct:spatial>
  <dct:Location rdf:nodeID="N22187f65232347ffb027d17c3d6a1368">
    <dcat:centroid rdf:datatype="http://www.opengis.net/ont/geosparql#wktLiteral">POINT (102.0000 0.5000)</dcat:centroid>
  </dct:Location>
</dct:spatial>

Setting ckanext.dcat.output_spatial_format to geojson also gives correct output:

<dct:spatial>
  <dct:Location rdf:nodeID="N424c9233272c47ec9ae58bec46b872ee">
    <dcat:centroid rdf:datatype="https://www.iana.org/assignments/media-types/application/vnd.geo+json">{"type": "Point", "coordinates": [102.0, 0.5]}</dcat:centroid>
  </dct:Location>
</dct:spatial>

When entering a WKT string such as POINT (102.0 0.5), however, nothing shows up at the endpoint, in either WKT or GeoJSON format:

<dct:spatial>
  <dct:Location rdf:nodeID="N58fb40312991454495e1a7066ad098c2"/>
</dct:spatial>
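A quick way to spot this symptom in the .xml output is to look for dct:Location nodes that have no child properties. The following is a minimal sketch using only the Python standard library; the function name `empty_locations` is hypothetical, and the sample wraps the snippet above in an rdf:RDF root with the usual namespace URIs, which is an assumption about the surrounding document.

```python
import xml.etree.ElementTree as ET

# Standard namespace URIs for the prefixes used in the snippets above.
DCT = "http://purl.org/dc/terms/"
RDF = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


def empty_locations(rdf_xml):
    """Return the nodeIDs of dct:Location elements with no child properties."""
    root = ET.fromstring(rdf_xml)
    return [
        loc.get(f"{{{RDF}}}nodeID")
        for loc in root.iter(f"{{{DCT}}}Location")
        if len(loc) == 0
    ]


# Hypothetical wrapper around the empty Location node reported above.
sample = f"""<rdf:RDF xmlns:rdf="{RDF}" xmlns:dct="{DCT}">
  <dct:spatial>
    <dct:Location rdf:nodeID="N58fb40312991454495e1a7066ad098c2"/>
  </dct:spatial>
</rdf:RDF>"""

print(empty_locations(sample))
```

A non-empty list means at least one spatial coverage entry was serialized without any geometry.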

Changing to a different profile setup, such as ckanext.dcat.rdf.profiles = euro_dcat_ap_2 euro_dcat_ap_scheming, does not change the behaviour. Is the spatial coverage field meant to accept WKT input, or just GeoJSON?

amercader commented 2 weeks ago

@jematson the various geometry fields in spatial coverage currently only accept GeoJSON as input, which is the format supported in other integrations like ckanext-spatial. ckanext.dcat.output_spatial_format lets you define the format used in the DCAT serializations, but if wkt is used, it is currently generated from the GeoJSON. _parse_geodata() reads the geometry from a DCAT serialization (WKT or GeoJSON) and transforms it to GeoJSON for storage in CKAN.
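To make the direction of that conversion concrete, here is a minimal, stdlib-only sketch of turning a stored GeoJSON Point into a WKT literal. It is an illustration only, not the ckanext-dcat implementation: the function name `point_geojson_to_wkt` is hypothetical, only Point geometries are handled, and the four-decimal formatting is an assumption chosen to match the output shown earlier in this thread.

```python
import json


def point_geojson_to_wkt(geojson_str, decimals=4):
    """Format a GeoJSON Point string as WKT (illustrative, Point only)."""
    geom = json.loads(geojson_str)
    if geom.get("type") != "Point":
        raise ValueError("only Point is handled in this sketch")
    # Fixed-precision formatting of each coordinate, separated by spaces
    coords = " ".join(f"{c:.{decimals}f}" for c in geom["coordinates"])
    return f"POINT ({coords})"


print(point_geojson_to_wkt('{"type": "Point", "coordinates": [102.0, 0.5]}'))
# POINT (102.0000 0.5000)
```

Because the WKT is derived from the stored GeoJSON, a value that was never valid GeoJSON has nothing to be derived from, which would be consistent with the empty dct:Location node reported above.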

It wouldn't be a massive task to support WKT as input, but then support for it would also have to be added to ckanext-spatial. If there is a need for that, we could consider implementing it.

In the meantime, if you prefer to input WKT, you can add a custom validator that transforms it to GeoJSON under the hood. Note that the value will be stored as GeoJSON (and so returned as such in the CKAN API):

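    - field_name: geom
      label: Geometry
      validators: ignore_missing to_geojson

    import json

    from ckan.plugins.toolkit import Invalid
    from geomet import wkt  # assumed WKT library; wkt.loads() returns a GeoJSON-like dict

    def to_geojson(value):
        # We assume the user entered WKT
        try:
            return json.dumps(wkt.loads(str(value)))
        except (ValueError, TypeError):
            raise Invalid("Could not convert to GeoJSON")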
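A variation on the validator idea above would let GeoJSON input pass through unchanged and only convert values that are not JSON. The sketch below is a hedged illustration, not part of ckanext-dcat: `looks_like_geojson` and `normalize_geometry` are hypothetical names, and the WKT parser is passed in as a callable (for example geomet's `wkt.loads`, assuming that library is installed) so the logic can be shown with the standard library alone.

```python
import json


def looks_like_geojson(value):
    """Return True if value parses as a JSON object with a "type" key."""
    try:
        parsed = json.loads(value)
    except (ValueError, TypeError):
        return False
    return isinstance(parsed, dict) and "type" in parsed


def normalize_geometry(value, wkt_loads):
    """Return a GeoJSON string for either GeoJSON or WKT input.

    wkt_loads is any callable that maps WKT text to a GeoJSON-like dict
    (an assumption of this sketch; it is not defined here).
    """
    if looks_like_geojson(value):
        return value
    return json.dumps(wkt_loads(str(value)))
```

In a real validator, the call to `normalize_geometry` would sit inside the same try/except shown above, so that unparseable input still raises Invalid.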