ioos / ckanext-ioos-theme

IOOS Catalog as a CKAN extension
GNU Affero General Public License v3.0

Point symbol for dataset in Catalog? #142

Closed mwengren closed 7 years ago

mwengren commented 7 years ago

I knew I had seen one of these before. Just found one again: https://data.ioos.us/dataset/48114-king-island-buoy-grid_longitude.

I thought we were converting all point bounding boxes to small rectangles in order for CKAN/Solr spatial search to work. The side effect is that the small rectangles are hard to see on the map. This actually looks better.

How did this dataset get through the registration process as a point? Does this mean spatial search won't return it as a result? Ideally we could render points and still have spatial search function...

lukecampbell commented 7 years ago

Yeah we can define points no problem, but search shouldn't yield this dataset. I'll need to investigate further.

benjwadams commented 7 years ago

NB: this doesn't come up in the spatial search, unlike the other King Island datasets which have BBox extents.

mwengren commented 7 years ago

I noticed it wasn't hit on the spatial filter, unlike the other parameters at this site. How did this particular point dataset get through the Registry without being converted to a bounding box though?

lukecampbell commented 7 years ago

We'll need to investigate

benjwadams commented 7 years ago

It's only happening for AOOS 48114 grid_longitude and grid_latitude.

nice try but no dice, barked up the wrong tree :( ~~It bears noting that ckanext-spatial automatically turns zero-area bounding boxes into points prior to storing the data in the database. https://github.com/ckan/ckanext-spatial/blob/a22c6267d7bda145ec8446b6c7357ba047bd3390/ckanext/spatial/plugin.py#L215-L216 Unfortunately, you may not be able to fetch that resultant dataset spatially depending on your choice of "spatial backend". https://github.com/ioos/catalog-harvesting/blob/2cbdc4eaae25de4b2e769c35a1e6743c812d3fba/catalog_harvesting/records.py#L190~~
~~should be comparing westBoundingLongitude to eastBoundingLongitude, but it could also be comparing extentTypeCode to another element, which would fail the equality, and not do the bounds expansion routine. See: https://geo-ide.noaa.gov/wiki/index.php?title=EX_GeographicBoundingBox and~~ look at the ISO record for https://registry.ioos.us/waf/AOOS/aoos_sensors_aoos_bering_strait_data_grid_longitude.nc.iso.xml

benjwadams commented 7 years ago

OK, after a few false leads, I believe I've tracked down what is happening.

When importing into the registry, the bounding box is modified if and only if the lower left bbox coords are equal to the upper right, i.e. the extent is a true point:

https://github.com/ioos/catalog-harvesting/blob/2cbdc4eaae25de4b2e769c35a1e6743c812d3fba/catalog_harvesting/records.py#L190

However, ckanext-spatial's harvest logic will transform a BBOX into a point if both lats are equal, if both lons are equal, or both.

https://github.com/ckan/ckanext-spatial/blob/a22c6267d7bda145ec8446b6c7357ba047bd3390/ckanext/spatial/harvesters/base.py#L353-L366

48114's grid_longitude file in the previous comment has the following bbox coords, which pass unchanged through catalog-harvesting's filter but are considered a point by ckanext-spatial since the latitudes are equal:

<gmd:westBoundLongitude><gco:Decimal>-169.45738333333333</gco:Decimal></gmd:westBoundLongitude>
<gmd:eastBoundLongitude><gco:Decimal>-169.45345333333333</gco:Decimal></gmd:eastBoundLongitude>
<gmd:southBoundLatitude><gco:Decimal>65.009</gco:Decimal></gmd:southBoundLatitude>
<gmd:northBoundLatitude><gco:Decimal>65.009</gco:Decimal></gmd:northBoundLatitude>
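
The mismatch between the two checks can be sketched in a few lines of Python. This is only an illustration: `registry_expands` and `ckan_treats_as_point` are hypothetical names standing in for the logic at the two links above, not the actual functions in catalog-harvesting or ckanext-spatial.

```python
def registry_expands(west, south, east, north):
    """Hypothetical stand-in for the registry check: expand only when the
    lower-left corner equals the upper-right corner (a true point)."""
    return (west, south) == (east, north)


def ckan_treats_as_point(west, south, east, north):
    """Hypothetical stand-in for the harvester check: collapse to a point
    when either the longitudes or the latitudes are equal."""
    return west == east or south == north


# Bounding box from the 48114 grid_longitude ISO record
bbox = (-169.45738333333333, 65.009, -169.45345333333333, 65.009)

print(registry_expands(*bbox))      # False: longitudes differ, so no expansion
print(ckan_treats_as_point(*bbox))  # True: latitudes are equal
```

A bbox that is degenerate in only one dimension therefore skips the expansion step and is still collapsed to a point downstream.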

Logs for the harvesters corroborate this, as they contain log entries corresponding to this logic. Also note that the spatial info of the dataset does not get indexed by Solr, since the Solr spatial backend in use can't handle point geometries.

2017-03-16 17:23:22,483 DEBUG [ckanext.spatial.harvesters.base.import] Import stage for harvest object: f39e1516-8b58-432f-8a41-4be02bbfd660
2017-03-16 17:23:22,492 DEBUG [ckanext.spatial.validation.validation] Starting validation against profile(s) iso19139ngdc
2017-03-16 17:23:22,534 DEBUG [ckanext.spatial.validation.validation] Validated against "ISO19139 XSD Schema (NGDC)"
2017-03-16 17:23:22,534 INFO  [ckanext.spatial.validation.validation] Validation passed
2017-03-16 17:23:22,548 WARNI [ckanext.spatial.model.harvested_metadata] Value not found for element 'title'
2017-03-16 17:23:22,571 DEBUG [ckanext.harvest.model] Point extent defined instead of polygon
2017-03-16 17:23:22,571 INFO  [ckanext.ioos_theme.harvesters.ioos_harvester] Checking for responsible-organisation
2017-03-16 17:23:22,605 ERROR [ckanext.spatial.plugin] Solr backend only supports bboxes (Polygons with 5 points), ignoring geometry {"type": "Point", "coordinates": [-169.457383333, 65.009]}
2017-03-16 17:23:22,651 INFO  [ckanext.spatial.harvesters.base.import] Document with GUID aoos:bering_strait:grid_longitude unchanged, skipping...
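
For reference, the condition described in the ERROR line can be pictured roughly as below. This is a sketch assuming GeoJSON-style dicts; `is_solr_bbox` is a hypothetical helper written for illustration, not ckanext-spatial's actual code, and the polygon coordinates are made up.

```python
import json


def is_solr_bbox(geometry):
    """Return True for a Polygon whose single ring has 5 positions
    (four corners plus the closing point), as the log message describes."""
    if geometry.get("type") != "Polygon":
        return False
    rings = geometry.get("coordinates", [])
    return len(rings) == 1 and len(rings[0]) == 5


# Geometry taken from the ERROR log entry above
point = json.loads('{"type": "Point", "coordinates": [-169.457383333, 65.009]}')

# Illustrative rectangle around the same location (made-up corner values)
box = {
    "type": "Polygon",
    "coordinates": [[
        [-169.46, 65.00], [-169.45, 65.00], [-169.45, 65.01],
        [-169.46, 65.01], [-169.46, 65.00],
    ]],
}

print(is_solr_bbox(point))  # False
print(is_solr_bbox(box))    # True
```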

mwengren commented 7 years ago

Ok, thanks @benjwadams. I think we have a good explanation of what's going on in this case. Since it only affects these particular datasets (grid_longitude and grid_latitude for AOOS 48114), which I don't think are too important because the obs are reflected in other datasets associated with the same station, we can probably ignore it.

We have this issue in the GitHub issue history in case this comes up for other datasets.

@lukecampbell is there a fix we can make to the catalog-harvesting module to expand the bbox slightly when either the lat or the lon bounds are equal, rather than only when both are equal (point extents)? I'm not sure it's really worth the effort though; this probably won't happen too often.
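
A minimal sketch of what such a change might look like, assuming the bounds are available as plain floats. `expand_degenerate_bbox` and the `pad` value are hypothetical and chosen only for illustration; they are not taken from catalog-harvesting, and longitude wrap-around at ±180 is not handled.

```python
def expand_degenerate_bbox(west, south, east, north, pad=0.0001):
    """Widen any zero-width dimension by `pad` degrees on each side.

    Hypothetical helper; `pad` is an arbitrary illustrative value.
    """
    if west == east:
        west, east = west - pad, east + pad
    if south == north:
        # Clamp so the padded latitudes stay within valid bounds
        south, north = max(south - pad, -90.0), min(north + pad, 90.0)
    return west, south, east, north


# Bounds from the 48114 grid_longitude record: only the latitudes are equal
w, s, e, n = expand_degenerate_bbox(
    -169.45738333333333, 65.009, -169.45345333333333, 65.009
)
assert (w, e) == (-169.45738333333333, -169.45345333333333)  # longitudes untouched
assert s < 65.009 < n  # latitude now spans a small range
```

Checking each dimension separately covers the true point case as well, since both branches fire when the two corners coincide.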