Closed charvolant closed 3 years ago
Also, to be tested:
Process stateProvince and country but only if not derived from lat/long.
This is done for stateProvince in the extensions to LocationInterpreter.
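A minimal sketch of the "only if not derived from lat/long" rule, in Python for brevity. The function name, the `derived` argument (standing in for whatever a coordinate-based lookup returns) and the tiny `VOCAB` table are all hypothetical; this is not the LocationInterpreter code.

```python
# Hypothetical vocabulary used to normalise supplied values.
VOCAB = {
    "au": "Australia",
    "australia": "Australia",
    "nsw": "New South Wales",
}


def interpret_admin_areas(record, derived=None):
    """Prefer values derived from coordinates; otherwise process supplied ones.

    record  -- dict of supplied Darwin Core terms
    derived -- dict of values already derived from lat/long (assumed input)
    """
    derived = derived or {}
    result = {}
    for field in ("country", "stateProvince"):
        if derived.get(field):
            # A value derived from lat/long wins; the supplied value is not processed.
            result[field] = derived[field]
            continue
        supplied = (record.get(field) or "").strip()
        if supplied:
            # Fall back to the supplied value, normalised where the vocabulary knows it.
            result[field] = VOCAB.get(supplied.lower(), supplied)
    return result
```

With no derived values both supplied fields are processed; once a stateProvince has been derived from coordinates, the supplied one is ignored.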
Do we have any grid reference data for ALA? I have a feeling the UK NBN require this code.
I don't think we should be parsing verbatim values, as they should only be provided as a visual check for the non-verbatim versions (my guess). They also tend to be pretty inconsistent, which makes parsing a bit hit and miss anyway.
No decimal lat/lng: this could be derived from geospatial_kosher via the query -geospatial_kosher:*.
Reproject onto the supplied coordinatePrecision (rounded to 6 decimals as a default)
Here is a brief explanation of why 6 digits are used: https://github.com/gbif/pipelines/issues/466
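As a rough illustration of rounding to the supplied coordinatePrecision with a 6-decimal default, here is a sketch in Python. It assumes coordinatePrecision is given as a decimal fraction such as 0.001 (as in Darwin Core); the function names and the half-up rounding mode are my own choices, not what the pipelines code does.

```python
from decimal import Decimal, ROUND_HALF_UP

DEFAULT_DECIMALS = 6  # default taken from the comment above


def decimals_for(coordinate_precision=None):
    """Number of decimal places implied by a precision such as 0.001."""
    if coordinate_precision is None:
        return DEFAULT_DECIMALS
    precision = Decimal(str(coordinate_precision))
    if precision <= 0:
        # Unusable precision: fall back to the default.
        return DEFAULT_DECIMALS
    # normalize() strips trailing zeros, so 0.0010 and 0.001 both give exponent -3.
    exponent = precision.normalize().as_tuple().exponent
    return max(0, -exponent)


def round_coordinate(value, coordinate_precision=None):
    """Round a reprojected coordinate so it does not claim extra precision."""
    places = decimals_for(coordinate_precision)
    quantum = Decimal(1).scaleb(-places)  # 10 ** -places
    rounded = Decimal(str(value)).quantize(quantum, rounding=ROUND_HALF_UP)
    return float(rounded)
```

For example, `round_coordinate(149.987654321, 0.001)` gives `149.988`, and with no precision supplied the value is cut back to 6 decimals.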
We ticked off some things: eastings/northings (around 40K in biocache with unknown decimalLat/Long) or verbatim coordinates; the grid reference system is the UK Ordnance Survey grid; the spatially-invalid issue in GBIF should cover missing decLat/Long; not processing any verbatim fields.
Most of this functionality seems to come from a time when we were working hard to extract and standardise any location information that we could.
The coordinate precision flags and processing are a bit more important, however. @M-Nicholls could you comment on the usefulness of the Missing Coordinate Precision flag, or the Too Precise flags?
Also, if the Missing Coordinate Precision flag plays a part in the spatially-invalid flag, maybe we should keep it.
Processing based on coordinatePrecision (e.g. making sure that reprojections maintain the coordinate precision) is important, because reprojections create lots of decimal places. See the TDWG paper on Georeferencing Best Practice.
Re Missing Coordinate Precision flag, and the Too Precise flags
They are useful bits of data in assessing whether to trust the data as provided. Too many decimal places is an indicator that something's dodgy, usually a conversion (datum or coordinate system) or a georeferencing artifact. Missing coordinate precision means you can't check whether the coords meet the expected precision, so you can't validate or improve the record.
These look like pretty straightforward checks. Any reason not to include them?
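To show how simple the two checks could be, here is a sketch in Python. The flag names, the function names and the 6-decimal threshold are assumptions for illustration only; they are not the biocache-store or pipelines definitions.

```python
from decimal import Decimal

MAX_DECIMALS = 6  # assumed threshold for "too precise"


def decimal_places(text):
    """Count decimal places in a coordinate exactly as supplied (a string)."""
    exponent = Decimal(text.strip()).as_tuple().exponent
    return max(0, -exponent)


def precision_flags(lat, lon, coordinate_precision=None, max_decimals=MAX_DECIMALS):
    """Return hypothetical flag names for a record's supplied coordinates."""
    flags = []
    if coordinate_precision in (None, ""):
        # Without a stated precision the coordinates cannot be validated against it.
        flags.append("MISSING_COORDINATE_PRECISION")
    if max(decimal_places(lat), decimal_places(lon)) > max_decimals:
        # Long decimal tails usually point to a conversion or georeferencing artifact.
        flags.append("COORDINATES_TOO_PRECISE")
    return flags
```

The coordinates are taken as strings so that the digits the provider actually supplied are counted, rather than whatever a float conversion leaves behind.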
See #242 The GBIF LocationTransform/LocationInterpreter does country lookups but not stateProvince. The ALA LocationTransform does both.
From @djtfmartin on #322
I don't think we should (re)implement a flag for missing coordinate precision for 2 reasons:
(1) GBIF aren't doing this (and I think the reason why is (2)).
(2) Of the data in ALA, only 390,314 records have this value. That means we will flag 99.5% of records with this problem, which doesn't seem to be a useful thing to do. I think this will skew (as it does currently) a lot of dashboard-style breakdowns of data quality issues for datasets.
Sorry for not commenting on the other issue.
cc @M-Nicholls @javier-molina
Coordinate Precision is going away for the reason above. This has already been captured in the end-user documentation, and the remaining task for it is #372.
Functionality present in the biocache-store location processor that is either missing or sufficiently different to need a review.
Missing or questionable functionality