Open stevelewis99 opened 6 years ago
Also, if we really don't want all the extra column mappings from this 'gridlink' product (I think it would be useful though if we can?) then this is another source for purely geocoding from postcodes and nothing else. https://www.ordnancesurvey.co.uk/business-and-government/products/code-point-open.html
HERE told us it's exactly what they use.
But the data in the national postcode lookup table comes from ordnance survey, so it's all the same data.
User guide here too with all the details, if needed. https://data.gov.uk/dataset/e7308379-35af-46e6-a570-8825fec1e008/national-statistics-postcode-lookup-may-2018-user-guide
@hannahblue any ideas if/when we could prioritise?
We're putting in place 3 days every 6 weeks for the Data Team to update and add data sets in the DO. We have some other things in the queue like updating the Census for Canada, US, and Australia as well as requests for adding Japan data.
We also have a number of geocoding improvements on the roadmap to do things like return metadata, and let users store the results of geocodes among other things.
We'll need to figure out what the level of effort is for implementing this, where it fits in the priority, and whether this can be added during one of the 3 data update days or needs to be part of the geocoding improvement project. I definitely understand the value, and want to get this in as soon as it makes sense. Thanks for raising the issue.
cc @juanignaciosl
Context
Currently the internal geocoder in CARTO doesn't really support the geocding of UK postcodes very well, success rate and accuracy is poor - but only due to missing data. Reasons to address:
Manual investigation
The Office for National Statistics releases a CSV: National Statistics Postcode Lookup (Latest) Centroids on a regular basis (I think 6 months)
This lookup table allows matching of a full UK postcode to lat/long (as well as a huge amount of other boundary systems.. if we wanted to leverage that later too, maybe in the DO?... but for now I'm just focussed on mapping postcode strings to their centroids).
Here's a screenshot of the columns in the file:
To test this dataset could work, I did the following:
Reduced the column set down, to only include the postcode, latitude, longitude, positional quality
Removed all whitespace from the postcode, renamed to postcode_nospace
Uploaded to CARTO (where the import guessing converted the latitude and longitude into points)
Wrote a db function to take a postcode string from a source dataset to be geocoded, remove the white space from the string, select the_geom from the lookup dataset created in [3]
Used sql update statement on the source dataset to write into the_geom the result from the function.
It seems to work perfectly. I have to go to a customer (GVA) and train them on how to do this manual process, give them the function etc. as a short term fix for them.
But it's pretty much doing the same job as the internal geocoder, from what I understand - so it would be AWESOME if we could just massage this lookup data into our existing mechanism.
It could save us money on third party services, as well as make us look way better, and most importantly customer experience would improve too when using UK postcodes (common scenario).
If you have any questions or want more info please reach out to me!
cc @hannahblue
p.s. in future this data could also support reverse geocoding .. given a point, find the nearest postcode.
To do (I'll let you rewrite these I'm just inventing steps here)