OpenAddressesUK / roadmap

Open Addresses UK's roadmap. Learn more about Open Addresses at http://openaddressesuk.org/

Data quality measurement #39

Open giacecco opened 9 years ago

giacecco commented 9 years ago

Derived from @peterkwells' original at https://github.com/theodi/shared/issues/489#issuecomment-68853282 .

Once Open Addresses' "strategic definition" of address data quality is formalised (see Trello for progress), implement what is necessary to automatically calculate its quantifiable components and monitor them over time. For example, data freshness / currency could be measured by the average number of addresses added, changed or deleted per day over some period.
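A minimal sketch of the freshness measure described above, in Python. The event format (a date plus one of "added", "changed" or "deleted") and the function name `average_daily_changes` are assumptions made for illustration; they are not part of any existing Open Addresses code.

```python
from collections import Counter
from datetime import date

# Hypothetical change kinds; the real change log may use different labels.
KINDS = ("added", "changed", "deleted")


def average_daily_changes(events, start, end):
    """Average number of changes per day, by kind, between start and end inclusive.

    `events` is an iterable of (day, kind) pairs, where `day` is a datetime.date
    and `kind` is one of KINDS. This input shape is an assumption.
    """
    days = (end - start).days + 1
    if days <= 0:
        raise ValueError("end must not be earlier than start")
    # Count only the events that fall inside the measurement period.
    counts = Counter(kind for day, kind in events if start <= day <= end)
    return {kind: counts[kind] / days for kind in KINDS}


if __name__ == "__main__":
    sample = [
        (date(2015, 1, 1), "added"),
        (date(2015, 1, 1), "added"),
        (date(2015, 1, 2), "changed"),
        (date(2015, 1, 5), "deleted"),  # outside the period below, so ignored
    ]
    print(average_daily_changes(sample, date(2015, 1, 1), date(2015, 1, 4)))
```

The period length is a parameter, so the same function could report a weekly, monthly or per-edition figure.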

The improvement of the OA dataset from one edition to the next could also be described by a series of key performance indicators (e.g. completeness), consistent with the above definition of quality, together with their variation between editions.
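As a rough illustration of the edition-to-edition comparison, the sketch below takes two sets of measurements and reports the absolute and relative change for each one. The measurement names and the dictionary layout are hypothetical; the actual set depends on the quality definition once it is formalised.

```python
def edition_changes(previous, current):
    """Compare two editions' measurements, given as {name: numeric value} dicts.

    Only measurements present in both editions are compared. The relative
    change is None when the previous value is zero.
    """
    changes = {}
    for name in sorted(previous.keys() & current.keys()):
        before, after = previous[name], current[name]
        delta = after - before
        changes[name] = {
            "previous": before,
            "current": after,
            "absolute_change": delta,
            "relative_change": delta / before if before else None,
        }
    return changes


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    previous_edition = {"address_count": 100, "postcodes_covered": 0}
    current_edition = {"address_count": 150, "postcodes_covered": 20}
    for name, change in edition_changes(previous_edition, current_edition).items():
        print(name, change)
```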

All measurements made this way should also be published as open data, in both machine-readable and human-readable formats.

Some of these measurements should be available for automated publishing on the OA website. For example, every time a new edition is distilled we could show the most relevant indicators of improvement somewhere on the homepage.

peterkwells commented 9 years ago

Owen Boswarva has flagged that a DECC open data release could be useful for this, as it contains the number of electricity meters per postcode.

https://twitter.com/owenboswarva/status/560886776827740160
https://www.gov.uk/government/collections/mlsoa-and-llsoa-electricity-and-gas-estimates

This forms part of our volume quality measures. The previous spec is copied below. Other quality measures will come out of the analytics dashboard work.

Flagging as ready for volume quality measure to be developed. Flagging as suitable for community work.


Principles

Five principles apply to the quality measures:

User-Focussed - we want the measures to be meaningful to the users of the address products.

Comparability - we want all address products to be able to use the same quality measures so that users can compare different products.

Quantitative - all quality measures should be able to be expressed numerically.

Transparent - we want the measures and how they are calculated to be open and transparent, ideally so that a third party could perform an audit or calculate the measures themselves.

Extensibility - the measures should not prohibit address products from advertising other features and benefits, and should be able to be expanded and changed in the future as needs change.

Measures

Volumes

People are interested in the completeness of address coverage in a given product. Some users will be interested in national numbers, while others will be interested in the numbers for a specific area. However, different address products will define addresses differently. For example, the Royal Mail PAF product is focussed on postal delivery addresses, the Open Addresses service is interested in any address, and future address products may want to support more granular addresses such as the installation point of an Internet of Things device.

Without a means to compare these different address definitions, and the potential total volume of addresses that each are publishing, it will not be possible to strictly compare completeness.

Therefore address products should publish a volumetric count of their “addresses” that can be compared against an independent count of objects in a given area to provide a ratio.

For example the census counts for households or dwellings per postcode sector could be used when calculating this ratio. This will allow the ratio to be determined down to specific areas as required by a particular user.
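A sketch of how the ratio could be calculated per postcode sector, in Python. It assumes the product's addresses are available as a list of full UK postcodes and the independent count as a mapping from postcode sector to number of households; both input shapes and both function names are hypothetical. The postcode sector is taken as the outward code plus the first character of the inward code (e.g. "SW1A 1" for "SW1A 1AA").

```python
from collections import Counter


def postcode_sector(postcode):
    """Return the postcode sector of a full UK postcode, e.g. 'SW1A 1AA' -> 'SW1A 1'.

    Assumes a complete, valid postcode; the inward code is its last three characters.
    """
    compact = postcode.replace(" ", "").upper()
    return f"{compact[:-3]} {compact[-3]}"


def completeness_by_sector(address_postcodes, household_counts):
    """Ratio of published addresses to the independent household count, per sector.

    Sectors with a zero reference count are skipped to avoid dividing by zero.
    """
    published = Counter(postcode_sector(p) for p in address_postcodes)
    return {
        sector: published.get(sector, 0) / households
        for sector, households in household_counts.items()
        if households > 0
    }


if __name__ == "__main__":
    # Made-up figures, for illustration only.
    addresses = ["SW1A 1AA", "sw1a1ab", "EC1V 9XX"]
    households = {"SW1A 1": 4, "EC1V 9": 1, "N1 9": 0}
    print(completeness_by_sector(addresses, households))
```

Because each product defines "address" differently, a ratio above 1 is possible and is not necessarily an error; it only makes the products comparable against the same independent baseline.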

Potential sources of data:

England and Wales -
http://www.ons.gov.uk/ons/rel/census/2011-census/headcounts-and-household-estimates-for-postcodes-in-england-and-wales/index.html
https://www.nomisweb.co.uk/census/2011/ks101ew
https://www.nomisweb.co.uk/census/2011/ks401ew

Scotland - http://www.scotlandscensus.gov.uk/news/census-2011-population-and-household-estimates-scotland-release-1c-part-two

Northern Ireland - http://www.nisra.gov.uk/Census.html (unfortunately this site doesn’t seem to allow direct linking)