digital-land / technical-documentation

Technical Documentation for the planning data service.
https://digital-land.github.io/technical-documentation/index.html

New data quality check: entity out of LPA bounds #115

Open Ben-Hodgkiss opened 6 days ago

Ben-Hodgkiss commented 6 days ago

Overview

Data management need a new data quality check implemented to detect when geometries supplied by an LPA fall outside the expected LPA boundary.

After discussion with @eveleighoj and @psd, we think this should be run at the dataset level rather than the resource level, by picking up the expectations work, because we don’t want to remove geometry facts identified as having issues.

This should apply to:

Jupyter notebook with demo code: https://github.com/digital-land/jupyter-analysis/blob/main/analysis/2024-08_geo_issues_demo/geo_issues_demo-bounds.ipynb

Tech Approach Suggestions: Can we run the expectations on the dataset.csv or sqlite file? We already have code by @chrisjohns51 that works at the dataset level on the sqlite file and runs expectations for retired entities. Note: @carloscoelho87 has done some work on an LPA boundary check for brownfield-land.

Instead of saving the rowids in the expectation issue/results, can we use entity, as it is easily available in the dataset csv files?
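To make the suggestion concrete, here is a minimal sketch of a dataset-level check that reads a dataset csv, flags entities whose point falls outside a boundary, and returns a single result keyed by entity rather than rowid. Everything in it is an assumption for illustration: the function names, the result shape, the `entity` and `point` column names, and the use of a plain ray-casting test on one simple ring. It is not the code from the notebook or the existing expectations work, and a real check would need a proper geometry library and full polygon/multipolygon handling.

```python
import csv
import io
import re


def parse_point(wkt):
    """Parse 'POINT (x y)' into (x, y); return None if the text does not match."""
    match = re.fullmatch(
        r"\s*POINT\s*\(\s*(-?[\d.]+)\s+(-?[\d.]+)\s*\)\s*", wkt or ""
    )
    return (float(match.group(1)), float(match.group(2))) if match else None


def point_in_ring(point, ring):
    """Ray-casting test for a point against one simple ring (list of (x, y))."""
    x, y = point
    inside = False
    for (x1, y1), (x2, y2) in zip(ring, ring[1:] + ring[:1]):
        # Only edges that straddle the horizontal line through the point count.
        if (y1 > y) != (y2 > y):
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def expect_entities_within_boundary(rows, boundary_ring):
    """Hypothetical expectation: one result, listing entities (not rowids)."""
    out_of_bounds = []
    for row in rows:
        point = parse_point(row.get("point"))
        # Rows without a parseable point are skipped, not reported.
        if point is not None and not point_in_ring(point, boundary_ring):
            out_of_bounds.append(int(row["entity"]))
    return {
        "name": "entities within LPA boundary",
        "passed": not out_of_bounds,
        "details": {"entities": out_of_bounds},
    }


# Made-up sample data: a square "boundary" and three entities.
SAMPLE_CSV = """entity,point
1,POINT (5 5)
2,POINT (15 5)
3,POINT (2 8)
"""
BOUNDARY = [(0, 0), (10, 0), (10, 10), (0, 10)]

result = expect_entities_within_boundary(
    csv.DictReader(io.StringIO(SAMPLE_CSV)), BOUNDARY
)
print(result)
```

Because the geometry facts are only reported and never dropped, the same function could be pointed at rows read from the sqlite file instead of the csv, as long as each row carries an entity number.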


May need an initial call with \@:63eba8f93f5f32273d83eb78 to discuss approach so far and how it can be productionised.

Acceptance Criteria/Tests

Ben-Hodgkiss commented 6 days ago

Jira Link: http://dluhcdigital.atlassian.net/browse/DATA-853

From OE (Trello, 25/09/24): Will need to revive the expectations work. Worth checking out "Entity organisation" by eveleighoj (Pull Request #21, digital-land/conservation-area-collection). We should run this test from a csv and output one result. Remove the current expectation_issue stuff.