ministryofjustice / find-moj-data

Find MOJ data service • This repository is defined and managed in Terraform
https://find-moj-data.service.justice.gov.uk/
MIT License

Spike: How can we reduce empty description fields? #1022

Open murdo-moj opened 3 weeks ago

murdo-moj commented 3 weeks ago

Following #965 and #967, we should have a view of the metadata population rates of our data, so that we can drive targeted improvements. For entries with a listed custodian, we have more control to drive users to update metadata. However, a good many entries have no custodian or description. For these entries we will need to be more proactive, either by talking to data modellers to establish where this information is or by searching for it ourselves.
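As a rough illustration of the kind of view described above, here is a minimal sketch that computes per-platform population rates and lists the entries that would need proactive follow-up. The entry structure, the field names (`platform`, `description`, `custodian`) and the sample records are all hypothetical and are not taken from the catalogue's real schema. "No custodian or description" is read here as missing both, which is also an assumption.

```python
from collections import defaultdict

# Hypothetical sample entries; the real catalogue schema may differ.
entries = [
    {"platform": "CaDeT", "description": "example description", "custodian": "team@example.com"},
    {"platform": "CaDeT", "description": "", "custodian": None},
    {"platform": "Performance Hub", "description": None, "custodian": None},
    {"platform": "Performance Hub", "description": "example description", "custodian": None},
]


def is_populated(value):
    """Treat None, empty and whitespace-only strings as unpopulated."""
    return isinstance(value, str) and value.strip() != ""


def population_rates(entries, fields=("description", "custodian")):
    """Return {platform: {field: fraction of entries with that field populated}}."""
    totals = defaultdict(int)
    filled = defaultdict(lambda: defaultdict(int))
    for entry in entries:
        platform = entry["platform"]
        totals[platform] += 1
        for field in fields:
            if is_populated(entry.get(field)):
                filled[platform][field] += 1
    return {
        platform: {field: filled[platform][field] / totals[platform] for field in fields}
        for platform in totals
    }


def needs_proactive_search(entries):
    """Entries with neither a custodian nor a description (assumed interpretation)."""
    return [
        entry
        for entry in entries
        if not is_populated(entry.get("custodian"))
        and not is_populated(entry.get("description"))
    ]


rates = population_rates(entries)
orphans = needs_proactive_search(entries)
```

Entries returned by `needs_proactive_search` are the ones nobody can be asked to fix, so they are the candidates for a conversation with data modellers. The rest can be routed to their listed custodian.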

Options

teeceeas commented 3 days ago

The current plan is to discuss the issues with the DIP team.

Questions raised by Greg:

  1. The roll-out of the new ownership model across MoJ is obviously going to take some time (and will proceed more slowly than the roll-out of the catalogue), so our intent to collect the data owner/steward/custodian roles is hitting some obvious hurdles. How should we handle this going forward? It doesn't feel right asking people to provide names for data custodians or stewards if the people we're asking don't know what those roles mean.
  2. On a similar note - is the approach we're taking with the CaDeT data correct? We've assigned those contacts as Data Custodians and will be doing some research with them, but is there more we need to do here?
  3. In scenarios where we have poor metadata on the catalogue, is there a role for the data governance team to pick up gaps and try to drive that quality up? For example, if we have several datasets without owners/stewards/custodians, who should be steering the work to address those gaps?

Figures

After reviewing the data, the following datasets don't have custodians:

  - CJS Dashboard (50): team email address to be added
  - Performance Hub (740): need to confirm with Jeremy
  - Justice Data (1): only the main entry doesn't have a custodian; the underlying datasets have group email addresses
  - CaDeT (509): this is made up of 46 databases that need custodian info, of which 18 have custodians for the underlying tables. 28 tables still need custodians.

Delius & Oasys need to be reviewed, as there seem to be some discrepancies.