-
Action from previous meeting:
Ian to specify which checks he would require, to clarify what he means by data quality.
**Mike to Ian**
> Hello Ian
>
> This is my thread on [Validato…
-
Produce a notebook of data quality checks and pull out key insights from this.
Data quality checks:
- [ ] Amount of missing data
- [ ] Number of duplicate businesses
- [ ] Number of businesse…
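
A minimal sketch of the first two checks (missing data and duplicate businesses), using only the Python standard library. The field names `name` and `postcode`, the sample records, and the rule that two businesses are duplicates when their normalised name and postcode match are all assumptions for illustration; the real notebook would use the actual columns and matching rule.

```python
from collections import Counter


def missing_fraction(records, fields):
    """Return {field: fraction of records where the field is None or blank}."""
    total = len(records)
    result = {}
    for field in fields:
        missing = sum(
            1
            for r in records
            if r.get(field) is None or str(r.get(field)).strip() == ""
        )
        result[field] = missing / total if total else 0.0
    return result


def duplicate_count(records, key_fields):
    """Count records beyond the first for each normalised key."""
    keys = Counter(
        # Normalise by trimming whitespace and lower-casing (an assumed rule).
        tuple(str(r.get(f) or "").strip().lower() for f in key_fields)
        for r in records
    )
    return sum(n - 1 for n in keys.values() if n > 1)


# Hypothetical sample data, not taken from the real dataset.
businesses = [
    {"name": "Acme Ltd", "postcode": "AB1 2CD"},
    {"name": "acme ltd ", "postcode": "AB1 2CD"},
    {"name": "Bolt & Co", "postcode": None},
    {"name": "", "postcode": "EF3 4GH"},
]

missing = missing_fraction(businesses, ["name", "postcode"])
duplicates = duplicate_count(businesses, ["name", "postcode"])
```

Normalising before comparing is a design choice: exact matching would miss near-identical entries that differ only in case or spacing, while a looser rule risks merging distinct businesses.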
-
I have a table with **45M** records.
One of the columns has nearly **8.8M nulls**; however, the Spark Expectations null check result gives only **1M** rows as `failed_row_count`.
Similarly, the count…
-
Come up with a list of initial data checks to identify lots with potentially incorrect unit count.
- [x] Review PTS data and understand expected values for different kinds of lots
- [x] Identify …
-
Depends on #262 and to a lesser extent #266
# Problem Statement
Without manual analysis, it's hard to know what changed week to week in our data marts.
# Criteria for Success
NDC Description mart
-…
-
The data obtained from the WNBL / FIBA Livestats site is currently included in this package largely "as is" and needs closer checking as I expect there will be errors.
This issue will start off as …
-
- [ ] dates: some are early, but most are beyond 2017; will need to filter
- [ ] number of patients: too few may be a problem
- [ ] number of stores
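
A rough sketch of how these three checks might be run together. The record layout (`date`, `patient_id`, `store_id`), the cutoff of 1 January 2017, and the minimum patient count are all hypothetical; it also assumes the filter keeps records from 2017 onward, which the checklist does not state explicitly.

```python
from datetime import date

MIN_DATE = date(2017, 1, 1)  # assumed cutoff, not confirmed by the checklist
MIN_PATIENTS = 3             # hypothetical threshold for "too few" patients


def summarise(records, min_date=MIN_DATE, min_patients=MIN_PATIENTS):
    """Filter out early dates, then count distinct patients and stores."""
    kept = [r for r in records if r["date"] >= min_date]
    patients = {r["patient_id"] for r in kept}
    stores = {r["store_id"] for r in kept}
    return {
        "rows_dropped": len(records) - len(kept),
        "n_patients": len(patients),
        "n_stores": len(stores),
        "too_few_patients": len(patients) < min_patients,
    }


# Hypothetical sample records for illustration only.
sample = [
    {"date": date(2015, 6, 1), "patient_id": "p1", "store_id": "s1"},
    {"date": date(2018, 3, 9), "patient_id": "p1", "store_id": "s1"},
    {"date": date(2019, 7, 2), "patient_id": "p2", "store_id": "s2"},
]

summary = summarise(sample)
```

Counting patients and stores after the date filter, rather than before, means the "too few" flag reflects the data that would actually be analysed.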
-
As underpass shifts to more focus on data quality, a document needs to be created with data quality goals. There are two types of data quality analysis. The 1st is what can be done in a minute time fr…
-
Project summary
One of our 2024 goals is to speed up the release of our datasets. This project focuses on improving the QAQC process for PLUTO, aiming to build it within a week. We'll use PLUTO…
-
General discussion of data quality checks we can perform.