CorrelAidSwitzerland / a4d

The project repository for the Data4Good project in collaboration with A4D

data quality report #68

Open pmayd opened 1 year ago

pmayd commented 1 year ago

With what we know about Looker Studio, it should be feasible to create a data quality dashboard/report that lists important statistics for all columns of interest, such as the patient data at the moment.

The output could be a dashboard with a table and/or graph for each column in the patient data, listing statistics such as the number of non-empty entries, the number of missing entries, min, max, and avg for continuous variables, and the number of distinct values and the most common values for categorical data.
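To make the list of statistics concrete, here is a minimal sketch in plain Python of what such a per-column profile could compute. It is only an illustration, not the project's pipeline: the function names `profile_column` and `profile_table`, the rule that `None` and the empty string count as missing, and the sample columns `age` and `province` are all assumptions made up for this example.

```python
from collections import Counter
from statistics import mean


def is_missing(value):
    # Assumption: only None and the empty string count as "missing".
    return value is None or value == ""


def is_number(value):
    # bool is a subclass of int, so exclude it explicitly.
    return isinstance(value, (int, float)) and not isinstance(value, bool)


def profile_column(values, top_n=3):
    """Summarise one column given as a list of raw values."""
    present = [v for v in values if not is_missing(v)]
    stats = {
        "non_empty": len(present),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    if present and all(is_number(v) for v in present):
        # Continuous variable: min, max, average.
        stats.update(min=min(present), max=max(present), avg=mean(present))
    else:
        # Categorical variable: most common values with their counts.
        stats["most_common"] = Counter(present).most_common(top_n)
    return stats


def profile_table(rows):
    """Profile every column of a table given as a list of dicts."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return {name: profile_column(vals) for name, vals in columns.items()}


# Hypothetical sample rows, not real patient data.
rows = [
    {"age": 12, "province": "A"},
    {"age": None, "province": "B"},
    {"age": 15, "province": "A"},
    {"age": 9, "province": ""},
]
report = profile_table(rows)
```

The resulting `report` is a plain dict keyed by column name, so it could be written to a table and used as the data source for a dashboard.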

lboel commented 11 months ago

A quick win might be an R Shiny app based on this: https://appsilon.com/automated-r-data-quality-reporting/. Not the best option, but at least it's a quick approach without much effort.

pmayd commented 11 months ago

Absolutely for it. If it is easy to use and applicable, why not? We don't have to reinvent the wheel. There are also tools for GCP, of course, so we should also investigate Google services that analyze the data and, for example, show its lineage.