Open pmayd opened 1 year ago
Quick win: maybe an R Shiny app based on this: https://appsilon.com/automated-r-data-quality-reporting/. Not the best option, but at least it's a quick approach without much effort.
Absolutely for it. If it is easy to use and applicable, why not? We don't have to reinvent the wheel. And there are also tools for GCP, of course, so we should also look into Google services that analyze the data and show the lineage, for example!
With what we know about Looker Studio, it should be feasible to create a data quality dashboard/report that lists important statistics for all columns of interest, such as the patient data at the moment.
The output could be a dashboard with a table and/or graph for each column in the patient data, listing interesting statistics such as the number of non-empty entries, the number of missing entries, min, max, and avg for continuous variables, the number of distinct values and the most common values for categorical data, etc.
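To make the list of statistics above a bit more concrete, here is a rough Python sketch of how the per-column numbers could be computed before they are fed into a dashboard. It is only an illustration using the standard library: the function names `profile_column` and `profile_table`, the `top_n` parameter, the rule that `None` or an empty string counts as missing, and the sample rows are all assumptions made up for this example, not part of any existing tool or of the real patient data.

```python
from collections import Counter
from statistics import mean


def profile_column(values, top_n=3):
    """Summarise one column given as a list of raw values.

    Assumption: None and the empty string count as missing.
    """
    present = [v for v in values if v is not None and v != ""]
    stats = {
        "non_empty": len(present),
        "missing": len(values) - len(present),
        "distinct": len(set(present)),
    }
    # Treat the column as continuous only if every present value is a number.
    is_numeric = bool(present) and all(
        isinstance(v, (int, float)) and not isinstance(v, bool) for v in present
    )
    if is_numeric:
        stats.update(min=min(present), max=max(present), avg=mean(present))
    else:
        # Categorical column: report the most frequent values with counts.
        stats["most_common"] = Counter(present).most_common(top_n)
    return stats


def profile_table(rows, top_n=3):
    """Profile every column of a table given as a list of dicts."""
    columns = []
    for row in rows:
        for key in row:
            if key not in columns:
                columns.append(key)
    return {
        col: profile_column([row.get(col) for row in rows], top_n)
        for col in columns
    }


# Hypothetical sample rows, invented for illustration only.
patients = [
    {"age": 34, "sex": "f"},
    {"age": None, "sex": "m"},
    {"age": 51, "sex": "f"},
    {"age": 47, "sex": ""},
]
report = profile_table(patients)
for column, stats in report.items():
    print(column, stats)
```

The resulting dictionary has one entry per column, so it could be written to a table and used as a data source for a report. Whether this pre-computation is needed at all depends on what the dashboard tool can aggregate on its own.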