Public-Health-Scotland / phsmethods

An R package to standardise methods used in Public Health Scotland (https://public-health-scotland.github.io/phsmethods/)

Data profiling #51

Closed davidc92 closed 2 weeks ago

davidc92 commented 3 years ago

A function to carry out data profiling as typically required by data management. What I mean by that is:

I strongly suspect this ask may be fulfilled by another package, but if not I don't think this should be too tricky a function to write.

jvillacampa commented 3 years ago

As you say, there are a couple of packages out there for this purpose. See this one: https://towardsdatascience.com/simple-fast-exploratory-data-analysis-in-r-with-dataexplorer-package-e055348d9619 There are also functions for this type of job in different packages: https://www.r-bloggers.com/2018/08/exploratory-data-analysis-in-r-introduction/ Things like your second and third points can be done in base R: range(df$date); table(df$factor)
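
A minimal sketch of the two base R calls mentioned above, run on a small made-up data frame. The data and the column names `date` and `factor` are assumptions for illustration only. The `na.rm` and `useNA` arguments are additions to the calls as written above, included so that missing values do not hide the result:

```r
# Hypothetical example data; the values and column names are illustrative only.
df <- data.frame(
  date = as.Date(c("2020-01-15", "2020-03-02", "2019-11-30", NA)),
  factor = factor(c("a", "b", "a", "a"))
)

# Earliest and latest date; na.rm = TRUE drops missing dates first,
# otherwise range() would return NA for both ends.
date_range <- range(df$date, na.rm = TRUE)

# Frequency of each level; useNA = "ifany" adds an NA count only
# when missing values are present.
level_counts <- table(df$factor, useNA = "ifany")

print(date_range)
print(level_counts)
```

Without `na.rm = TRUE`, a single missing date turns the whole range into `NA`, which is itself a useful profiling signal but hides the actual span of the data.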

There might be specific jobs for which developing a new function is required, but perhaps it's more a matter of having a guide on how to do this type of stuff?

alice-hannah commented 3 years ago

This function might be similar to what you're after too - https://github.com/jackhannah95/jafun/blob/master/R/prop_missing.R
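
For comparison, a rough base R sketch of the same idea, the proportion of missing values in each column. This is not the code from the linked file; the helper name `prop_missing_sketch` and the example data are hypothetical:

```r
# Hypothetical helper, not the jafun implementation: returns the
# proportion of NA values in each column of a data frame.
prop_missing_sketch <- function(data) {
  stopifnot(is.data.frame(data))
  # is.na() on a data frame gives a logical matrix;
  # column means of that matrix are the proportions missing.
  colMeans(is.na(data))
}

# Made-up example data.
df <- data.frame(
  id = 1:4,
  value = c(2.5, NA, 3.1, NA),
  group = c("x", NA, "y", "z")
)

missing_summary <- prop_missing_sketch(df)
print(missing_summary)
```

The result is a named numeric vector with one entry per column, so it can be sorted or filtered to find the columns with the most missing data.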