tgerke closed 3 years ago
For EHR data, Synthea may be a good option. You can build custom state transition modules that simulate EHR data according to distributions we assign. That could be great for checking model validity against known parameters.
We can pull in Disney data from here: https://touringplans.com/walt-disney-world/crowd-calendar#DataSets. I'll stick this in an R package.
Disney data is here: www.github.com/LucyMcGowan/touringplans
devtools::install_github("LucyMcGowan/touringplans")
To explore:
- Tidy Tuesday may have candidates
- Our World in Data
- Sports data(?). Sabermetrics available, but baseball may be too US-centric.
- Econ! Andrew Heiss has examples?
- Education, employment
- Psych data for mediation
- Wellness/happiness (general community survey? personality studies)
A simulation exercise would be useful for demonstrating the "correct" answer and how close estimates get to identifying truth. Easier for these not to be medical.
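A rough sketch of what such a simulation exercise could look like, in base R. Everything here is an illustrative assumption rather than a proposed dataset: the variable names, the single confounder, the coefficients, and the true effect of 2 are all made up. The point is only that when we generate the data ourselves, we know the "correct" answer and can see how close each estimate gets.

```r
# Simulate data with a known causal effect so estimates can be checked against truth.
# All names and parameter values are hypothetical, chosen for illustration only.
set.seed(2021)

n <- 10000
true_effect <- 2

# One confounder that drives both the exposure and the outcome
confounder <- rnorm(n)
exposure <- rbinom(n, size = 1, prob = plogis(0.8 * confounder))
outcome <- true_effect * exposure + 1.5 * confounder + rnorm(n)

# Naive estimate: ignores the confounder, so it is biased
naive <- coef(lm(outcome ~ exposure))[["exposure"]]

# Adjusted estimate: conditions on the confounder
adjusted <- coef(lm(outcome ~ exposure + confounder))[["exposure"]]

print(c(truth = true_effect, naive = naive, adjusted = adjusted))
```

With this setup the adjusted estimate should land close to `true_effect`, while the naive one is pulled upward because the confounder raises both the chance of exposure and the outcome. The same template works for non-medical framings by renaming the variables.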