NSAPH-Projects / mcbs-mbsf-exploratory

MCBS x MBSF 2015 exploratory analysis

QC #2

Open atrisovic opened 2 years ago

atrisovic commented 2 years ago

Up-to-date

Obsolete

daniellebraun commented 2 years ago

Hi, these are really strong assumptions. For example, zip codes may change over the years (as people move), and we would like to keep both: for each year we would like a zip code. I also don't think we should disregard race completely if there are inconsistencies. Also, hmo_mo is restricted to 0 only for the second dataset, on cardio outcomes; for the first we aren't making this restriction.
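To make the point about per-year zip codes and race concrete, here is a minimal sketch. It assumes the data can be laid out as one row per person per year; the field names (`bene_id`, `year`, `zip`, `race`) and all values are hypothetical, not taken from the actual files.

```python
from collections import defaultdict

# Hypothetical person-year rows; field names and values are made up for illustration.
person_years = [
    {"bene_id": "A", "year": 2014, "zip": "02115", "race": "1"},
    {"bene_id": "A", "year": 2015, "zip": "02139", "race": "1"},
    {"bene_id": "B", "year": 2014, "zip": "10027", "race": "2"},
    {"bene_id": "B", "year": 2015, "zip": "10027", "race": "5"},
]

# One zip code per (person, year), so a move between years is preserved.
zip_by_person_year = {(r["bene_id"], r["year"]): r["zip"] for r in person_years}

# Collect every race value reported for a person instead of dropping the variable.
races = defaultdict(set)
for r in person_years:
    races[r["bene_id"]].add(r["race"])

# Flag people whose race differs across years so they can be reviewed, not discarded.
race_inconsistent = {bene: len(values) > 1 for bene, values in races.items()}
```

Here person A keeps a different zip code for each year, and person B is flagged for review because the reported race changes, rather than having race disregarded.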

atrisovic commented 2 years ago

Hey,

daniellebraun commented 2 years ago
atrisovic commented 2 years ago

Yes, she merged everything (👏) and this issue is a checklist for QC.

daniellebraun commented 2 years ago

I don't think we can get away with one dataset, since restricting to hmo=0 will cut our mortality data in half, and for the mortality outcome there is no need for such a restriction. Since we are so tight on sample size, I don't think we can afford this.

atrisovic commented 2 years ago

No no, we keep all hmos and have a single dataset with both cvds and dods (and the hmos).

(The two datasets are essentially one selection away (hmo==0), so most of the data would be duplicated in that case.)

daniellebraun commented 2 years ago

Yeah, just save one big dataset, but then subset the data for him as requested. The mortality file doesn't need to restrict on hmo. The cvd file needs to restrict to 0 hmo but will need mortality info as well, since mortality is a censoring event if it happens before cvd.
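A rough sketch of that split, assuming one record per person. Only `hmo_mo` appears in this thread; the other field names (`bene_id`, `cvd_day`, `death_day`) and the idea of measuring event times in days are assumptions for illustration.

```python
# Hypothetical person-level records from the single combined dataset.
records = [
    {"bene_id": "A", "hmo_mo": 0, "cvd_day": 120, "death_day": None},
    {"bene_id": "B", "hmo_mo": 0, "cvd_day": 300, "death_day": 200},
    {"bene_id": "C", "hmo_mo": 6, "cvd_day": None, "death_day": 90},
    {"bene_id": "D", "hmo_mo": 0, "cvd_day": None, "death_day": None},
]

def mortality_subset(rows):
    """Mortality analysis: no HMO restriction, so keep every record."""
    return list(rows)

def cvd_subset(rows):
    """CVD analysis: keep hmo_mo == 0 and mark death before CVD as censoring."""
    out = []
    for row in rows:
        if row["hmo_mo"] != 0:
            continue
        cvd, death = row["cvd_day"], row["death_day"]
        # Death censors the CVD outcome when it occurs before any CVD event.
        censored_by_death = death is not None and (cvd is None or death < cvd)
        cvd_event = cvd is not None and not censored_by_death
        out.append({**row, "cvd_event": cvd_event, "censored_by_death": censored_by_death})
    return out

mortality = mortality_subset(records)  # all four people
cvd = cvd_subset(records)              # A, B and D only
```

Because both subsets come from the same saved dataset, the hmo filter is applied only at analysis time and the mortality information stays available to the cvd file.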

laurenflynn commented 2 years ago

I currently have the data set restricting to hmo_mo==0 but I can remove this filter so that it can be filtered later during the analysis. Then I will work on the QC checkpoints Ana has listed.