LearningToTalk / L2TDatabase

Helper functions for working with our lab's MySQL database
GNU General Public License v2.0
0 stars 0 forks source link

add query for max maternal education within household #48

Closed tjmahr closed 7 years ago

tjmahr commented 7 years ago

Finally... what all this sibling-sleuthing and household documenting has been working towards.

Write a SQL query that gets the max education of maternal caregivers in a household.

We need the max because a few families have multiple mothers. Plus there are families where the grandmother is a primary caregiver.

I started a first draft of such a query a couple months ago, so I'll work from there.

tjmahr commented 7 years ago

And finally, after a week of data sleuthing/entry/cleaning, maternal education!

First, the helper queries. These help us clean up the educational values and aggregate down to one maternal education level per household. They are not very useful on their own.

Most usefully, I updated our three main study info queries q_Scores_TimePoint1, q_Scores_TimePoint2 and q_Scores_TimePoint3 to include columns for Maternal_Caregiver, Maternal_Education and Maternal_Education_Level.

Demo: count maternal education levels at TimePoint1

library(L2TDatabase)
library(dplyr)

l2t <- l2t_connect("./inst/l2t_db.cnf")
t1 <- l2t %>% tbl("q_Scores_TimePoint1") %>% collect()

t1 %>% 
  count(Maternal_Education_Level, Maternal_Education) %>% 
  rename(nParticipants = n) %>% 
  ungroup()
#> # A tibble: 10 × 3
#>    Maternal_Education_Level           Maternal_Education nParticipants
#>                       <int>                        <chr>         <int>
#> 1                         1        Less Than High School             5
#> 2                         2                          GED             7
#> 3                         3          High School Diploma            12
#> 4                         4      Some College (<2 years)             3
#> 5                         5      Some College (2+ years)            15
#> 6                         5 Technical/Associate's Degree            16
#> 7                         5                 Trade School             2
#> 8                         6               College Degree            78
#> 9                         7              Graduate Degree            75
#> 10                       NA                         <NA>            11

In practice, we would probably reduce the 7-point scale down to a 3- or 4-point scale.

janroslynedwards commented 7 years ago

Dear Tristan,

This is awesome! Are we still imputing maternal ed for participants who completed the questionnaire but did not answer this question (at any time point)?

Jan


From: TJ Mahr notifications@github.com Sent: Monday, December 12, 2016 1:35:20 PM To: LearningToTalk/L2TDatabase Subject: Re: [LearningToTalk/L2TDatabase] add query for max maternal education within household (#48)

And finally, after a week of data sleuthing/entry/cleaning, maternal education!

First, the helper querieshttps://github.com/LearningToTalk/L2TDatabase/blob/cc881bdc92de678825166b4e9b176a802d4f49a3/inst/views/caregiver_education.sql. These help us clean up the educational values and aggregate down to one maternal education level per household. They are not very useful on their own.

Most usefully, I updated our three main study info queries q_Scores_TimePoint1, q_Scores_TimePoint2 and q_Scores_TimePoint3 to include columns for Maternal_Caregiver, Maternal_Education and Maternal_Education_Level.

Demo: count maternal education levels at TimePoint1

library(L2TDatabase) library(dplyr)

l2t <- l2t_connect("./inst/l2t_db.cnf") t1 <- l2t %>% tbl("q_Scores_TimePoint1") %>% collect()

t1 %>% count(Maternal_Education_Level, Maternal_Education) %>% rename(nParticipants = n) %>% ungroup()

> # A tibble: 10 ? 3

> Maternal_Education_Level Maternal_Education nParticipants

>

> 1 1 Less Than High School 5

> 2 2 GED 7

> 3 3 High School Diploma 12

> 4 4 Some College (<2 years) 3

> 5 5 Some College (2+ years) 15

> 6 5 Technical/Associate's Degree 16

> 7 5 Trade School 2

> 8 6 College Degree 78

> 9 7 Graduate Degree 75

> 10 NA 11

In practice, we would probably reduce the 7-point scale down to a 3- or 4-point scale.

- You are receiving this because you are subscribed to this thread. Reply to this email directly, view it on GitHubhttps://github.com/LearningToTalk/L2TDatabase/issues/48#issuecomment-266529122, or mute the threadhttps://github.com/notifications/unsubscribe-auth/AHAHM3hXI7rMc34Mrc9Zg_7x7x77sPurks5rHaH4gaJpZM4LJZ2M.

tjmahr commented 7 years ago

We are not imputing right now. We caught most of the NAs by using information from later surveys, so there is only one missing maternal education in year 2 and year 3 of the longitudinal study. (Plus, the original imputation code no from 2014 longer works, or it didn't when I tried in 2015. )

janroslynedwards commented 7 years ago

27 low SES (less than 2 years of college) participants at TP1, 13 at TP2, 8 at TP3.

This is depressing...