LearningToTalk / L2TDatabase

Helper functions for working with our lab's MySQL database
GNU General Public License v2.0

investigate no-data kids #18

Open tjmahr opened 8 years ago

tjmahr commented 8 years ago

Working on my vocabulary growth paper... I've got 14 kids with no PPVT or EVT scores during the first two years.

d %>% 
  filter(is.na(EVT_GSV_T1), is.na(PPVT_GSV_T1), 
         is.na(EVT_GSV_T2), is.na(PPVT_GSV_T2)) %>% 
  getElement("ResearchID") %>% 
  sort
#> [1] "048L" "059L" "070L" "102L" "601L" "617L" "618L" "621L"
#> [9] "635L" "648L" "650L" "662L" "672L" "687L"

I know there are cases where we assigned IDs for recruited children but those children never made it into the lab so never participated and never contributed data. Need to identify those cases and remove them from the database. Otherwise, we will overstate our Ns.

Some of these 14 presumably are those kinds of kids. A few might be kids who contributed partial data in year one (but not the EVT or PPVT) before dropping out.
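One way to tell the two cases apart is to count how many scores each child has across all measures, not just the EVT and PPVT. A minimal base-R sketch, assuming a wide data frame shaped like `d`; the toy rows, the IDs, and the `FruitStroop_T1` column are invented for illustration:

```r
# Toy stand-in for `d`; the IDs, values, and FruitStroop_T1 column are invented.
d_toy <- data.frame(
  ResearchID     = c("001L", "002L", "003L"),
  EVT_GSV_T1     = c(120, NA, NA),
  PPVT_GSV_T1    = c(115, NA, NA),
  FruitStroop_T1 = c(0.8, 0.5, NA),
  stringsAsFactors = FALSE
)

# Every column except the ID is treated as a score.
score_cols <- setdiff(names(d_toy), "ResearchID")

# Number of non-missing scores per child.
d_toy$n_scores <- rowSums(!is.na(d_toy[score_cols]))

# Children with no EVT/PPVT, split by whether anything else was collected.
no_vocab <- is.na(d_toy$EVT_GSV_T1) & is.na(d_toy$PPVT_GSV_T1)
never_tested <- d_toy$ResearchID[no_vocab & d_toy$n_scores == 0]
partial_data <- d_toy$ResearchID[no_vocab & d_toy$n_scores > 0]
```

Here `never_tested` would hold the IDs that were assigned but never produced any data, and `partial_data` the children who did something in the lab but have no vocabulary scores.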

janroslynedwards commented 8 years ago

Can we ask the DIRT team to identify these kids?

tjmahr commented 8 years ago

That's what I was thinking.

janroslynedwards commented 8 years ago

648, 662, 672, 687 aren't even in the participant info Excel file.

601, 617, 618: no tasks except fruit stroop (617 also min pair)

621 no tasks at all

635 eyetracking, fruit stroop, verbal fluency

650 min pairs only

48 no tasks except one block of eye tracking (98% missing data)

59 only eye tracking and min pair ("possible unidentified language disorder")

70 only completed BRIEF

102 only eye tracking completed

All of these children should be excluded from the database (they were our old "excludes", at least for UW).
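These notes could be kept as a small exclusion table next to the data, so the reason for each exclusion travels with the ID. A rough base-R sketch: the `Reason` labels paraphrase the notes above, the `L` suffix on the IDs is assumed from the query output earlier in the thread, and `d_toy` (including the ID "001L") is an invented stand-in for the real data frame:

```r
# The 14 IDs from this thread, grouped by the two broad reasons in the notes.
excludes <- data.frame(
  ResearchID = c("648L", "662L", "672L", "687L",
                 "601L", "617L", "618L", "621L", "635L",
                 "650L", "048L", "059L", "070L", "102L"),
  Reason = rep(c("not in participant info file", "little or no task data"),
               times = c(4, 10)),
  stringsAsFactors = FALSE
)

# Toy stand-in for the real data; "001L" is an invented ID.
d_toy <- data.frame(ResearchID = c("001L", "048L", "687L"),
                    stringsAsFactors = FALSE)

# Keep only children who are not on the exclusion list.
d_kept <- d_toy[!d_toy$ResearchID %in% excludes$ResearchID, , drop = FALSE]

# Tally exclusions by reason, e.g. for reporting Ns.
reason_counts <- table(excludes$Reason)
```

Filtering against a table like this, rather than deleting rows by hand, keeps a record of who was dropped and why.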

tjmahr commented 8 years ago

Thanks Jan!

First four kids don't really exist basically. For the others, I'm willing to bet they only came into the lab for just one visit, and they just weren't ready for our tasks. ("Ten additional children were recruited and participated in a lab visit but were not developmentally ready for our assessment battery.")

marybeckman commented 8 years ago

On Mar 5, 2016, at 1:54 PM, janroslynedwards notifications@github.com wrote:

648, 662, 672, 687 aren't even in participant info excel file.

There are TimePoint1 files on tier3 for 687L for the MP and RWL tasks and there are stimlog files for him for the RWR task on both tier3 and tier2, albeit no corresponding wav file and no record of a search for the wav file in either of these “lab notebook” tables:

https://l2twiki.slhs.umn.edu/Learning2Talk/L2TRealWordSegmIssuesTP1#table4
https://l2twiki.slhs.umn.edu/Learning2Talk/RealWordRepTP1LabNotebook#longitudinal

tjmahr commented 8 years ago

Looks like 687L was excluded because they "withdrew", according to the attrition spreadsheet from an email circa 1/21/2015.

There used to be recordings with this participant's ID. I know this because I asked @aajohnson4 to fix all multi-part RWR recordings. For 687, she documented:

RealWordRep_687L32MS1.WAV
#Archived. Training session with examiners (Participant didn’t show up). I also put the .edat and .txt files into Archive
RealWordRep_687L32MS1_part1.WAV
#Archived. Training session with examiners, part 1
RealWordRep_687L32MS1_part2.WAV
#Archived. Training session with examiners

Not sure yet whether the child ever actually did the task.

Anyway, okay, so the problem is a bit more complicated. Being in/out of the participant info spreadsheet is not a reliable indicator of whether a child came into the lab for the study.
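A cross-check between sources might look something like the sketch below: pull the research ID out of each file name and compare against the spreadsheet. The file names are the ones quoted above; the naming pattern in the regex and the `ids_in_info` vector are assumptions for illustration, not the lab's actual conventions or data.

```r
# File names taken from the notes above.
files <- c("RealWordRep_687L32MS1.WAV",
           "RealWordRep_687L32MS1_part1.WAV",
           "RealWordRep_687L32MS1_part2.WAV")

# Assumed pattern: Task_<3 digits><one capital letter>...; capture the ID.
ids_with_files <- unique(sub("^[^_]+_([0-9]{3}[A-Z]).*$", "\\1", files))

# Invented stand-in for the IDs listed in the participant info spreadsheet.
ids_in_info <- c("601L", "617L")

# IDs with files on disk but no spreadsheet row, and the reverse.
files_but_no_info <- setdiff(ids_with_files, ids_in_info)
info_but_no_files <- setdiff(ids_in_info, ids_with_files)
```

A file existing is not proof of participation either (the 687L recordings above were examiner training sessions), so this only produces a list of IDs to check by hand.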

marybeckman commented 8 years ago

On Mar 5, 2016, at 2:48 PM, TJ Mahr notifications@github.com wrote:

Looks like 687L was excluded because they "withdrew", according to attrition spreadsheet from an email circa 1/21/2015.

This is totally outside my ken, but could it be worth your while to check files such as these:

ScoringandChecking_UMN_Longitudinal_4_21_2014.xlsx
ScoringandChecking_UMN_Longitudinal_6_7_2014.xlsx

in the tier3:ParticipantData/Minnesota/Longitudinal/Participants directory?

tjmahr commented 8 years ago

I'll coordinate with DIRT on sorting out what's going on with these partial-data children.