Closed by mcfrank 7 months ago
Here are the definitions:
Sex (sexe):
1 = nena = girl, 2 = nen = boy
Birth order (order_naixement):
1 = first, 2 = second, 3 = third, 4 = later
Bilingualism (MonoBiling) (not sure what exactly these definitions mean):
1 = monolingual, 2 = "family bilingualism" (bilinguisme familiar), 3 = "other bilingualisms" (Altres bilinguismes)
Maternal education (escola_mare):
1 = without schooling, 2 = primary, 3 = secondary, 4 = university
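For anyone decoding the metadata by hand, the definitions above can be written as a small lookup table. This is only a sketch: the variable names and labels come from the definitions in this thread, but the `CODEBOOK` layout and the `decode_row` helper are hypothetical and not part of the Wordbank import code.

```python
# Codebook transcribed from the definitions in this thread.
# CODEBOOK and decode_row are illustrative names, not Wordbank APIs.
CODEBOOK = {
    "sexe": {1: "girl", 2: "boy"},
    "order_naixement": {1: "first", 2: "second", 3: "third", 4: "later"},
    "MonoBiling": {
        1: "monolingual",
        2: "family bilingualism",
        3: "other bilingualisms",
    },
    "escola_mare": {
        1: "without schooling",
        2: "primary",
        3: "secondary",
        4: "university",
    },
}


def decode_row(row):
    """Return a copy of row with coded columns replaced by their labels.

    Values that are missing, non-numeric, or not in the codebook become None.
    """
    decoded = dict(row)
    for column, labels in CODEBOOK.items():
        if column in row:
            try:
                code = int(float(row[column]))
            except (TypeError, ValueError, OverflowError):
                code = None
            decoded[column] = labels.get(code)
    return decoded


# Made-up example row mixing string, int, float, and missing values.
example = decode_row(
    {"sexe": "1", "order_naixement": 4, "MonoBiling": "NA", "escola_mare": 3.0}
)
print(example)
```

Unknown codes are mapped to None rather than raising, so rows with missing demographics can still be inspected.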
On Tue, Mar 22, 2022 at 11:38 AM Michael Frank wrote:
From Isabel Serrat
Here I attach the SPSS file of the Catalan MCB-CDI-I with the data for the variables sex, age in months, birth order, number of siblings, mother's educational level, and bilingualism.
Note about one participant in the template: We detected two participants with the same number (id 581). We do not know what could have happened, but we only have one questionnaire that corresponds to that number. If we do not know what happened, we will have to delete the information about that participant. I will let you know which one it is.
Plantilla variables CDI-I Wordbank maig 22.sav.zip https://github.com/langcog/wordbank/files/8327046/Plantilla.variables.CDI-I.Wordbank.maig.22.sav.zip
@vmarchman maybe you can convert from SAV and reupload?
This is the data as a csv file
Actually, I don't think this can be the data because it is only the child info rather than the administration responses
oops - we need to get the scans rekeyed! https://drive.google.com/drive/u/1/folders/1l3ZBAAu1UxeWHvg5kuMuV_903VCLXpy-
And here's the WS data as well.
Metadata in SPSS: Plantilla variables CDI-II Wordbank 3_23.sav.zip
Here are the scans: https://drive.google.com/drive/folders/184w5qyLRcQh7XNtilzNPTh_IQ2niTDpX?usp=share_link
@rbzsparks is going to work on flatworld rekeying
WG: [Catalan_WG].csv, CatalanWG_Serrat_data.csv, CatalanWG_Serrat_fields.csv, CatalanWG_Serrat_values.csv
The demographics for child 581 were taken from the first entry in the metadata (there are two entries, as noted above).
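The "first entry wins" rule used for child 581 can be sketched as follows. This is a minimal illustration, not the script actually used for the rekeyed files: the column name `id`, the helper `keep_first_per_id`, and the sample rows are all assumptions made up for the example.

```python
import csv
import io


def keep_first_per_id(rows, id_column="id"):
    """Return (kept_rows, duplicated_ids), keeping the first row seen per id."""
    seen = set()
    kept = []
    duplicated = []
    for row in rows:
        key = row[id_column]
        if key in seen:
            # Later entries for an already-seen id are dropped but reported.
            if key not in duplicated:
                duplicated.append(key)
            continue
        seen.add(key)
        kept.append(row)
    return kept, duplicated


# Hypothetical metadata excerpt; the values are invented for illustration.
sample = io.StringIO(
    "id,sexe,edat\n"
    "580,1,12\n"
    "581,2,14\n"
    "581,1,15\n"
    "582,2,13\n"
)
kept, duplicated = keep_first_per_id(csv.DictReader(sample))
print([row["id"] for row in kept], duplicated)
```

Returning the duplicated ids alongside the kept rows makes it easy to note them in the dataset notes or to follow up with the contributor.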
WS pending rekeying
@mcfrank Not sure how much we want to dive into this, but in the WS, the file labelled 545-558_bo.pdf appears to have child IDs 545, 557, and then 507–517 (which are duplicated from the file 500-517.pdf). Maybe send an email to the contributor to ask if they have the original scan? (If not, I'll discard the duplicates.)
[Catalan_WS].csv, CatalanWS_Serrat_data.csv, CatalanWS_Serrat_fields.csv, CatalanWS_Serrat_values.csv, Catalan_notes.md
Missing scans ignored; remaining WS data as above
@alvinwmtan What are contributor and citation for this dataset?
Contributor: Elisabet Serrat Sellabona, Universitat de Girona
@mcfrank do you know what the citation is for this?
Guessing it's: https://www.torrossa.com/en/resources/an/5155491
Mike
@alvinwmtan Line 48 has an age of NA. We cannot import this record without a valid age. What would you like me to do?
@HenryMehta Sorry for not catching this, let's exclude all the participants with age NA (there are a few of them, not just this one)
@alvinwmtan deployed to dev for testing
@HenryMehta could you also deploy the WG? Thanks!
@alvinwmtan sorry, missed it. Deployed to dev now
@HenryMehta WG looks good. For WS, I realised that we should be using the "edat" column, not "age"; this should have far fewer NAs.
@alvinwmtan It is already using edat
@HenryMehta hmm, in that case there should be 859 administrations, not 605 (which it is currently)
@alvinwmtan There are 605 rows in the data file
@HenryMehta this file has 866; perhaps you manually filtered out those with age NA and not edat NA?
@alvinwmtan Those without an age cannot be loaded so I deleted from dataset
@HenryMehta yes, those without edat should be deleted (not those without the column labelled age, which is not in fact the data_age column for this dataset)
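To make the distinction concrete, here is a small sketch of excluding rows on the edat column only. The column names `edat` and `age` come from this thread; the helper `split_by_age_column`, the set of missing-value markers, and the sample rows are assumptions for illustration and are not the actual import code.

```python
import csv
import io

# Strings treated as missing; an assumption about how NAs appear in the CSV.
MISSING = {"", "NA", "NaN"}


def split_by_age_column(rows, age_column="edat"):
    """Split rows into (importable, excluded) based on age_column only."""
    importable, excluded = [], []
    for row in rows:
        value = (row.get(age_column) or "").strip()
        if value in MISSING:
            excluded.append(row)
        else:
            importable.append(row)
    return importable, excluded


# Hypothetical rows: "age" is often NA even when "edat" is present.
sample = io.StringIO(
    "id,age,edat\n"
    "1,NA,18\n"
    "2,20,20\n"
    "3,NA,NA\n"
    "4,,24\n"
)
rows = list(csv.DictReader(sample))

importable, excluded = split_by_age_column(rows)            # filter on edat
by_age, _ = split_by_age_column(rows, age_column="age")     # wrong column
print(len(importable), len(excluded), len(by_age))
```

In this toy data, filtering on edat keeps three of four rows, while filtering on age keeps only one, which is the same kind of gap as the row counts discussed above.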
@alvinwmtan deployed
@HenryMehta looks good now, thanks!