langcog / wordbank

open repository of children's vocabulary data
http://wordbank.stanford.edu
GNU General Public License v2.0

Floccia bilingual data #251

Closed by mcfrank 2 years ago

mcfrank commented 2 years ago

Datafile contains wordbank data.xlsx

Contains data on bilinguals for the 100-word Oxford CDI (production/comprehension) plus 30 words across various languages.

probably needs some codebook support on momed and other fields (or we can try to back this out of the monograph)

mcfrank commented 2 years ago

Our decision is to import JUST the Oxford CDI words (not the 30 extra words) and to mark the language background appropriately for these kids. @kachergis might just use the 30 extras for Swadesh validation.
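For illustration, the import decision above could be sketched roughly as follows. This is only a minimal sketch and not the actual import code: the row layout, the field names (`child_id`, `item`, `source`, `produces`, `language_background`), and the labels `oxford_cdi` and `extra_30` are all hypothetical, since the real column names in the datafile are not given in this thread.

```python
from typing import Dict, List

# Hypothetical label for rows belonging to the 100 Oxford CDI words.
OXFORD_SOURCE = "oxford_cdi"


def select_oxford_rows(rows: List[dict], background: Dict[int, str]) -> List[dict]:
    """Keep only Oxford CDI rows and attach each child's language background."""
    kept = []
    for row in rows:
        if row["source"] != OXFORD_SOURCE:
            continue  # skip the 30 extra words
        tagged = dict(row)  # copy so the input rows are left unchanged
        tagged["language_background"] = background[row["child_id"]]
        kept.append(tagged)
    return kept


# Made-up example rows; none of these values come from the real data.
rows = [
    {"child_id": 1, "item": "dog", "source": "oxford_cdi", "produces": True},
    {"child_id": 1, "item": "extra_a", "source": "extra_30", "produces": False},
    {"child_id": 2, "item": "dog", "source": "oxford_cdi", "produces": False},
]
background = {1: "bilingual", 2: "bilingual"}  # hypothetical labels

oxford_rows = select_oxford_rows(rows, background)
print(len(oxford_rows))
```

The extra-word rows are dropped rather than flagged, which matches the decision to leave them out of the import entirely; they would still be available in the original datafile for other uses.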

alvinwmtan commented 2 years ago

Policy for short forms:

alvinwmtan commented 2 years ago

@HenryMehta cleaned files are here:

[English_British_OxfordShort].csv
EnglishBritishOxfordShort_Floccia_data.csv
EnglishBritishOxfordShort_Floccia_fields.csv
EnglishBritishOxfordShort_Floccia_values.csv

Note that the data file only contains children who have English data.
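As a rough aid for checking that a cleaned dataset is complete, the three dataset-specific file names above appear to follow an `<instrument>_<source>_<kind>.csv` pattern. The sketch below assumes that pattern, which is inferred only from the names in this comment and is not a documented convention; `expected_files` is a hypothetical helper.

```python
from typing import List

# The three kinds of file listed for this dataset in the comment above.
KINDS = ("data", "fields", "values")


def expected_files(instrument: str, source: str) -> List[str]:
    """Build the file names expected for one instrument and data source."""
    return [f"{instrument}_{source}_{kind}.csv" for kind in KINDS]


names = expected_files("EnglishBritishOxfordShort", "Floccia")
print(names)
```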

HenryMehta commented 2 years ago

@alvinwmtan is the form_type for OxfordShort WS or WG as defined in #258?

alvinwmtan commented 2 years ago

@HenryMehta it should be WG—thanks for flagging

HenryMehta commented 2 years ago

@alvinwmtan Available to test

alvinwmtan commented 2 years ago

@HenryMehta looks good from R

alvinwmtan commented 2 years ago

citation is: Floccia, C., Sambrook, T. D., Luche, C. D., Kwok, R., Goslin, J., White, L., Cattani, A., Sullivan, E., Abbot-Smith, K., Krott, A., Mills, D., Rowland, C., Gervain, J., Plunkett, K., Hoff, E., & Bauer, P. J. (2018). Vocabulary of 2-year-olds learning English and an additional language: Norms and effects of linguistic distance. Monographs of the Society for Research in Child Development, 83(1), 1–235. https://doi.org/10.1111/mono.12352

contributor: Caroline Floccia, University of Plymouth