langcog / childes-db

A SQL interface for the CHILDES child language corpora

No Zhou3 in utterances #45

Closed jklafka closed 3 years ago

jklafka commented 5 years ago

There's a Zhou3 Mandarin corpus in CHILDES (https://childes.talkbank.org/access/Chinese/Mandarin/Zhou3.html) and get_corpora (from childes-r) shows that there's a Zhou3 corpus in childes-db, but there aren't any Zhou3 utterances in the utterance table. Thanks!
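A minimal sketch of the kind of check that surfaces this symptom: a corpus that is listed but has no rows in the utterance table. The schema here is an assumption for illustration only. The table names `corpus` and `utterance`, the columns `id`, `name`, `corpus_id` and `gloss`, and the helper `corpora_without_utterances` are hypothetical and simplified, and the rows are toy data in an in-memory SQLite database, not real childes-db content.

```python
import sqlite3

# Toy in-memory database. Table and column names are assumptions made for
# this sketch; they are not taken from the actual childes-db schema.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE corpus (id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE utterance (id INTEGER PRIMARY KEY, corpus_id INTEGER, gloss TEXT);
INSERT INTO corpus (id, name) VALUES (1, 'Providence'), (2, 'Zhou3');
-- Only corpus 1 gets an utterance row; corpus 2 is listed but empty.
INSERT INTO utterance (corpus_id, gloss) VALUES (1, 'placeholder utterance');
""")


def corpora_without_utterances(conn):
    """Return names of corpora that have no rows in the utterance table."""
    rows = conn.execute("""
        SELECT c.name
        FROM corpus AS c
        LEFT JOIN utterance AS u ON u.corpus_id = c.id
        WHERE u.id IS NULL      -- anti-join: keep corpora with no match
        ORDER BY c.name
    """).fetchall()
    return [name for (name,) in rows]


print(corpora_without_utterances(conn))
```

Against a real database, the same left-join-and-filter pattern would need the actual table and column names substituted in.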

smeylan commented 5 years ago

Confirmed: this is also a problem in the draft 2019.1 db.

alecristia commented 5 years ago

Actually, at least the following are missing: BolKuiken, Chroma, Jamaican, Berman, CCLAS, Yamaguchi, Paris, Goadrose, Tsay, Zhou3. I wondered whether this affected databases that are actually hosted in phonbank, but this doesn't explain everything:

In phonbank & absent from table: Goadrose, Paris, Yamaguchi, Tsay
Not in phonbank & absent from table: Zhou3

Plus, some corpora that do appear are actually hosted in phonbank:
In phonbank & present in table: Providence

smeylan commented 3 years ago

These were all fixed in the 2020.1 release, which added Providence. I believe Zhou3 was missing simply because it wasn't yet in the dataset when we downloaded the data for 2019.1.