I am trying to get the utterances of the target child Ross in the MacWhinney corpus (collection Eng-NA) for the age range from 3;00 to 4;11, corresponding to the files 30001a (age 3;00,01) to 41125d (age 4;11,25). In the CHILDES database (browsable files for MacWhinney: https://sla.talkbank.org/TBB/childes/Eng-NA/MacWhinney), the files are named after Ross's age at the time of recording, so I can compare what I should get from get_utterances with what I actually get.
Here's what I am doing:
utt_ross <- get_utterances(corpus = "MacWhinney", role = c("Target_child"), target_child = "Ross", age = c(36, 59))
This yields a tibble with 3603 obs. of 27 variables.
To check whether the call returned everything it should, I looked for utterances from the first and last files in the range I wanted.
By this check, at least the following utterances are missing from the results:
utterances from 30001a (age 3;00,01) - even though utterances from 30001b (also age 3;00,01) are found
utterances from 040404a (age 4;04,04) to 041125d (age 4;11,25)
I am not 100% sure that I am checking the results correctly (there are probably better methods), but even allowing for the fact that age is converted into days and back into months in the get_contents function, something seems to be going wrong when about 7 months of utterances are missing from my results.
The maximum age in the results (as computed by the function) is 52.13249, rather than the roughly 59.8 I would expect for age 4;11,25.
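For reference, here is a rough sketch of the arithmetic behind those two numbers. The conversion constants are my assumptions (a 365.25-day year and a 30.4375-day month for years;months,days to days, and an average month of 365.2425 / 12 days for days back to months); I have not confirmed that the package uses exactly these, and the helper names are hypothetical.

```r
# Hypothetical helpers; the constants below are assumptions, not taken
# from the childesr source.
age_to_days <- function(years, months, days) {
  years * 365.25 + months * 30.4375 + days
}

days_to_months <- function(days) {
  days / (365.2425 / 12)
}

age_to_months <- function(years, months, days) {
  days_to_months(age_to_days(years, months, days))
}

# Last file I expected to get, age 4;11,25: approximately 59.8 months
expected_max <- age_to_months(4, 11, 25)

# First file that appears to be missing, age 4;04,04: approximately 52.13 months
observed_cutoff <- age_to_months(4, 4, 4)

print(c(expected_max = expected_max, observed_cutoff = observed_cutoff))
```

Under these assumptions, age 4;04,04 comes out at about 52.13 months, which is close to the maximum age in my results, so the cutoff seems to fall right around the 040404 files.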
Sorry in advance if this should be an error on my side!