Closed by lucasgautheron 2 years ago
If you mean annotations where a judgment is made as to whether people are engaged in an actual conversation (https://github.com/marisacasillas/chattr-basic#an-important-caveat), there isn't any high-level annotation of that kind for those datasets. I meant low-level rules that are applied to human annotations of who talks when, and also to automated annotations of who talks when.
One may ask: why would correlations vary as a function of the ITI in that case? I suspect it is because the automated who-talks-when routine has some non-obvious parameters that lead it to group or split vocalizations from the same speaker class, and/or to miss some vocalizations altogether. Optimizing a correlation coefficient (and/or an error rate) that compares human and automated annotations of 1- or 2-minute samples could help us answer: "which parameter value should I use so that I can trust my automated analyses about as much as I would trust a human annotator who doesn't speak the language or understand what people are talking about?"
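A minimal sketch of that optimization, assuming per-sample turn counts from human annotation are already available and that the automated pipeline can be re-run at each candidate threshold. The `pearson` and `best_threshold` helpers below are hypothetical illustrations, not part of ChildProject or the conversations package:

```python
import statistics


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    mx, my = statistics.mean(xs), statistics.mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sum((x - mx) ** 2 for x in xs) ** 0.5
    sy = sum((y - my) ** 2 for y in ys) ** 0.5
    return cov / (sx * sy)


def best_threshold(thresholds, human_counts, auto_counts_at):
    """Pick the threshold whose automated per-sample turn counts
    correlate best with the human per-sample turn counts.

    auto_counts_at(t) returns the automated turn counts for each
    sample when the inter-turn threshold is set to t (hypothetical).
    """
    scored = [(pearson(human_counts, auto_counts_at(t)), t) for t in thresholds]
    return max(scored)  # (best correlation, best threshold)


# Toy illustration with made-up counts for three candidate thresholds.
human = [3, 1, 4, 2]
auto = {0.2: [1, 1, 2, 1], 1.0: [3, 1, 4, 2], 5.0: [5, 5, 6, 5]}
score, t = best_threshold([0.2, 1.0, 5.0], human, lambda t: auto[t])
print(score, t)  # → 1.0 1.0
```

The same loop could instead minimize an error rate; the point is only that the criterion is computed sample-by-sample against human annotation, so the winning threshold is the one that best matches a naive human listener.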
The conversations package aims to provide a way of evaluating conversations and extracting metrics from them (turn counts, durations, etc.). ChildProject will use this package for everything related to conversations. https://github.com/LAAC-LSCP/conversations
Is your feature request related to a problem? Please describe.
Evaluating conversational turn counts from speech labels requires a time threshold to discard speaker switches that are too far apart to belong to the same conversation. This threshold is expected to be on the order of 1 s, but the impact of the choice should be evaluated, and a better value retained if it leads to a noticeable improvement.
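A minimal sketch of how such a threshold enters the turn count, assuming speech labels come as (onset, offset, speaker) tuples sorted by onset; `count_turn_transitions` is a hypothetical helper, not the actual package API:

```python
def count_turn_transitions(segments, max_gap):
    """Count speaker switches whose silence gap is at most max_gap seconds.

    segments: list of (onset, offset, speaker) tuples, sorted by onset.
    max_gap:  the inter-turn threshold under evaluation; switches with a
              larger gap are assumed not to belong to the same conversation.
    """
    turns = 0
    for (on1, off1, spk1), (on2, off2, spk2) in zip(segments, segments[1:]):
        gap = on2 - off1
        if spk1 != spk2 and gap <= max_gap:
            turns += 1
    return turns


segments = [
    (0.0, 1.2, "CHI"),
    (1.5, 2.8, "FEM"),  # gap 0.3 s -> counted as a turn
    (6.0, 7.0, "CHI"),  # gap 3.2 s -> too far apart, discarded
    (7.4, 8.0, "FEM"),  # gap 0.4 s -> counted as a turn
]
print(count_turn_transitions(segments, max_gap=1.0))  # → 2
```

With `max_gap=5.0` the same recording yields 3 turns, which is exactly why the choice of threshold needs to be evaluated rather than fixed by convention.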
Describe the solution you'd like
This will help solve #169