Note how in the table below, the "Time" column contains the text of the entire conversational turn, while "Speaker" contains the time and "Speech" contains the speaker.
| Time | Speaker | Speech |
| --- | --- | --- |
| [00:33:33.25] Rebecca: and then my thinking at least, is you should be able to, um, say that "star p of i" /mmhmm/ equals, uh, the title, and then you just do i++, so then it’ll {{move to the next one}} !{makes looping gesture with left hand}[1] /OK/ | [00:33:33.25] | Rebecca |
| [00:34:00.12] Rebecca: and you just keep {{saving each of the pointers}} !{left hand makes horizontal chops in the air, like rungs down a ladder}[2] | [00:34:00.12] | Rebecca |
It may not be immediately obvious, but this is a result of how we're doing regex matching. We're actually getting an off-by-one error because of how the regex.exec() API works. From [the API documentation][1]:
> If the match succeeds, the exec method returns an array and updates properties of the regular expression object. _The returned array has the matched text as the first item_, and then one item for each capturing parenthesis that matched containing the text that was captured.
So, our weird table actually results from an off-by-one error in [this code][2]:
The fix is to remember that rawTurnComponents[0] always contains the full matched turn, and that the captured components start at rawTurnComponents[1].
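To make the indexing concrete, here is a minimal sketch of reading the capture groups starting at index 1. The pattern `turnPattern` and the helper `parseTurn` are hypothetical names invented for this illustration, and the regex is only a guess at the turn format shown in the table; the project's actual code may differ.

```javascript
// Hypothetical pattern (an assumption, not the project's real regex) for a
// turn such as "[00:34:00.12] Rebecca: and you just keep ...".
// Group 1: bracketed timestamp, group 2: speaker, group 3: speech.
const turnPattern = /^(\[\d{2}:\d{2}:\d{2}\.\d{2}\]) ([^:]+): (.*)$/;

// Hypothetical helper: returns { time, speaker, speech }, or null if the
// line does not look like a turn.
function parseTurn(rawTurn) {
  const rawTurnComponents = turnPattern.exec(rawTurn);
  if (rawTurnComponents === null) {
    return null;
  }
  // rawTurnComponents[0] is the entire matched turn, so it is skipped here.
  // The captured pieces begin at index 1.
  return {
    time: rawTurnComponents[1],
    speaker: rawTurnComponents[2],
    speech: rawTurnComponents[3],
  };
}

const turn = parseTurn(
  "[00:34:00.12] Rebecca: and you just keep saving each of the pointers"
);
console.log(turn.time);    // "[00:34:00.12]"
console.log(turn.speaker); // "Rebecca"
console.log(turn.speech);  // "and you just keep saving each of the pointers"
```

Reading indices 0, 1, and 2 instead would reproduce the shifted table above: the whole turn under "Time", the timestamp under "Speaker", and the speaker under "Speech".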