octavpo opened 6 years ago
From another of Jack's emails:
Previous attempts to measure reading performance based on tapping for help ran into two difficulties:
After some digging I found that the previous attempt was failing because it was trying to get context information from a place where it wasn't available. We can fix that. I didn't see anything in the previous attempt that went further than trying to display the current word and its correct/incorrect status. If we want more, I need more details about when and what we want to log. The log record structure is the following:
Name | Type | Description |
---|---|---|
timestamp | Long | timestamp |
userId | UUID | one per student; chosen by FaceLogin each session |
sessionID | UUID | one per session; generated by FaceLogin each session |
gameId | UUID | one per new game started; generated by RoboTutor |
language | String | tutor language |
tutorName | String | name of the tutor, e.g. "add_subtract" |
levelName | String | name of the level, e.g. "asm_26" |
taskName | String | name of the task as described in the data source, e.g. "count by ten" |
problemNumber | Int | incrementing number within a game, e.g. 1, 2, 3, 4, 5 |
problemName | String | generated based on rules for each tutor type, e.g. 2+3=5 |
totalSubSteps | Int | total number of steps in a problem |
substepNumber | Int | the step within a multi-step problem, e.g. 1, 2, 3 |
substepProblem | Int | |
attemptNumber | Int | attempt count |
expectedAnswer | String | the expected answer from the student |
userResponse | String | the actual answer given by the student |
correctness | String | CORRECT or INCORRECT |
distractors | String | |
scaffolding | String | |
promptType | String | |
feedbackType | String | |
A few fields need a description from Kevin.
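For reference, here is a minimal Java sketch of a record with this structure. The field names come from the table above; the class itself is illustrative, not RoboTutor's actual logging code:

```java
import java.util.UUID;

// Illustrative container for one performance-log record, mirroring the
// table above. Field names follow the table; this is not RoboTutor's
// actual class.
public class PerformanceLogRecord {
    public long   timestamp;       // event time
    public UUID   userId;          // one per student; chosen by FaceLogin
    public UUID   sessionID;       // one per session; generated by FaceLogin
    public UUID   gameId;          // one per game; generated by RoboTutor
    public String language;        // tutor language
    public String tutorName;       // e.g. "add_subtract"
    public String levelName;       // e.g. "asm_26"
    public String taskName;        // e.g. "count by ten"
    public int    problemNumber;   // 1, 2, 3, ... within a game
    public String problemName;     // e.g. "2+3=5"
    public int    totalSubSteps;   // total number of steps in a problem
    public int    substepNumber;   // step within a multi-step problem
    public int    substepProblem;  // (description pending from Kevin)
    public int    attemptNumber;   // attempt count
    public String expectedAnswer;  // expected answer from the student
    public String userResponse;    // actual answer given by the student
    public String correctness;     // "CORRECT" or "INCORRECT"
    public String distractors;     // (description pending from Kevin)
    public String scaffolding;     // (description pending from Kevin)
    public String promptType;      // (description pending from Kevin)
    public String feedbackType;    // (description pending from Kevin)
}
```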
At this point I have software that sends a performance log message with the structure above after each word event, both when listening and when speaking (only because that's what the original code was trying to do; I'm not sure we need the speaking events). So it has the expected word, the recognized word, the attempt count, the correctness status, and all the context information (it just uses "WORD" for the level name). How do we go on?
Excellent! The (Swahili) Listener and the ASR itself (PocketSphinx) operate in two different spaces. "Text space" is the sequence of text words in the sentence, represented as either:

- a string of sentence text including punctuation, or
- a sequence of text words minus punctuation immediately before or after the word, but including word-internal punctuation such as -, ', and . in an acronym -- e.g. U.S.A. becomes U.S.A because the final period is post-punctuation.
"Speech space" is a sequence of word tokens output by the ASR, which may have repetitions, omissions, noise symbols, and substitutions of the form START_word, and may also have parenthesized identifiers to distinguish alternative pronunciations of the same word. Alternative pronunciations are much less of an issue for Swahili than for English because Swahili is phonetic, but a START_word may have a separate alternative pronunciation for each truncation of the word.
For userResponse, can you log the actual ASR output, a sequence of 0 or more speech space words?
For expectedAnswer, please log the unpunctuated word passed to the (Swahili) Listener, because it is the lexical knowledge component we eventually want to trace using knowledge tracing, along with KCs at the syllable and phoneme levels. But the KCs can be defined after the fact.
The correctness field will tell whether the text word was accepted as read correctly, which is not simply whether userResponse = expectedAnswer (for instance, the raw ASR output may contain extra tokens such as START_word alongside an accepted word).
Thanks!
I have a new version that has some improvements over the first one, although it might not address the note above. What it does differently from the first version is detect whether a match happened because of a "virtual" word inserted as help by the tutor versus a "real" word returned by the listener. For a word generated after a touch it puts TOUCH_GENERATED in userResponse, while for a word generated after two mistakes it puts AUTO_GENERATED in userResponse. In the latter case this happens after it logs the second wrong attempt with the word that was actually recognized, so that information isn't lost. That timing is in fact the heuristic it uses to decide between the two cases, because the program runs the same code in both.
expectedAnswer is the string that's compared against the listener result, so it's unpunctuated. With the current userResponse, correctness is indeed userResponse = expectedAnswer, except for the two tutor-generated cases above.
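For what it's worth, here is a rough sketch of that labeling heuristic as described above. The class and parameter names are made up for illustration; this is not the actual RoboTutor code:

```java
// Illustrative sketch of the userResponse labeling described above.
// Names are assumptions, not RoboTutor's actual code.
public final class CreditLabeler {
    public static String labelUserResponse(String heardWord,
                                           int priorWrongAttempts) {
        if (heardWord != null) {
            return heardWord;            // "real" word returned by the listener
        }
        // "Virtual" word inserted by the tutor. Per the heuristic above,
        // two prior wrong attempts imply the tutor auto-advanced; otherwise
        // the student must have tapped the word for help.
        return priorWrongAttempts >= 2 ? "AUTO_GENERATED" : "TOUCH_GENERATED";
    }
}
```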
Regarding the idea above about the speech space, I did some digging and here's how things stand. Before sending a sequence of words to the tutor, the listener performs a lowest-cost alignment between its word sequence and the target sequence, during which most of those extras are eliminated. For instance, a "START_word" is only kept if there's no "word" in its sequence; otherwise it's eliminated. Only the cleaned-up list is sent to the tutor (where the performance tracing takes place).
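Here is a minimal sketch of just that START_word rule (illustrative only; the listener's real cleanup is a lowest-cost alignment, which this does not reproduce):

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative sketch of one cleanup rule described above: a START_word
// token survives only if the bare "word" is absent from the hypothesis.
public final class HypothesisCleanup {
    public static List<String> dropRedundantStarts(List<String> tokens) {
        Set<String> heard = new HashSet<>(tokens);      // all tokens as heard
        List<String> kept = new ArrayList<>();
        for (String tok : tokens) {
            if (tok.startsWith("START_") && heard.contains(tok.substring(6))) {
                continue;   // eliminated: the full word was also recognized
            }
            kept.add(tok);
        }
        return kept;
    }
}
```

For example, dropRedundantStarts on ["START_mama", "mama"] would return ["mama"], while a lone ["START_mama"] would be kept.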
So if we want to record that original sequence, we have to pass it to the tutor too, which is not hard. But if we do that, I'd suggest we put it in a different field rather than in userResponse, so that userResponse still holds the word the comparison was done against, which might not be easy to guess otherwise. We could put the sequence in "distractors" for instance, even though that's not quite what the field means, or let me know if you have a better idea.
Now I also wonder if those TOUCH_GENERATED and AUTO_GENERATED labels should go in feedbackType rather than userResponse. It would be helpful if Kevin documented those fields.
PS Is it possible to add Evelyn to GitHub so she can read these comments?
Octav, I merged your changes into development. I'm not sure whether you were finished with your _story_readinglogging branch because you did not open a Pull Request, but I merged it anyway because we are pushing code to Mugeta tonight. If you need to make more changes, please start a new branch off of development so your code is current.
Octav - A separate ASR_output field seems clearest.
From Jack's email:
Can you figure out how to get RoboTutor to log: