RoboTutorLLC / RoboTutor_2019

Main code for RoboTutor. Uploaded 11/20/2018 to XPRIZE from RoboTutorLLC/RoboTutor.

Extend BubblePop to tap on audio responses #47

Open JackMostow opened 7 years ago

JackMostow commented 7 years ago

From @JackMostow on February 19, 2017 20:10

I.e., each bubble speaks its prompt while doing something visual to associate itself with the audio, e.g. expanding, vibrating, and/or changing saturation or intensity. This will enable multiple-choice activities that map print to sound.
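
For illustration, here is a minimal sketch of that appear-time behavior, assuming a standard Android View for the bubble and a MediaPlayer for the prompt audio. The class and method names are invented for this sketch, not RoboTutor's actual ones.

```java
import android.media.MediaPlayer;
import android.view.View;

// Illustrative sketch only -- not RoboTutor's actual classes or method names.
class AudioTargetPresenter {

    // When an audio target appears, play its prompt and briefly pulse the bubble
    // so the learner can associate the sound with that particular bubble.
    void presentAudioTarget(final View bubbleView, final MediaPlayer promptAudio) {
        bubbleView.animate()
                .scaleX(1.2f).scaleY(1.2f)
                .setDuration(150)
                .withEndAction(() -> bubbleView.animate()
                        .scaleX(1.0f).scaleY(1.0f)
                        .setDuration(150));
        promptAudio.start();
    }
}
```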

Copied from original issue: synaptek/RoboTutor#290

JackMostow commented 6 years ago

Use VSS.

octavpo commented 6 years ago

What's VSS?

JackMostow commented 6 years ago

Sorry for the TLA (Three Letter Acronym ;-). VSS = Visual Speech Synthesis -- which turned out not to work well enough to use. The only visual speech that's feasible to use is the talking-mouth video clips of syllables. But the more general solution, as in Derek's prototype, is simply to play the audio for each target when it appears.

octavpo commented 6 years ago

From the description above it's still not very clear what the task is about. When are the bubbles supposed to speak their prompts? When they show up on the screen? Just once or keep repeating? Or when they're tapped? There is currently sound and visual feedback on tap, so is this just a way to customize that?

It would be good if somebody could include previous conversations/decisions on this topic here. Thanks.

JackMostow commented 6 years ago

Good questions. There's doubtless prior discussion somewhere but it's easier to summarize the upshot: Bubbles should speak their prompts once when they appear, and again when tapped. Their primary purpose is to allow tasks that map a stimulus to a spoken response -- i.e. provide a multiple choice version of an oral response, but more reliably assessable than speech recognition. Does it appear feasible to include in code drop 1?

octavpo commented 6 years ago

Seems doubtful. I still need to finish the other task, which leaves only a couple of days before the code freeze. And if I understand correctly, this is a task Derek estimated would take him a couple of weeks to finish, and he already knows what needs to be done. And to top it off, I have to show up for jury duty tomorrow. :)

octavpo commented 6 years ago

There are still lots of details to figure out. A few more questions come to mind -- maybe you've already figured them out:

Tapping on a correct bubble pops it. For an audio target, play the audio first, then pop the bubble with its usual sound effect and animation.

Tapping on an incorrect bubble spotlights the tapped bubble and gives corrective feedback. For a wrong audio target, first play the sound effect for wrong answers: {"type": "AUDIO", "command": "PLAY", "soundsource": "wrong.mp3", "soundpackage":"tutor_effect", "volume": 0.05, "mode":"event", "features": ""},

Then play the audio, e.g. "This is DOG."

Then spotlight the visual stimulus (if any) and repeat the stimulus prompt, e.g. "Tap on the word that starts like CAP."

However, this is an unfortunate example because the expression activity relies on Java code to parse the expression (e.g. "2+2") into the three variables. WRITE's use of the data source is a much better example because it simply lets the data source set them.
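
To make the intended tap flow concrete, here is a hedged sketch of the handler logic described above. The class, the Bubble interface, and the helper methods are illustrative placeholders, not RoboTutor's actual component API.

```java
// Hypothetical sketch of the tap flow above; names are illustrative placeholders,
// not RoboTutor's actual component API.
class BubbleTapHandler {

    interface Bubble {
        boolean isCorrect();
        String targetAudio();              // e.g. "dog.mp3"
    }

    void onBubbleTapped(Bubble bubble) {
        if (bubble.isCorrect()) {
            playAudio(bubble.targetAudio());   // say the target, e.g. "DOG"
            popBubble(bubble);                 // then the usual pop sound effect + animation
        } else {
            playEffect("wrong.mp3");           // the wrong-answer sound effect quoted above
            playAudio(bubble.targetAudio());   // e.g. "This is DOG."
            spotlightStimulus();               // highlight the visual stimulus, if any
            repeatStimulusPrompt();            // e.g. "Tap on the word that starts like CAP."
        }
    }

    // Placeholders standing in for RoboTutor's audio and animation plumbing.
    void playAudio(String file)   { /* ... */ }
    void playEffect(String file)  { /* ... */ }
    void popBubble(Bubble b)      { /* ... */ }
    void spotlightStimulus()      { /* ... */ }
    void repeatStimulusPrompt()   { /* ... */ }
}
```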

JackMostow commented 6 years ago

Octav - If audio targets work, please (if you haven't already done so):

  1. Incorporate them into RoboTutor.

  2. You and I agreed that data sources should specify "show" and/or "say" for targets just as they do for stimuli. Did you implement them that way? If so, where is the necessary syntax documented?

  3. Ask Judith who should audio-enable the responses, either by modifying the phonemic activity data source generator and regenerating the data sources, or by modifying them directly, e.g.
    RoboTutor/app/src/main/assets/tutors/bubble_pop/assets/data/sw/bpop.wrd_beg.wrd.ha.show.1.json.

  4. Point us to a working .apk demo.

Thanks. - Jack

octavpo commented 6 years ago

This should finally be ready now. The changes are on the audio_targets branch. Data sources indicate whether to show and/or say audio targets by adding the properties "target_show" and "target_say" to the data source files, analogous to "question_show" and "question_say".
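
For concreteness, a hedged sketch of how a tutor component might read these properties, assuming they are boolean flags like the existing question_show/question_say. Only the property names come from this thread; the class name, typing, and default values are assumptions made for illustration.

```java
import org.json.JSONObject;

// Hypothetical reader for the new data-source properties. Only the property names
// ("question_show", "question_say", "target_show", "target_say") come from the
// discussion above; the class name, boolean typing, and defaults are assumptions.
class BpopDataSourceEntry {

    // Example entry shape (values invented):
    //   { "question_show": true, "question_say": true,
    //     "target_show": false,  "target_say": true, ... }
    final boolean questionShow, questionSay, targetShow, targetSay;

    BpopDataSourceEntry(JSONObject entry) {
        questionShow = entry.optBoolean("question_show", true);
        questionSay  = entry.optBoolean("question_say",  true);
        targetShow   = entry.optBoolean("target_show",   true);
        targetSay    = entry.optBoolean("target_say",    false);
    }
}
```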

JackMostow commented 6 years ago

Are the properties what to say and show, or whether to do it?

octavpo commented 6 years ago

These properties decide whether to say and/or show the targets, the same as for the stimuli. What to say is determined the same way as when a bubble is tapped.

octavpo commented 6 years ago

> Please explain what the hesitation part is. "Implemented hesitation" is far from self-explanatory.

I thought it better to keep discussions here. The hesitation is what we discussed at some point: the prompts are repeated periodically if kids don't choose an answer. The period is currently set to 6 s, just because that's how it was set in another tutor that implemented it. It can easily be changed, of course.
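
A minimal sketch of such a hesitation timer, assuming a standard Android Handler. The class name and the speakPrompt placeholder are invented; only the 6-second period comes from the discussion above.

```java
import android.os.Handler;
import android.os.Looper;

// Illustrative sketch of the hesitation behavior: repeat the prompt every 6 s until
// the child picks an answer. Names are invented; only the 6 s period comes from above.
class HesitationPrompter {

    private static final long HESITATION_MS = 6000;   // 6 s, per the other tutor's setting

    private final Handler handler = new Handler(Looper.getMainLooper());

    private final Runnable repeatPrompt = new Runnable() {
        @Override public void run() {
            speakPrompt();                              // replay the stimulus prompt
            handler.postDelayed(this, HESITATION_MS);   // keep repeating until cancelled
        }
    };

    void start()  { handler.postDelayed(repeatPrompt, HESITATION_MS); }

    void cancel() { handler.removeCallbacks(repeatPrompt); }   // call when a bubble is tapped

    void speakPrompt() { /* placeholder for RoboTutor's prompt playback */ }
}
```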

JackMostow commented 6 years ago

@judithodili - The audio targets capability enables bpop activities to map a visual and/or spoken stimulus to a spoken target (optionally with a text label as well). This gives us a multiple choice alternative to oral responses for assessment.