This issue is a place to discuss and collect resources on using audio features to adapt the robot's interaction.
One goal of the project is to use audio data from the participant reading various texts to drive how the robot adapts the interaction. The first step is to understand what features we can sense; from there we can think about what to adapt based on that information.
Sensing
Possible features
Their accuracy in speaking the words
Perhaps compare a transcription of their speech (e.g., from Amazon Transcribe; Amazon Polly is text-to-speech, so it goes the wrong direction) to the text they're reading
Doing this at the phoneme level outside a transcription service will be hard: language is complex, and finding a suitable phoneme-labeled dataset will likely be challenging
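As a rough sketch of the word-level comparison described above, we could align the transcription against the reference text and count matched words. The function name and the choice of `difflib` are illustrative, not a committed design:

```python
import difflib

def word_accuracy(reference: str, hypothesis: str) -> float:
    """Fraction of reference words matched in the transcribed hypothesis.

    A hypothetical scoring helper: aligns the two word sequences and
    counts words the participant read correctly.
    """
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    matcher = difflib.SequenceMatcher(None, ref, hyp)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / len(ref) if ref else 0.0

# Example: participant skipped "brown" and misread "jumps" as "jumped"
print(word_accuracy("the quick brown fox jumps", "the quick fox jumped"))  # → 0.6
```

A real pipeline would probably want a proper word-error-rate (insertions/deletions/substitutions) instead of this simpler matched-word fraction, but the shape of the comparison is the same.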
Their reading speed
We could compute their reading rate in words per minute by marking when the interaction starts and stops; we could also check whether they read most of the text by looking at the transcript of the recording
Their reading confidence
Measured by features such as how shaky their voice is, etc.
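One crude, hypothetical proxy for a shaky voice is how much the frame-wise loudness wobbles: a steady speaker's energy envelope varies less than an unsteady one's. This sketch assumes a raw audio array is available; real confidence estimation would need validated features (jitter, shimmer, pitch stability), not just this:

```python
import numpy as np

def energy_variability(signal: np.ndarray, sr: int, frame_ms: int = 25) -> float:
    """Coefficient of variation of frame-wise RMS energy.

    A rough, illustrative 'shakiness' score: higher means the
    loudness envelope fluctuates more across 25 ms frames.
    """
    frame_len = int(sr * frame_ms / 1000)
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    rms = np.sqrt((frames ** 2).mean(axis=1))
    return float(rms.std() / rms.mean())

# Synthetic check: a steady tone vs. the same tone with a 5 Hz amplitude wobble
sr = 16000
t = np.arange(sr) / sr
steady = np.sin(2 * np.pi * 220 * t)
wobbly = (1 + 0.5 * np.sin(2 * np.pi * 5 * t)) * np.sin(2 * np.pi * 220 * t)
print(energy_variability(steady, sr) < energy_variability(wobbly, sr))  # → True
```

If we end up using a speech library for the other features anyway, its built-in pitch/energy extractors would likely replace this hand-rolled version.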
Possible tools
Adaptation
Possible ways of adapting the interaction