I am "talking" to the realtime API via g711_alaw audio streaming (with server_vad).
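For context, the session setup is roughly along these lines. This is a minimal sketch, not my exact code: it assumes the `session.update` client event with the `input_audio_format`, `output_audio_format`, `turn_detection` and `instructions` fields. The helper name `build_session_update` and the instruction text are placeholders made up for illustration.

```python
import json


def build_session_update(instructions: str) -> str:
    """Build a session.update event as a JSON string (illustrative helper)."""
    event = {
        "type": "session.update",
        "session": {
            # G.711 A-law audio in both directions
            "input_audio_format": "g711_alaw",
            "output_audio_format": "g711_alaw",
            # Let the server detect the start and end of speech
            "turn_detection": {"type": "server_vad"},
            # Placeholder system instructions, assumed for this sketch
            "instructions": instructions,
        },
    }
    return json.dumps(event)


if __name__ == "__main__":
    # The resulting string would be sent as a text frame over the websocket.
    print(build_session_update("Antworte auf Deutsch."))
```

The `instructions` field is the only place in this sketch where I could try to influence how the assistant handles corrections.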
I noticed that the assistant is very difficult to correct: once it has made an initial assumption, it sticks to it even when I contradict it explicitly. I speak German and the assistant replies in German; I have translated the following examples into English.
Examples:
I introduce myself with: "Hi, my name is Mark, how are you?"
A: "Hi Marco, I am fine."
Me: "No, I am Mark."
A: "Ok, hi Marco."
Me: "My name is Mark."
A: "Of course, Marco."
Me: "Please correct yourself: my name is Mark"
A: "Oh, of course. Marco."
I also noticed that once the assistant starts trying to end the conversation, I can no longer steer it away from that:
A: "Feel free to call me again. Bye Bye"
Me: "No, I don't want to end the call."
A: "Sure. Bye Bye and have a great day."
Me: "No, please answer more questions."
A: "It was nice talking to you. Goodbye."
Is this a known behaviour? Can this behaviour be adjusted somehow?