Open dataoracle opened 6 years ago
For traffic I seem to be getting text:
Type your request: What's the traffic like to work?
Assistant Response: On your way to work, traffic is light, as usual. It is twenty-eight minutes by car.
Conversation Complete
For jokes, it looks like I'm only getting the punchline maybe?
Type your request: Tell me a joke
Assistant Response: The king and queen of clubs 👑 ♣
Conversation Complete
Which examples are causing you issues?
Hey @endoplasmic , thanks for your follow up.
We were intrigued that your traffic response came through properly, so we added the work location to the Google account linked to the project, and it now works as long as we phrase questions using the work tag.
Before, we were trying questions like "How is the traffic around XXX street?" The audio answer comes through fine, but the Assistant Response is empty. Can you give it a try from your end?
The other cases are jokes/riddles. For these we are getting either just the punchline (as in your sample above) or the answer to the riddle. In both cases the audio response is coming properly (full joke or riddle).
What we are trying to do is use the Google Assistant for a conversational bot that needs to work on a "text-only" channel as well as on a "voice-only" channel. Voice does not look like it will be a problem, but for the text channel these cases will not work properly.
Any ideas?
Thanks!
I'm seeing the same on my end regarding traffic. Blank text response. Sounds like a bug in the SDK.
If you wanted to you could always take the audio as it comes in and transcribe it via google speech. Start the request once you get the first bytes and you'll get the text streamed back to you.
Discussion started: https://plus.google.com/111323056508012159527/posts/7RP68D4WiVx
Ok, so it's consistent. We are doing exactly that, STT using the Google Speech API, but it adds latency and we have trouble getting proper punctuation, which currently works only for en-US and not for en-UK.
I'll report back if there is any update or if we find a work around.
Thanks!
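One way to keep the extra STT latency off the happy path is to transcribe only when the text really comes back blank. Below is a minimal sketch of that check. ReplyCollector and its method names are hypothetical and not part of the google-assistant library; wire onText and onAudio to whichever events deliver the text and audio in your setup, and pass the buffer to your own speech-to-text call.

```javascript
// Hypothetical helper (not part of the google-assistant library):
// collects one reply and tells the caller whether an STT fallback is needed.
class ReplyCollector {
  constructor() {
    this.text = '';
    this.chunks = [];
  }

  // Call with whatever text the SDK reports; blank strings are ignored.
  onText(text) {
    if (typeof text === 'string' && text.trim() !== '') {
      this.text = text.trim();
    }
  }

  // Call with each audio chunk (Buffer or Uint8Array) as it arrives.
  onAudio(chunk) {
    this.chunks.push(Buffer.from(chunk));
  }

  // True only when audio was received but no usable text came with it.
  needsTranscription() {
    return this.text === '' && this.chunks.length > 0;
  }

  // All audio received so far as one Buffer, ready to hand to an STT client.
  audio() {
    return Buffer.concat(this.chunks);
  }
}
```

Note that this only catches the fully blank case (such as the traffic question). A partial text, such as a joke that arrives as the punchline only, is not empty and would not trigger the fallback.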
I raised this as a bug on the API a couple of months ago as it affects my Google Assistant for Alexa skill. https://github.com/googlesamples/assistant-sdk-python/issues/158
@tartanguru - Thanks for the link. I've subscribed to the thread to watch any action that comes up.
I check in on this once in a while, and it does look like "tell me a joke" is fixed, but the traffic around a specific place is still busted.
@endoplasmic
Two ideas for this issue:
I am getting a lot of blank responses, even with screen: { isOn: false }, so I decided to change the isOn flag to true and then parse the HTML to see what was being sent. I used the html-to-text library to do this. The result is a bit mangled, but it is certainly more verbose than the simple text response you get when isOn is set to false.
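A rough sketch of that fallback, for anyone who wants to try it: prefer the plain text when it is there, otherwise pull text out of the screen HTML. The function names here are hypothetical, and stripTags is a naive, dependency-free stand-in for what html-to-text does far more thoroughly; it is only meant to show the shape of the idea.

```javascript
// Naive stand-in for the html-to-text library: drops script/style blocks,
// removes the remaining tags, and collapses whitespace.
function stripTags(html) {
  return html
    .replace(/<(script|style)[^>]*>[\s\S]*?<\/\1>/gi, ' ')
    .replace(/<[^>]+>/g, ' ')
    .replace(/&nbsp;/g, ' ')
    .replace(/&amp;/g, '&')
    .replace(/\s+/g, ' ')
    .trim();
}

// Hypothetical helper: use the plain text response if it is non-empty,
// otherwise fall back to text extracted from the screen HTML.
function bestText(plainText, screenHtml) {
  if (typeof plainText === 'string' && plainText.trim() !== '') {
    return plainText.trim();
  }
  if (typeof screenHtml === 'string' && screenHtml !== '') {
    return stripTags(screenHtml);
  }
  return '';
}
```

In practice you would pass the text from the response and the HTML received with isOn set to true, and swap stripTags for html-to-text to get cleaner output.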
The other thing I looked into is the "debug" option referenced in the SDK. If the debug option is set to true (and some other conditions are met), then you are sent a full response object that contains the text that would have been converted to speech. Would you be able to allow this library to access the debuginfo flag? see https://developers.google.com/assistant/sdk/reference/rpc/google.assistant.embedded.v1alpha2#google.assistant.embedded.v1alpha2.DebugInfo
I added it in commit: https://github.com/endoplasmic/google-assistant/commit/641edf80cccd0c98db23331bfe607c1abcb447d2
I have no way to test it (or what conditions need to be met) since I don't have any Actions on Google things. It seems that's what the field is for though.
Either way, it's good to support it, so thanks for pointing that out!
Hi, great job guys on the implementation of the google assistant service! loving it :)
I'm interested in both text and audio responses. I realized that certain types of questions populate the text in the response event either partially or not at all. Examples I found so far:
response event.
response event only reports the last message.
I understand that the supplemental_display_text of the DialogStateOut is not meant to always be the full transcript of the audio response, but I was wondering if there is something we can do to get the full jokes as text. For those that come back totally empty (like traffic-related questions), I could use the Google Cloud Speech API to do STT on the audio_data, at the expense of some extra round-trip time. Any ideas, guys, around these two use cases? Thanks!!