pipecat-ai / pipecat

Open Source framework for voice and multimodal conversational AI
BSD 2-Clause "Simplified" License

example storytelling-chatbot not starting input user #143

Closed janwout closed 3 months ago

janwout commented 3 months ago

After following the readme of the storytelling chatbot example, I see the narrator start the story. But the part where the user gives input never starts: the microphone does not become active.

I see Deepgram mentioned in the readme, but I cannot find it being used in the code. I want to help improve the example, but I am stuck. Is daily.co supposed to stream/transcribe the user input?

Does anyone know how to take the next step?

jptaylor commented 3 months ago

Hi @janwout. This may be related to the fact that Daily requires a credit card / billing information be entered before transcription becomes enabled on your domain. Are you still using the free tier?

Edit: Sorry, I was mistaken. Daily offers 2k real-time transcription credits to free tier users.

You could definitely implement your own transcription layer if you wanted (such as using Deepgram).

I think it's a good idea for us to update this example to show the different methods for doing so; please bear with us!

janwout commented 3 months ago

@jptaylor thanks for the response. I entered the credit card information, but it made no difference.

I do not see any logging either. It is probably just waiting for input?

The microphone icon does not light up either, which it does in the online deployment at https://storytelling-chatbot.fly.dev/ when the sound signals that input can start.

Do you have an idea where to look next after reading this?

jptaylor commented 3 months ago

Hmm. Are you running locally? If so, do you see a console error indicating that navigator is not defined? I believe that by default the demo hosts at 0.0.0.0, which, if you run locally, will likely be served over http (thus blocking device access).

You can fix this by changing the host to localhost here: https://github.com/pipecat-ai/pipecat/blob/main/examples/storytelling-chatbot/src/server.py#L156 (something we'll likely update in a PR soon).
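
To illustrate why the host matters, here is a rough sketch of the "secure context" rule that browsers apply before exposing media devices. It is an approximation for local debugging, not code from pipecat or Daily; the function name `allows_device_access` and the port 7860 in the usage note are hypothetical.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def allows_device_access(url: str) -> bool:
    """Approximate whether a browser would treat `url` as a secure context.

    Assumption: only https, localhost names and loopback IP addresses are
    considered here. Real browsers apply a few more rules than this sketch.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True

    host = parts.hostname or ""
    if host == "localhost" or host.endswith(".localhost"):
        return True

    try:
        # 127.0.0.0/8 and ::1 are loopback; 0.0.0.0 is "unspecified", not loopback.
        return ip_address(host).is_loopback
    except ValueError:
        # Not an IP literal, and not a localhost name: treat as insecure over http.
        return False
```

Under these assumptions, `allows_device_access("http://localhost:7860")` is true while `allows_device_access("http://0.0.0.0:7860")` is false, which matches the suggestion to open the app via localhost rather than 0.0.0.0.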

It may also be that you have not authorized access to your media devices. Does the green bar react to your voice during the second step, when you load the app?

janwout commented 3 months ago

Yeah, I am running locally. I followed the readme and filled in the API keys.

It asks for microphone access and says it is allowed, in both Firefox and Chrome.

I tried localhost in https://github.com/pipecat-ai/pipecat/blob/main/examples/storytelling-chatbot/src/server.py#L156, same behaviour.

No console error in the browser about navigator.

I hear the sound that signals you can start to speak, but the microphone image on the screen does not turn green.

I see this warning

WARNING! Invalid setParameters call detected! The good news? Firefox supports sendEncodings in addTransceiver now, so we ask that you switch over to using the parameters code you use for other browsers. Thank you for your patience and support. The specific error was: Cannot change the number of encodings with setParameters

and this error

WebRTC: Using five or more STUN/TURN servers slows down discovery

jptaylor commented 3 months ago

@janwout Let me give this a proper test on Firefox and I'll get back to you. Sorry for the trouble!

Just to confirm whilst I do that - you do see a green bar on the device check screen at the start, when you talk into the microphone?

janwout commented 3 months ago

@jptaylor yes, a green bar when making noise in the check page.

Thanks for trying to reproduce!

I use the latest main branch

janwout commented 3 months ago

I tried a lot: running the server on localhost, and running over https.

It seems that the pipeline stalls or something. After the sound that requests user input, nothing ever happens after these messages:

2024-05-17 09:29:11.721 | DEBUG    | pipecat.services.openai:_stream_chat_completions:82 - OpenAI LLM TTFB: 1.7685613632202148
2024-05-17 09:29:12.417 | DEBUG    | pipecat.services.elevenlabs:run_tts:35 - Transcribing text: Hello!
2024-05-17 09:29:14.140 | DEBUG    | pipecat.services.elevenlabs:run_tts:35 - Transcribing text:  I'm excited to create a magical story just for you.
2024-05-17 09:29:14.703 | DEBUG    | pipecat.services.elevenlabs:run_tts:35 - Transcribing text:  What kind of story are you in the mood for today?
2024-05-17 09:29:15.280 | DEBUG    | pipecat.pipeline.runner:run:36 - Runner PipelineRunner#0 finished running PipelineTask#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking PipelineSource#1 -> DailyInputTransport#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking DailyInputTransport#0 -> LLMUserResponseAggregator#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking LLMUserResponseAggregator#0 -> OpenAILLMService#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking OpenAILLMService#0 -> StoryProcessor#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking StoryProcessor#0 -> StoryImageProcessor#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking StoryImageProcessor#0 -> ElevenLabsTTSService#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking ElevenLabsTTSService#0 -> LLMAssistantResponseAggregator#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking LLMAssistantResponseAggregator#0 -> DailyOutputTransport#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking DailyOutputTransport#0 -> PipelineSink#1
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking Source#1 -> Pipeline#1
2024-05-17 09:29:15.281 | DEBUG    | pipecat.pipeline.runner:run:30 - Runner PipelineRunner#0 started running PipelineTask#1
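
When reading a log like the one above, it can help to rebuild the processor chain from the "Linking A -> B" lines and check which stages sit between the transport and the LLM. The following is a small sketch for that; the helper name `linked_chain` is hypothetical, and the regular expression only assumes the message format visible in these log lines.

```python
import re

# Matches the "Linking <source> -> <destination>" part of each debug line.
LINK_RE = re.compile(r"Linking (\S+) -> (\S+)")

# Three lines copied from the log above, used as sample input.
SAMPLE = """\
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking PipelineSource#1 -> DailyInputTransport#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking DailyInputTransport#0 -> LLMUserResponseAggregator#0
2024-05-17 09:29:15.280 | DEBUG    | pipecat.processors.frame_processor:link:37 - Linking Source#1 -> Pipeline#1
"""


def linked_chain(log_text: str) -> list:
    """Return processor names in the order the link messages connect them.

    Links that do not continue the current chain (for example the outer
    Source -> Pipeline link) are skipped.
    """
    chain = []
    for src, dst in LINK_RE.findall(log_text):
        if not chain:
            chain.append(src)
        if chain[-1] == src:
            chain.append(dst)
    return chain


if __name__ == "__main__":
    print(" -> ".join(linked_chain(SAMPLE)))
```

Applied to the full log, the chain goes straight from DailyInputTransport#0 to LLMUserResponseAggregator#0 with no separate speech-to-text processor in between, which is consistent with the example relying on the Daily transport for transcription rather than on Deepgram.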

janwout commented 3 months ago

In Firefox I see the microphone turn red on the device page.

But when sound for user input is played, the microphone icon stays gray.

janwout commented 3 months ago

Solved with v0.0.18.