Does anyone have an idea?
Hey, for now you could use the publish_data API to forward these states to the client. See https://docs.livekit.io/realtime/client/data-messages/
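Untested sketch of the agent side (the exact `publish_data` signature differs a bit between SDK versions, and the `"agent_state"` topic is just a name I made up):

```python
import json

# given a connected room (e.g. ctx.room inside an agent entrypoint),
# send a reliable data message with the current state to all participants
async def publish_state(room, state: str) -> None:
    await room.local_participant.publish_data(
        json.dumps({"state": state}),  # payload may be str or bytes
        topic="agent_state",           # clients filter on this topic
    )
```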
Okay, can you give me a short example? How can I send the user-speaking and agent-speaking events to the client from the agent? I understand the LocalParticipant.publishData method on the client, but I think it's the server agent that should produce the event... or the event should be sent from the agent to the client... hmm? Here is my current agent code:
```python
import asyncio
import logging
from dataclasses import dataclass

from livekit.agents import JobContext, JobRequest, WorkerOptions, cli, tokenize, tts
from livekit.agents.llm import (
    ChatContext,
    ChatMessage,
    ChatRole,
)
from livekit.agents.voice_assistant import VoiceAssistant
from livekit.plugins import deepgram, elevenlabs, openai, silero


@dataclass
class VoiceSettings:
    stability: float  # [0.0 - 1.0]
    similarity_boost: float  # [0.0 - 1.0]
    style: float | None = None  # [0.0 - 1.0]
    use_speaker_boost: bool | None = False


@dataclass
class Voice:
    id: str
    name: str
    category: str
    settings: VoiceSettings | None = None


async def entrypoint(ctx: JobContext):
    initial_ctx = ChatContext(
        messages=[
            ChatMessage(
                role=ChatRole.SYSTEM,
                text="",
            )
        ]
    )

    openai_tts = tts.StreamAdapter(
        # alloy -- when changing `language`, also adjust the TTS locally as in
        # tts.py; available voices: alloy, echo, fable, onyx, nova, shimmer
        tts=openai.TTS(voice="shimmer", language="de"),
        sentence_tokenizer=tokenize.basic.SentenceTokenizer(),
    )

    VOICE = Voice(
        id="9yD3PafDQ5YI0CMGS3cO",
        name="",
        category="custom",
        settings=VoiceSettings(
            stability=0.53, similarity_boost=0.71, style=0, use_speaker_boost=True
        ),
    )

    initPrompt = "Wie kann ich dir heute helfen?"  # "How can I help you today?"

    assistant = VoiceAssistant(
        vad=silero.VAD(),
        stt=deepgram.STT(language="de-DE"),
        llm=openai.LLM(model="gpt-4o"),
        tts=elevenlabs.TTS(voice=VOICE, language="de"),  # alternative: openai_tts
        chat_ctx=initial_ctx,
    )
    assistant.start(ctx.room)

    await asyncio.sleep(1)
    await assistant.say(initPrompt, allow_interruptions=True)


async def request_fnc(req: JobRequest) -> None:
    logging.info("received request %s", req)
    await req.accept(entrypoint)


if __name__ == "__main__":
    cli.run_app(WorkerOptions(request_fnc))
```
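Update: I tried forwarding the states myself by hooking the assistant's events and publishing them as data messages. This is only a sketch, assuming the `user_started_speaking` / `agent_started_speaking` event family from assistant.py is what actually gets emitted, and reusing the made-up `"agent_state"` topic (this goes inside the entrypoint, after `assistant.start(ctx.room)`):

```python
import json

def forward_state(state: str) -> None:
    # event callbacks are synchronous, publish_data is async -> schedule a task
    async def _send() -> None:
        await ctx.room.local_participant.publish_data(
            json.dumps({"state": state}), topic="agent_state"
        )
    asyncio.create_task(_send())

# map the assistant events onto the playground's VisualizerState values;
# event names assumed from voice_assistant/assistant.py
assistant.on("user_started_speaking", lambda: forward_state("listening"))
assistant.on("user_stopped_speaking", lambda: forward_state("thinking"))
assistant.on("agent_started_speaking", lambda: forward_state("speaking"))
assistant.on("agent_stopped_speaking", lambda: forward_state("idle"))
```

Is that the right direction?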
How can I call these functions from the playground app? On the agent side it works very well, but I can't access the events from the agent's client. What can I do? I need it urgently :-/ ... or what do I have to change to get this done?
I want to fill the `<AgentMultibandAudioVisualizer />` component with the state:

```ts
type VisualizerState = "listening" | "idle" | "speaking" | "thinking";
```
https://github.com/livekit/agents/blob/39a59595c870d8822fdbf4e271b352b0521a573a/livekit-agents/livekit/agents/voice_assistant/assistant.py#L90
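For completeness, I guess the client then just has to listen for the data messages and turn them back into a VisualizerState. The playground itself uses livekit-client's RoomEvent.DataReceived in TypeScript, but the shape is the same in every SDK; here is a sketch with the Python rtc SDK, again assuming the hypothetical `"agent_state"` topic from above:

```python
import json
from livekit import rtc

def wire_visualizer(room: rtc.Room, set_state) -> None:
    # set_state is whatever updates the UI, e.g. the React state behind
    # <AgentMultibandAudioVisualizer /> on the TypeScript side
    @room.on("data_received")
    def on_data(packet: rtc.DataPacket) -> None:
        # the callback signature varies slightly across SDK versions
        if packet.topic == "agent_state":
            state = json.loads(packet.data)["state"]
            set_state(state)  # "listening" | "idle" | "speaking" | "thinking"
```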