livekit / agents-js

Build realtime multimodal AI agents with Node.js
https://docs.livekit.io/agents
Apache License 2.0
113 stars, 12 forks

Plug RealtimeModel+RealtimeSession into OmniAssistant, get it talking #72

Closed bcherry closed 3 weeks ago

bcherry commented 3 weeks ago

This gets OmniAssistant talking through the new RealtimeModel/RealtimeSession path. Transcripts are still stubbed, function calling is broken, and there is some other mess left to clean up.

changeset-bot[bot] commented 3 weeks ago

⚠️ No Changeset found

Latest commit: 0238dffc2cdcebb7c28a21f007ec9ddb2e6dff3e

Merging this PR will not cause a version bump for any packages. If these changes should not result in a new version, you're good to go. If these changes should result in a version bump, you need to add a changeset.

This PR includes no changesets. When changesets are added to this PR, you'll see the packages that this PR includes changesets for and the associated semver types.


bcherry commented 3 weeks ago

are we moving omni_assistant to openai without having a base class in normal agents? this isn't the way i saw python agents do it (though it makes a bit of sense with the cyclic dependencies)

yeah, i saw there's a cyclic dependency anyway, so i decided to defer figuring that part out until later

bcherry commented 3 weeks ago

Updated to plumb everything through. Agent transcripts are working but user transcripts are not. I'm going to fix that, then merge it.

bcherry commented 3 weeks ago

It turns out the initial session update wasn't getting sent at all, and even trying to send it got rejected because it had the wrong name for max_response_output_tokens. Without the update, transcription was disabled.
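For illustration, here is a rough sketch of what an initial session update carrying that field might look like. The PR does not show the actual payload, so `buildSessionUpdate`, `SessionOptions`, and the option names are hypothetical; the wire names `session.update`, `max_response_output_tokens`, and `input_audio_transcription` are my reading of the OpenAI Realtime API and should be checked against its reference.

```typescript
// Hypothetical option names; only the snake_case wire fields are meant to
// mirror the Realtime API, and even those are an assumption here.
interface SessionOptions {
  maxResponseOutputTokens: number | 'inf';
  transcriptionModel: string;
}

// Builds the client event that configures the session. Without an event like
// this reaching the server, input audio transcription stays disabled.
function buildSessionUpdate(opts: SessionOptions) {
  return {
    type: 'session.update',
    session: {
      max_response_output_tokens: opts.maxResponseOutputTokens,
      input_audio_transcription: { model: opts.transcriptionModel },
    },
  };
}
```

A misspelled key in `session` would be the kind of mistake that gets the whole update rejected, which in turn leaves transcription off.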

I fixed this and also implemented a queue for client events so the event can be enqueued immediately and sent when the socket opens.
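The queueing idea can be sketched roughly as below. This is not the code from the PR: `ClientEventQueue`, `SocketLike`, and `attach` are hypothetical names, and the socket is reduced to the two members the sketch needs.

```typescript
// Minimal socket surface for the sketch; a real WebSocket also satisfies it.
interface SocketLike {
  readyState: number;
  send(data: string): void;
}

const SOCKET_OPEN = 1; // same numeric value as WebSocket.OPEN

type ClientEvent = { type: string; [key: string]: unknown };

class ClientEventQueue {
  private pending: ClientEvent[] = [];
  private socket: SocketLike | null = null;

  // Callers can enqueue at any time, even before a socket exists.
  enqueue(event: ClientEvent): void {
    this.pending.push(event);
    this.flush();
  }

  // Call this from the socket's "open" handler.
  attach(socket: SocketLike): void {
    this.socket = socket;
    this.flush();
  }

  // Sends queued events in FIFO order, but only while the socket is open.
  private flush(): void {
    if (!this.socket || this.socket.readyState !== SOCKET_OPEN) return;
    while (this.pending.length > 0) {
      const event = this.pending.shift()!;
      this.socket.send(JSON.stringify(event));
    }
  }
}
```

With this shape, the initial session update can be enqueued in the constructor and is sent first once the socket opens, ahead of anything enqueued later.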

All good, except function calling and a million lines of cleanup to do...