microsoft / semantic-kernel

Integrate cutting-edge LLM technology quickly and easily into your apps
https://aka.ms/semantic-kernel
MIT License

.Net: Does semantic-kernel support gpt-4o-realtime-preview model currently? #9310

Closed by sophialagerkranspandey 1 day ago

sophialagerkranspandey commented 3 weeks ago

https://github.com/microsoft/semantic-kernel/discussions/9293

Does Semantic Kernel currently support the GPT-4o-realtime-preview model that was released two weeks ago?

jerry2007 commented 3 weeks ago

Issue already exists... #9075

markwallace-microsoft commented 3 weeks ago

We're going to use this task to investigate creating a sample showing how to do this using the underlying SDK in conjunction with SK

jerry2007 commented 2 days ago

Hi @dmytrostruk @markwallace-microsoft @RogerBarreto, I really want this feature, so I tried to do it myself. I found that it should be possible to implement the Realtime API with SK, but a lot of the classes are internal, so implementing it with just RealtimeClient and the SK NuGet package is not possible. Instead, I tried adding an abstraction to SK. Here is a commit with a POC: https://github.com/jerry2007/semantic-kernel/commit/b80b1d127bd4673cb8adf3b68f1b4a59290cfdfa

With this I was able to connect to the Realtime API and, in a basic form, send a message and use a native plugin created by SK.

Is this OK with you? Can I help you with this somehow?

dmytrostruk commented 2 days ago

Hi @jerry2007 , thanks a lot for the provided information.

First of all, your commit contains your API key for the Azure OpenAI service, so I highly recommend removing it from the codebase and regenerating the key so that the existing one becomes invalid.
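
A minimal sketch of keeping a key out of the codebase, assuming it is supplied through an environment variable. The variable name `AZURE_OPENAI_API_KEY` and the helper `load_api_key` are hypothetical, chosen for illustration only; they are not taken from the commit or from SK.

```python
import os


def load_api_key(var_name: str = "AZURE_OPENAI_API_KEY") -> str:
    """Read the API key from the environment instead of hard-coding it.

    The default variable name is an assumption made for this sketch.
    """
    key = os.environ.get(var_name, "").strip()
    if not key:
        # Fail loudly so a missing key is noticed before any request is made.
        raise RuntimeError(
            f"Set the {var_name} environment variable; do not commit the key."
        )
    return key
```

Because the key never appears in a source file, it cannot end up in a commit; a key that has already been pushed still has to be regenerated, since it remains in the repository history.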

Regarding the code, the scope of this task is to add an example of how to work with the gpt-4o-realtime-preview model using existing SK capabilities, and I'm currently working on that. If we decide to add new abstractions for the Realtime API, it will be done in the scope of a separate task. I believe it makes sense to do that once other AI providers offer a similar feature, so we can design an abstraction that takes the capabilities of different providers into account.

Most probably we will start by creating an Architectural Decision Record (ADR), which will describe the different options for a possible abstraction, the pros and cons of each approach, and so on. Once this document is created, you are very welcome to contribute as well and share your ideas and comments.

Thanks a lot!

jerry2007 commented 2 days ago

OK, thanks for the warning. I'm going to remove it.

Is it necessary to wait for another provider? Is it possible to create the ADR earlier?

dmytrostruk commented 2 days ago

@jerry2007

Is it necessary to wait for another provider?

I don't think it's necessary, but I think building against multiple Realtime API providers will lead to a better abstraction. Of course, if we see that other AI providers won't support such functionality in the near future and there is high demand for it in SK, we can start building it with OpenAI as the only implementation.
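
To illustrate why more than one provider helps, here is a rough sketch of a provider-agnostic shape. Every name in it (`RealtimeProvider`, `EchoProvider`, `RealtimeSession`) is hypothetical and invented for this example; none of them is an SK, OpenAI, or Azure type, and the stand-in provider makes no network calls.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Protocol


class RealtimeProvider(Protocol):
    """Hypothetical contract that each provider would have to satisfy."""

    def send_text(self, text: str) -> str:
        ...


@dataclass
class EchoProvider:
    """Stand-in provider for illustration; it only echoes the input."""

    prefix: str = "echo"

    def send_text(self, text: str) -> str:
        return f"{self.prefix}: {text}"


@dataclass
class RealtimeSession:
    """Caller-facing session that depends only on the provider contract."""

    provider: RealtimeProvider
    tools: Dict[str, Callable[[str], str]] = field(default_factory=dict)

    def register_tool(self, name: str, fn: Callable[[str], str]) -> None:
        # Tools play the role of native plugins in this sketch.
        self.tools[name] = fn

    def send(self, text: str) -> str:
        return self.provider.send_text(text)

    def call_tool(self, name: str, argument: str) -> str:
        return self.tools[name](argument)
```

With a single provider, the contract tends to mirror that provider's API. A second implementation of `RealtimeProvider` would show which parts of the contract are actually common.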

Is it possible to create the ADR earlier?

I think it depends on the team's prioritization. cc: @markwallace-microsoft