Open mhar-andal opened 1 month ago
I would love this! This could be a great way to build new copilot-style assistants.
@aconchillo @kwindla any plans for making this work with pipecat?
This is now possible in 0.0.48. All you need to do is:
@transport.event_handler("on_first_participant_joined")
async def on_first_participant_joined(transport, participant):
    await transport.update_subscriptions(
        participant_settings={participant["id"]: {"media": {"screenVideo": "subscribed"}}}
    )
    await transport.capture_participant_video(
        participant["id"], framerate=0, video_source="screenVideo"
    )
And it's actually pretty cool. You can try it in this example https://github.com/pipecat-ai/pipecat/blob/main/examples/foundational/12b-describe-video-gpt-4o.py and see how it describes your screen.
You just need to open the Daily Room URL in your browser and share your screen. Feel free to close the issue if it works for you.
@aconchillo This is awesome! Although it seems to rate limit my chat completions requests.
INFO:httpx:HTTP Request: POST https://api.openai.com/v1/chat/completions "HTTP/1.1 429 Too Many Requests"
INFO:openai._base_client:Retrying request to /chat/completions in 0.494679
With the framerate param set to 0, no images are sent to OpenAI. Increasing it to 1 makes it work for a minute or so, until OpenAI starts rate limiting the requests. Any ideas on how to make this work with OpenAI in a long session?
Also, does this work with the OpenAI Realtime API?
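One possible way to stay under the rate limit in a long session is to throttle how often a captured frame is actually forwarded to the LLM, rather than sending every frame. The following is only a rough sketch of that idea in plain Python: `FrameThrottle` and `should_send` are hypothetical names made up for illustration, not part of pipecat or the OpenAI client, and the 5-second interval is an arbitrary assumption you would tune against your own rate limits.

```python
import time
from typing import Callable, Optional


class FrameThrottle:
    """Hypothetical helper: allows at most one frame per `min_interval_s` seconds."""

    def __init__(
        self,
        min_interval_s: float,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._min_interval_s = min_interval_s
        self._clock = clock  # injectable so the logic can be tested without sleeping
        self._last_sent: Optional[float] = None

    def should_send(self) -> bool:
        """Return True if enough time has passed since the last forwarded frame."""
        now = self._clock()
        if self._last_sent is None or now - self._last_sent >= self._min_interval_s:
            self._last_sent = now
            return True
        return False


# Assumed interval: forward at most one screen frame every 5 seconds.
throttle = FrameThrottle(min_interval_s=5.0)
```

Wherever your code receives screen frames, you would check `throttle.should_send()` and drop the frame when it returns False, so only the occasional frame turns into a chat completions request.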
It would be nice if you could capture the participant's screen share and feed the context to the LLM.