souzatharsis / podcastfy

An Open Source Python alternative to NotebookLM's podcast feature: Transforming Multimodal Content into Captivating Multilingual Audio Conversations with GenAI
https://www.podcastfy.ai
Apache License 2.0
1.52k stars 166 forks

Feature Request: Streaming (Realtime Interaction) #183

Open taowang1993 opened 1 week ago

taowang1993 commented 1 week ago

Hi Podcastfy Team,

I want to integrate Podcastfy into Dify, a low-code platform for building AI agents.

There are two ways to integrate Podcastfy: 1) Integrate Podcastfy as a TTS provider. 2) Integrate Podcastfy as a tool.

If I go for option 1, then users would need to convert the text (generated by an LLM) to audio by clicking the play button.

However, the TTS provider option is meant for streaming, that is, realtime interaction.

I can't make users wait several minutes for the conversion, because they would be expecting a realtime response rather than an audio file.

image

If I go for option 2, then users would need to ask an LLM to use Podcastfy as a tool to convert the generated text into an audio file.

For example, in the following screenshot, I asked an LLM to use a tool called DuckDuckGo Video Search to search for TED-Ed videos.

With this option, it makes sense for users to wait for the conversion, because they would be expecting a complete audio file.

image

With the current features supported by Podcastfy, I can only choose option 2.

However, I wonder if it is possible for Podcastfy to support streaming in the near future.

If so, users could interact in real time instead of waiting several minutes for a complete file.

This would dramatically improve the user experience.
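
To make the request concrete, here is a minimal sketch of the difference between the two modes. It does not use any Podcastfy API: `fake_tts`, `synthesize_batch`, and `synthesize_stream` are hypothetical names, and `fake_tts` only returns placeholder bytes in place of real audio.

```python
from typing import Callable, Iterable, Iterator


def fake_tts(text: str) -> bytes:
    """Hypothetical stand-in for a real TTS call; returns placeholder bytes."""
    return text.encode("utf-8")


def synthesize_batch(turns: Iterable[str], tts: Callable[[str], bytes]) -> bytes:
    """Batch mode: nothing is returned until every turn has been synthesized."""
    return b"".join(tts(turn) for turn in turns)


def synthesize_stream(
    turns: Iterable[str], tts: Callable[[str], bytes]
) -> Iterator[bytes]:
    """Streaming mode: yield each turn's audio as soon as it is ready."""
    for turn in turns:
        yield tts(turn)


turns = ["Welcome to the show.", "Thanks for having me."]

# A client can start playback after the first chunk arrives
# instead of waiting for the whole conversation.
chunks = list(synthesize_stream(turns, fake_tts))
```

With the batch function, the caller waits for the whole conversation. With the generator, option 1 (a TTS provider) could begin playback once the first turn is synthesized.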