langgenius / dify

Dify is an open-source LLM app development platform. Dify's intuitive interface combines AI workflow, RAG pipeline, agent capabilities, model management, observability features and more, letting you quickly go from prototype to production.
https://dify.ai

filter the markdown image block for TTS input #9133

Open verigle opened 1 month ago

verigle commented 1 month ago

1. Is this request related to a challenge you're experiencing? Tell me about your story.

For TTS, I expect Markdown image URLs to be removed from the TTS input text.

2. Additional context or comments

Example LLM output:

this is a new picture:
![picture1](http://www.test.com/test.jpg)

There is no need for TTS to convert the Markdown image block (![picture1](http://www.test.com/test.jpg)) to audio. Is there any way to remove the image URL from the TTS input?

3. Can you help us with this feature?

crazywoola commented 1 month ago

Not sure what you are trying to say, what do you expect?

verigle commented 1 month ago

I expect TTS not to read the Markdown image block, including the image URL.

Example LLM output:

this is a new picture:
![picture1](http://www.test.com/test.jpg)

TTS only needs to read the words "this is a new picture". However, the current TTS also reads the URL in ![picture1](http://www.test.com/test.jpg), which is not useful audio for the user.
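
The requested behavior can be illustrated with a minimal sketch. It assumes the filtering runs as a text pre-processing step on the LLM output before it is handed to TTS. The helper name `strip_markdown_images` and the regular expression are hypothetical and not taken from Dify's code base. The pattern only covers simple inline images whose alt text contains no `]` and whose URL contains no `)`.

```python
import re

# Matches simple inline Markdown images such as ![alt](http://host/img.jpg).
# Assumption: alt text has no "]" and the URL part has no ")".
MD_IMAGE = re.compile(r"!\[[^\]]*\]\([^)]*\)")


def strip_markdown_images(text: str) -> str:
    """Hypothetical helper: drop inline Markdown images from TTS input."""
    without_images = MD_IMAGE.sub("", text)
    # Trim trailing whitespace and drop lines that are now empty
    # (this also drops blank lines that were already in the text).
    lines = [line.rstrip() for line in without_images.splitlines()]
    return "\n".join(line for line in lines if line.strip())


llm_output = "this is a new picture:\n![picture1](http://www.test.com/test.jpg)"
tts_input = strip_markdown_images(llm_output)
print(tts_input)
```

With the example output from this issue, only the sentence before the image remains as TTS input. Regular Markdown links (without the leading `!`) are left untouched by this pattern.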

verigle commented 1 month ago

Is there any plan to filter the Markdown image block out of the TTS input?

dosubot[bot] commented 1 week ago

Hi, @verigle. I'm Dosu, and I'm helping the Dify team manage their backlog. I'm marking this issue as stale.

Thank you for your understanding and contribution!