-
## Feature Name
Speechify
## Feature Description
## Overview of Speechify
**Speechify** is a leading text-to-speech (TTS) platform designed to convert written text into natural-sounding sp…
-
Splitting this out from #2
-
This needs some research/discussion:
The primary issue with finding a suitable speech-to-text API is that the majority require the user to interact with them via a web browser that uses the [Web …
-
Great work with this tool! I have been testing it and it works great. I appreciate your work, thank you! Flowise just released multi-modal, which allows us to do speech-to-text and also image-to-text. Speech-…
-
### Self Checks
- [X] I have searched for existing issues [search for existing issues](https://github.com/langgenius/dify/issues), including closed ones.
- [X] I confirm that I am using English to su…
-
Some task methods in the `huggingface_hub.InferenceClient` do not include a `parameters` argument to allow passing additional inference params.
The tasks are: `audio-classification`, `automatic-spe…
-
----
## 🚀 Feature
Support websocket endpoints to allow two-way real-time data communication.
### Motivation
Currently, the requests are processed with the expectation that the …
-
**Describe the bug**
The Python implementation of TTS Avatar is not working as expected.
**To Reproduce**
Steps to reproduce the behavior:
1. Followed the exact steps mentioned in the readme of t…
-
root@ddd:~/omnivore# docker compose up
[+] Building 388.7s (41/53)
=> CACHED [migrate 6/10] COPY /packages/db/package.json ./packages/db/package.json …
-
### Request Description
Moshi is a speech-text foundation model and full-duplex spoken dialogue framework. It uses Mimi, a state-of-the-art streaming neural audio codec. Mimi processes 24 kHz audio, …