-
## Goal
Experiment with the WhisperVQ model to get better multilingual results. Hypothesis: the current codebook has only 512 entries, which may be too small a space to compress multilingual capability.
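
To put the 512-entry codebook in perspective, the following is a minimal sketch of how many bits a single token can carry at different codebook sizes. The token rate of 25 tokens per second and the larger codebook sizes are assumptions chosen for illustration, not values taken from this issue, and the helper names are hypothetical.

```python
import math


def bits_per_token(codebook_size: int) -> float:
    """Upper bound on information per token, assuming all entries are used uniformly."""
    return math.log2(codebook_size)


def bitrate_bps(codebook_size: int, tokens_per_second: float) -> float:
    """Upper bound on the token stream bitrate in bits per second."""
    return bits_per_token(codebook_size) * tokens_per_second


# Assumed token rate for illustration only; check the actual quantizer config.
TOKENS_PER_SECOND = 25.0

if __name__ == "__main__":
    # 512 is the size named in the hypothesis; the others are hypothetical candidates.
    for size in (512, 1024, 2048, 4096):
        print(
            f"codebook={size:5d}  "
            f"bits/token={bits_per_token(size):4.1f}  "
            f"max bitrate={bitrate_bps(size, TOKENS_PER_SECOND):6.1f} bps"
        )
```

Each doubling of the codebook adds only one bit per token, so a larger codebook is a modest capacity increase on its own; whether it helps multilingual quality would still need to be measured.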
## Learning Goa…
-
WhisperSpeech has a Medium model now:
https://huggingface.co/WhisperSpeech/WhisperSpeech
It might be more accurate.
-
ichigo v0.5 will support additional languages, which will make the traditional t2s obsolete. This is a good chance to introduce a t2s framework that we have full control over.
# What needed to be …
-
## Goal
Set up speech instruction finetuning to make Ichigo better at conversation.
## Tasklist
- [ ] Check the data generation pipeline: https://github.com/collabora/WhisperSpeech
- [ ] Expe…
-
In the CosyVoice paper (https://arxiv.org/pdf/2407.05407), the authors state that _To the best of our knowledge, this is the first attempt to involve supervised speech tokens into TTS models._ How…
-
# Goal
Replace the existing TTS cascade, which adds latency to ichigo's response time, with a speech decoder that generates speech directly.
# Potential So…
-
### Feature request
Maybe I'm just overlooking it, but it would rock if it were possible to do TTS for more languages. English is well catered for with `T5`, but for other languages I have to fall …
-
I installed the program under Fedora using Flatpak, along with the AMD plugin (Flatpak asked user or system; I picked system). It started out fine and I added a few languages, but now it won't start (even with har…
-
Dear WhisperSpeech maintainers,
I found multi-language models like [s2a-v1.95-medium-7lang.model](https://huggingface.co/WhisperSpeech/WhisperSpeech/blob/main/s2a-v1.95-medium-7lang.model) on huggi…
-
# Overall
We can significantly improve the quality of synthetic multi-modal datasets by using **Flow Matching with Optimal Transport**.
# Context
Currently we make use of Autoregressive Model f…