speech-language-model Search Results

1000+ results for speech-language-model


espnet/espnet #5774

Multilingual ASR with Auxiliary CTC objectives

In the paper entitled "Improving Massively Multilingual ASR with Auxiliary CTC Objectives" you proposed a method where early encoder layers were conditioned to identify the language of speech, while…

david-gimeno updated 5 months ago
10
microsoft/NeuralSpeech #128

[Prompt TTS2] About the SLU part and speech attribute classi…

ZZDoog updated 5 months ago
1
KdaiP/StableTTS #4

Congratulations on the repo!

Hey hey @KdaiP, Thanks for open-sourcing your implementation. I'm VB, I work in the open source audio team at Hugging Face. I'd love to know more and see how we can potentially help you with your e…

Vaibhavs10 updated 2 months ago
1
huggingface/transformers #17339

Google's Trillson Audio Classification

### Model description The TRILLsson models are described in the publication TRILLsson: Distilling Universal Paralinguistic Speech Representations. From audio, they generate generally-useful paralingui…

patrickvonplaten updated 2 years ago
18
intel-analytics/ipex-llm #11698

bark model on intel gpu takes 60 seconds

Hello, I am attempting to create text to speech with bark on an Intel A770, but it takes around 60 seconds to generate audio. Is that normal? Is there a way to make it faster, like a few seconds? https://git…

SlyRebula updated 3 months ago
2
elevenlabs/elevenlabs-python #124

confusion over the API and model ID options

I'm trying to use the elevenlabs python library with stream(), and it works fine with eleven_monolingual_v1 but fails with eleven_monolingual_v2. However I can't find anything in the documentation th…

dskill updated 8 months ago
4
cis3296f24/applebaum-final-project-section-005-applebaum #8

SpeakSense: A Speech Pattern Analyzer

![image](https://github.com/user-attachments/assets/fda027e3-f1c9-4bc8-b7d3-af5fee31cb97) Section 1, Speech-Analysis, Word Frequency Tracking, Taser, Java/Kotlin, Android app, Speech Pattern Analysis…

JRheeTU updated 2 weeks ago
6
KoljaB/RealtimeTTS #107

Is there any way to convert the input text received by the r…

Hi, I want to know whether there is any way to change the input text to a text stream to fit the language model's output. threading.Thread( target=play_text_to_speech, …

worker121 updated 4 months ago
5
dynamic-superb/dynamic-superb #186

[Task] Japanese Pitch Accent Word Recognition

# Task Name Japanese Pitch Accent Word Recognition ## Task Objective This task aims to recognize words in Japanese audio that have different meanings based on pitch accent. Japanese pitch accent …

kenchung285 updated 4 months ago
3
yl4579/StyleTTS2 #286

Trained StyleTTS2 for Hindi but didn't get good results

We have trained a StyleTTS2 model for the Hindi language. Initially we trained PL-bert for Hindi using the Espeak phonemizer and Indicbert tokenizer. Then we utilized that newly trained Hindi PLbert by re…

SandyPanda-MLDL updated 1 month ago
7

