microsoft SpeechT5 issues

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

MIT License

1.21k stars 114 forks

issues

Are there any performance optimization for inference available for this model or faster inference versions or streaming I only find this example?

#93 lukaLLM opened 4 weeks ago
0
Speech conversion: process whole input without stopping

#92 holmbuar closed 1 month ago
1
gaokao_audio can not be download? something error

#91 liyunlongaaa opened 2 months ago
1
Why can WavLLM understand audio sounds as well?

#90 BenoitWang opened 2 months ago
1
Setup Error about WavLLM

#89 XqZeppelinhead0702 opened 2 months ago
4
Request for Assistance with VATLM Implementation: Accessing Wave2Vec Model File

#88 yeonju7kim opened 2 months ago
0
I found minor typo in Readme

#87 yeonju7kim opened 2 months ago
0
Please fix the broken download link!!! So many models can't be used without a checkpoint.

#86 world1tree opened 3 months ago
0
How to fine-tune SpeechT5 HifiGAN vocoder?

#85 yukiarimo opened 3 months ago
0
soundfile.LibsndfileError: <exception str() failed>

#84 ciwei6107563 closed 3 months ago
0
Unable to Download wavLLM Due to Error

#83 minkyu119 opened 3 months ago
1
What languages are supported? How to specify a language?

#82 secsilm opened 5 months ago
0
SpeechUT does not have a link for download

#81 world1tree opened 5 months ago
2
What's the model_path and data_name on inference code?

#80 YepJin opened 6 months ago
3
Confusion/Question about SpeechT5SpeechDecoderPostnet output

#79 Student204161 opened 6 months ago
0
Error in loading WavLLM model

#78 rishabh004-ai opened 7 months ago
9
Single Task Training

#77 yangjiabupt closed 7 months ago
1
WavLLM checkpoint

#76 ming024 opened 7 months ago
5
ASR fine-tuning loss goes to zero after several epochs

#75 yunigma closed 7 months ago
2
extract transformer layer feature

#74 zbpjlc opened 8 months ago
2
Does the pre-trained model for hidden unit tokenizer use speaker embeddings?

#73 Kodhandarama opened 9 months ago
0
What is the time taken to converge for the hidden unit tokenizer?

#72 Kodhandarama opened 9 months ago
0
Link to train_960.tsv is broken

#71 Kodhandarama opened 9 months ago
0
"SpeechT5" on Android OS

#70 taeyeonlee opened 10 months ago
0
British English TTS model

#69 omega3 closed 7 months ago
1
Text feature extraction using SpeechLM

#68 wonjune-kang opened 10 months ago
0
Baseline implementation

#67 ussenuk opened 12 months ago
1
How to setting language when do S2T

#66 nhha1602 opened 1 year ago
1
Is Chinese text-to-speech supported?

#65 xxm1668 opened 1 year ago
4
The size of tensor a (674) must match the size of tensor b (600) at non-singleton dimension 1

#64 poojitharamachandra opened 1 year ago
1
SpeechT5 - TTS - Tokenizer adding `▁` token between newly added Vietnamese characters

#63 GinUTE closed 11 months ago
1
ASR SpeechT5 training - model predicts same output for different inputs

#62 L7uan opened 1 year ago
1
Is end-to-end S2ST possible with Speecht5?

#61 elia-ashraf opened 1 year ago
0
Generate the N-best (top few) hypotheses

#60 cyfer0618 opened 1 year ago
0
Reproduce ASR experiment results in Hugging Face

#59 jjyaoao closed 1 year ago
0
Voice Conversion - Error with Some Mono, 16kHz, 16bit Audio

#58 fabiocat93 opened 1 year ago
3
Getting TTS output voice close to the training data - Finetuning on different language

#57 Srija616 opened 1 year ago
2
pretrain loss

#56 MarsMeng1994 opened 1 year ago
4
Bump scipy from 1.5.4 to 1.10.0 in /VATLM/vat_hubert

#55 dependabot[bot] opened 1 year ago
0
VATLM: Error when loading finetuned checkpoints for infer_s2s

#54 naraysa opened 1 year ago
0
Pretraining SpeechT5, meet problems about batch_sampler in multitask_dataset. Should I get idx and bin files of data one by one (wav) or get all of them in only two file(idx and bin each have one)

#53 Lemonaddeee opened 1 year ago
0
SpeechUT inference error in en_fr checkpoint

#52 ytf-philp opened 1 year ago
1
Using SpeechT5 Large for TTS

#51 imranmaj opened 1 year ago
0
SpeechT5: extracting Chinese speaker embedding

#50 QQ-777777 opened 1 year ago
6
SpeechT5-tts fine-tuned on Chinese

#49 qlmbeck opened 1 year ago
4
add link to Hugging Face fine-tuning example

#48 hollance closed 1 year ago
1
The link for Prosody-SpeechT5 in the Readme is dead/404

#47 svantana closed 1 year ago
2
SpeechLM

#46 blueblue-bubble closed 1 year ago
2
SpeechT5：how much epoch is set

#45 QQ-777777 closed 1 year ago
5
how to pause between two words ?

#43 hulk10425 opened 1 year ago
2