microsoft SpeechT5 issues

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

MIT License

1.16k stars 113 forks

issues

Speech conversion: process whole input without stopping

#92 holmbuar closed 1 week ago
1
gaokao_audio cannot be downloaded? An error occurs

#91 liyunlongaaa opened 2 weeks ago
1
Why can WavLLM understand audio sounds as well?

#90 BenoitWang opened 3 weeks ago
1
Setup Error about WavLLM

#89 XqZeppelinhead0702 opened 1 month ago
4
Request for Assistance with VATLM Implementation: Accessing Wave2Vec Model File

#88 yeonju7kim opened 1 month ago
0
I found a minor typo in the README

#87 yeonju7kim opened 1 month ago
0
Please fix the broken download link!!! So many models can't be used without a checkpoint.

#86 world1tree opened 1 month ago
0
How to fine-tune SpeechT5 HifiGAN vocoder?

#85 yukiarimo opened 1 month ago
0
soundfile.LibsndfileError: <exception str() failed>

#84 ciwei6107563 closed 1 month ago
0
Unable to Download WavLLM Due to Error

#83 minkyu119 opened 2 months ago
1
What languages are supported? How to specify a language?

#82 secsilm opened 3 months ago
0
SpeechUT does not have a link for download

#81 world1tree opened 4 months ago
2
What are the model_path and data_name in the inference code?

#80 YepJin opened 4 months ago
1
Confusion/Question about SpeechT5SpeechDecoderPostnet output

#79 Student204161 opened 5 months ago
0
Error in loading WavLLM model

#78 rishabh004-ai opened 5 months ago
9
Single Task Training

#77 yangjiabupt closed 5 months ago
1
WavLLM checkpoint

#76 ming024 opened 5 months ago
5
ASR fine-tuning loss goes to zero after several epochs

#75 yunigma closed 5 months ago
2
Extract transformer layer features

#74 zbpjlc opened 7 months ago
2
Does the pre-trained model for hidden unit tokenizer use speaker embeddings?

#73 Kodhandarama opened 7 months ago
0
What is the time taken to converge for the hidden unit tokenizer?

#72 Kodhandarama opened 7 months ago
0
Link to train_960.tsv is broken

#71 Kodhandarama opened 8 months ago
0
"SpeechT5" on Android OS

#70 taeyeonlee opened 8 months ago
0
British English TTS model

#69 omega3 closed 5 months ago
1
Text feature extraction using SpeechLM

#68 wonjune-kang opened 9 months ago
0
Baseline implementation

#67 ussenuk opened 10 months ago
1
How to set the language when doing S2T

#66 nhha1602 opened 10 months ago
1
Is Chinese text-to-speech supported?

#65 xxm1668 opened 11 months ago
4
The size of tensor a (674) must match the size of tensor b (600) at non-singleton dimension 1

#64 poojitharamachandra opened 11 months ago
1
SpeechT5 - TTS - Tokenizer adding `▁` token between newly added Vietnamese characters

#63 GinUTE closed 9 months ago
1
ASR SpeechT5 training - model predicts same output for different inputs

#62 L7uan opened 1 year ago
1
Is end-to-end S2ST possible with SpeechT5?

#61 elia-ashraf opened 1 year ago
0
Generate the N-best (top few) hypotheses

#60 cyfer0618 opened 1 year ago
0
Reproduce ASR experiment results in Hugging Face

#59 jjyaoao closed 1 year ago
0
Voice Conversion - Error with Some Mono, 16kHz, 16bit Audio

#58 fabiocat93 opened 1 year ago
3
Getting TTS output voice close to the training data - Finetuning on different language

#57 Srija616 opened 1 year ago
2
pretrain loss

#56 MarsMeng1994 opened 1 year ago
4
Bump scipy from 1.5.4 to 1.10.0 in /VATLM/vat_hubert

#55 dependabot[bot] opened 1 year ago
0
VATLM: Error when loading finetuned checkpoints for infer_s2s

#54 naraysa opened 1 year ago
0
Pretraining SpeechT5: problems with batch_sampler in multitask_dataset. Should I generate idx and bin files for each wav one by one, or put all the data into just two files (one idx and one bin)?

#53 Lemonaddeee opened 1 year ago
0
SpeechUT inference error in en_fr checkpoint

#52 ytf-philp opened 1 year ago
1
Using SpeechT5 Large for TTS

#51 imranmaj opened 1 year ago
0
SpeechT5: extracting Chinese speaker embedding

#50 QQ-777777 opened 1 year ago
6
SpeechT5-tts fine-tuned on Chinese

#49 qlmbeck opened 1 year ago
4
add link to Hugging Face fine-tuning example

#48 hollance closed 1 year ago
1
The link for Prosody-SpeechT5 in the Readme is dead/404

#47 svantana closed 1 year ago
2
SpeechLM

#46 blueblue-bubble closed 1 year ago
2
SpeechT5: how many epochs are set?

#45 QQ-777777 closed 1 year ago
5
How to pause between two words?

#43 hulk10425 opened 1 year ago
2
How to fine-tune SID on the pretrained model?

#42 haha010508 closed 1 year ago
11