microsoft SpeechT5 issues

microsoft / SpeechT5

Unified-Modal Speech-Text Pre-Training for Spoken Language Processing

MIT License

1.09k stars, 113 forks

issues

SpeechT5: Finetuned SID model

#31 entn-at closed 1 year ago
2
SpeechT5 pretrain

#30 benyang0506 opened 1 year ago
5
About the SpeechT5 pre-training curve

#29 benyang0506 closed 1 year ago
4
SpeechT5 Pretrain ERROR

#28 benyang0506 closed 1 year ago
1
Whether fp16 is enabled in VATLM during pre-training

#27 xiabingquan closed 1 year ago
2
SpeechLM: KeyError: 'text_transformer' while initializing the SpeechLMConfig

#26 JunZhan2000 closed 1 year ago
2
VATLM: ModuleNotFoundError: No module named 'fairseq.data.audio.multi_corpus_dataset_audio'

#25 xiabingquan closed 1 year ago
6
Same benchmark, same architecture, but the WER is different. Why?

#24 splinter21 closed 1 year ago
2
SpeechLM: How to train 'Phone-unit tokenizer for speech' using kaldi?

#23 YWMditto closed 1 year ago
7
Speech2C "Inf detected in output" while training

#22 Sreyan88 closed 1 year ago
4
Speech2C training error

#21 Sreyan88 closed 1 year ago
6
Missing SPM and Vocabulary files

#20 sumanthd17 closed 1 year ago
2
Port to Huggingface

#19 StephennFernandes closed 1 year ago
1
SpeechLM: How to resample phonemes' frame rate from 30ms to 20ms?

#18 Arrivederci closed 1 year ago
3
SpeechLM: how to prepare phoneme sequence for T2U generator

#17 cwang621 closed 1 year ago
5
SpeechT5: How to get speaker embeddings?

#16 Arrivederci closed 1 year ago
12
Example values for finetuning asr

#15 YWMditto closed 1 year ago
18
Sample Rates are different between speech pre-training dataset and tts dataset

#14 Maggione closed 1 year ago
1
Combining speech and text in the encoder

#13 jacqle closed 1 year ago
1
Can you provide a voice conversion finetune recipe?

#12 hpjang closed 1 year ago
2
This repo is missing important files

#11 microsoft-github-policy-service[bot] closed 1 year ago
0
Adding Microsoft SECURITY.MD

#10 microsoft-github-policy-service[bot] closed 1 year ago
0
Text data preparation

#9 tskim9439 closed 1 year ago
3
No code for Speech Synthesis

#8 petervickers closed 1 year ago
4
ArgumentError in SpeechT5Task.add_args() when running fairseq-generate

#7 busukxuan closed 2 years ago
1
Is the quantizer used when fine-tuning the pretrained backbone for a downstream task?

#6 zhhao1 closed 2 years ago
2
Difficulties loading pre-trained weights!

#5 sanchit-gandhi closed 2 years ago
2
Missing text_to_speech_dataset.py in speecht5/data

#4 ayushtues closed 2 years ago
1
How to load the pretrained models in pytorch

#3 ayushtues closed 2 years ago
5
Are you planning to open source the configuration of the baselines and downstream tasks?

#2 Maggione closed 2 years ago
1
How to pre-train on a custom dataset?

#1 StephennFernandes closed 2 years ago
16