yl4579 AuxiliaryASR issues - Githubissues

yl4579 / AuxiliaryASR

Joint CTC-S2S Phoneme-level ASR for Voice Conversion and TTS (Text-Mel Alignment)

MIT License

108 stars, 30 forks

issues

How to train ZH-EN duo language aligner?

#12 Stardust-minus opened 4 months ago
1
Error Message: RuntimeError: Argument #4: Padding size should be less than the corresponding input dimension, but got: padding (1024, 1024) at dimension 2 of input [1, 65621, 2]

#11 GUUser91 opened 8 months ago
3
Why is " " used as the blank in the CTCLoss?

#10 jamesparsloe opened 1 year ago
0
Multiple GPU training and changing to librosa mel spec?

#9 crypticsymmetry closed 1 year ago
2
Is there anyone who has used the phonemizer? Any advice, please, on how to change the code correctly

#8 ahmeftah closed 1 year ago
3
get error

#7 MMMMichaelzhang closed 2 years ago
28
Why does mel_spectrogram feature extraction use only MEL_PARAMS here?

#6 superhg closed 2 years ago
3
Update text_utils.py

#5 woters closed 2 years ago
0
About the loss

#4 Charlottecuc closed 2 years ago
4
how to make word_index_dict.txt

#3 Ruinmou closed 2 years ago
3
how to train for mandarin asr?

#2 MMMMichaelzhang closed 2 years ago
8
How much data did you use to train the model?

#1 Charlottecuc closed 2 years ago
1