ming024 FastSpeech2 issues

ming024 / FastSpeech2

An implementation of Microsoft's "FastSpeech 2: Fast and High-Quality End-to-End Text to Speech"

MIT License

1.69k stars 515 forks

issues

CUDA out of memory

#233 gaoyiyao opened 1 month ago
0
train.yaml

#232 gaoyiyao opened 1 month ago
0
train.py

#231 gaoyiyao opened 1 month ago
0
preprocess.py

#230 gaoyiyao opened 1 month ago
0
FileNotFoundError: [Errno 2] No such file or directory: '/root/FastSpeech2-master/preprocessed_data/LJSpeech/mel/LJSpeech-mel-LJ016-0364.npy'

#229 gaoyiyao opened 1 month ago
0
data

#228 gaoyiyao opened 1 month ago
0
How to train with Indian Accent

#227 Jainu-s opened 3 months ago
2
Discrepancy in the Number of Decoder Layers

#226 shreeshailgan opened 3 months ago
0
MFA version

#225 shreeshailgan closed 2 months ago
4
[CONTRIBUTION] Speech Dataset Generator

#224 davidmartinrius opened 4 months ago
0
ran out of input

#223 ariameetgit closed 4 months ago
0
Modules not found

#222 yumoqing opened 6 months ago
0
Minor bug in loading vocoders

#221 Mahyar24 opened 6 months ago
0
How to solve indexSelectSmallIndex: block: [1,0,0], thread: [32,0,0] Assertion `srcIndex < srcSelectDimSize` failed when training AISHELL3

#220 foolishcqx opened 6 months ago
1
Duration of synthesis output is very short

#219 hplanmuc opened 7 months ago
0
Mismatch between the numbers of audio samples and speakers in the LibriTTS dataset

#218 shiyanpei0826 opened 7 months ago
0
mac m1 usability fix

#217 hoeen opened 8 months ago
0
Help!! During the pitch embedding step in VarianceAdaptor, the encoder output tensor x and the phoneme embedding tensor have different shapes and cannot be added

#216 aaqq112 closed 8 months ago
0
Minor interface changes

#215 Daniel-Chin opened 8 months ago
1
inference also requires unzipping hifigan checkpoints

#214 Daniel-Chin opened 8 months ago
0
Pitch and energy losses do not converge

#213 zhoufqing opened 9 months ago
0
fine-tuning issue

#212 zhoufqing opened 9 months ago
1
Error when importing with scipy 1.5.0

#211 hevenangel opened 9 months ago
0
ValueError: numpy.ndarray size changed, may indicate binary incompatibility. Expected 96 from C header, got 80 from PyObject

#210 WJHBLUESAPPHIRE opened 10 months ago
2
Unused character embeddings?

#209 g-milis opened 10 months ago
4
How do I align my data

#208 hkeliang opened 10 months ago
2
Multi-language support in one sentence

#207 shirubei opened 10 months ago
1
Fluctuating training loss

#206 299792459b opened 10 months ago
2
1

#205 sunnnnnnnny closed 11 months ago
0
RuntimeError: The size of tensor a (33) must match the size of tensor b (36) at non-singleton dimension 1

#204 ltydd opened 11 months ago
1
Bump scipy from 1.5.0 to 1.10.0

#203 dependabot[bot] opened 1 year ago
0
RuntimeError: Error(s) in loading state_dict for FastSpeech2

#202 fangg2000 closed 1 year ago
1
Fix for e_control not used during synthesis #200

#201 lordzuko opened 1 year ago
0
e_control not used during synthesis

#200 lordzuko opened 1 year ago
0
About fine-tuning issues.

#199 ltydd opened 1 year ago
4
How do you train the mfa acoustic model?

#198 SandroChen opened 1 year ago
1
FastSpeech 2s

#197 izzajalandoni opened 1 year ago
0
Pretrained model link is invalid

#196 Nueve879 closed 1 year ago
0
Frequency of LibriTTS data. 24000 or 22050?

#195 malu01 opened 1 year ago
0
Should we rely on tensorboard's output for duration, pitch and energy?

#194 aidosRepoint opened 1 year ago
0
Model cannot fit the data, and test voice quality is very poor when I use the paper configuration

#193 hhm853610070 opened 1 year ago
3
fix typo

#192 zaidalyafeai closed 1 year ago
0
What should I do if I want to use phonemes and words to generate sentences at the same time?

#191 tuntun990606 closed 1 year ago
0
synthesize.py LibriTTS RuntimeError: CUDA error: device-side assert triggered

#190 Bingtai1015 opened 1 year ago
1
How about adding a discriminator to FastSpeech2 to improve the naturalness of the spectrum?

#189 Bingtai1015 opened 1 year ago
0
AISHELL3 processing: using the official MFA dictionary and acoustic model to process AISHELL3

#188 tuntun990606 opened 1 year ago
16
A custom text for inference

#187 WGook opened 1 year ago
0
Override predicted durations with custom phoneme durations

#186 mtresearcher closed 1 year ago
1
The duration.npy file is picking zero as duration value for the phones with short duration in textgrids

#185 nayanjha16 opened 1 year ago
0
fix max_wav_value 32768.0 to 32767.0

#184 netpi opened 1 year ago
0