lucidrains / voicebox-pytorch

Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch

MIT License

562 stars 45 forks

issues

Default assignment of "cond".

#51 WECarol opened 1 month ago
0
Using WhisperSpeech Pre-trained Weights for TextToSemantic

#49 EomSooHwan closed 4 months ago
0
Fix CFG calculation

#47 Subuday closed 5 months ago
1
How to pick sigma?

#46 ex3ndr opened 5 months ago
3
Probably invalid infill logic

#45 ex3ndr opened 5 months ago
0
Mel model

#44 lixuyuan102 opened 5 months ago
10
Training Unconditional Model

#43 nrocketmann opened 6 months ago
15
Training TTS

#42 Subuday opened 6 months ago
2
sync

#41 nrocketmann closed 6 months ago
0
Fix unconditional sample generation

#40 lucasnewman closed 7 months ago
0
Fix conditioning to allow speech in-filling

#39 lucasnewman closed 8 months ago
3
Don't explicitly set the step size, derive it from the number of steps instead

#38 lucasnewman closed 8 months ago
2
Disable gradients for null conditioning when CFG is enabled

#37 lucasnewman closed 8 months ago
2
Fix conditional drop for CFG when conditioning on semantic/phoneme tokens

#36 lucasnewman closed 8 months ago
1
speartts model

#35 happy-machine opened 8 months ago
3
Loss calculation

#34 YKoustubhRao closed 8 months ago
1
VoiceBox Training

#33 yiwei0730 closed 9 months ago
3
Fix DDP import

#32 lucasnewman closed 9 months ago
1
Multi-Voice/Multi-Speaker?

#31 fakerybakery closed 9 months ago
2
Fixed a bug where the scheduler would get None

#30 wassimseif closed 10 months ago
1
Audio samples?

#28 blx0102 closed 10 months ago
2
Add Accelerate-enabled trainer

#27 lucasnewman closed 10 months ago
1
Allow specifying semantic ids directly instead of generating them for training efficiency

#26 lucasnewman closed 10 months ago
0
Training Example

#25 YKoustubhRao opened 10 months ago
17
where to get the kmeans_path = './path/to/hubert/kmeans.bin file?

#24 furqan4545 opened 10 months ago
7
Deep network not converge

#23 lixuyuan102 closed 9 months ago
14
Dtype Issues on Inference

#22 nrocketmann opened 10 months ago
0
expose attention dropout

#21 yzmyyff closed 10 months ago
0
logs the ODE info with level debug

#19 yzmyyff closed 10 months ago
3
Pre-trained model weights

#18 Hades32 closed 9 months ago
3
use align hard as target if we use aligner

#17 manmay-nakhashi closed 10 months ago
7
fix null_cond dim

#16 chenht2021 closed 11 months ago
0
For reference

#15 chenht2021 closed 10 months ago
0
fix a mistake

#14 chenht2021 closed 11 months ago
1
a mistake in rotary embedding

#13 chenht2021 closed 11 months ago
1
dataloader

#12 yiwei0730 closed 10 months ago
0
Apply mask to cond

#11 stevenhillis closed 11 months ago
1
suggestion for symmetric alibi implementation.

#10 seastar105 closed 11 months ago
7
Impl time embedding

#9 yzmyyff closed 11 months ago
4
input dim differs from model dim

#8 yzmyyff closed 11 months ago
1
input dimension and model dimension can be different

#7 yzmyyff closed 11 months ago
5
There might be a bug in the loss calculation

#6 yzmyyff closed 11 months ago
1
integrate aligner for phoneme oversampling

#5 manmay-nakhashi closed 11 months ago
4
The usage code throws exception

#4 yzmyyff closed 11 months ago
1
Samples for audio and steps to run the experiment?

#3 RaiAmanRai opened 11 months ago
2
log mel func in torch

#2 bryanhpchiang closed 11 months ago
11