lucidrains/voicebox-pytorch
Implementation of Voicebox, new SOTA Text-to-speech network from MetaAI, in Pytorch
MIT License
562 stars
45 forks
issues
Default assignment of "cond". · #51 · WECarol · opened 1 month ago · 0
Using WhisperSpeech Pre-trained Weights for TextToSemantic · #49 · EomSooHwan · closed 4 months ago · 0
Fix CFG calculation · #47 · Subuday · closed 5 months ago · 1
How to pick sigma? · #46 · ex3ndr · opened 5 months ago · 3
Probably invalid infill logic · #45 · ex3ndr · opened 5 months ago · 0
Mel model · #44 · lixuyuan102 · opened 5 months ago · 10
Training Unconditional Model · #43 · nrocketmann · opened 6 months ago · 15
Training TTS · #42 · Subuday · opened 6 months ago · 2
sync · #41 · nrocketmann · closed 6 months ago · 0
Fix unconditional sample generation · #40 · lucasnewman · closed 7 months ago · 0
Fix conditioning to allow speech in-filling · #39 · lucasnewman · closed 8 months ago · 3
Don't explicitly set the step size, derive it from the number of steps instead · #38 · lucasnewman · closed 8 months ago · 2
Disable gradients for null conditioning when CFG is enabled · #37 · lucasnewman · closed 8 months ago · 2
Fix conditional drop for CFG when conditioning on semantic/phoneme tokens · #36 · lucasnewman · closed 8 months ago · 1
speartts model · #35 · happy-machine · opened 8 months ago · 3
Loss calculation · #34 · YKoustubhRao · closed 8 months ago · 1
VoiceBox Training · #33 · yiwei0730 · closed 9 months ago · 3
Fix DDP import · #32 · lucasnewman · closed 9 months ago · 1
Multi-Voice/Multi-Speaker? · #31 · fakerybakery · closed 9 months ago · 2
Fixed a bug where the scheduler would get None · #30 · wassimseif · closed 10 months ago · 1
Audio samples? · #28 · blx0102 · closed 10 months ago · 2
Add Accelerate-enabled trainer · #27 · lucasnewman · closed 10 months ago · 1
Allow specifying semantic ids directly instead of generating them for training efficiency · #26 · lucasnewman · closed 10 months ago · 0
Training Example · #25 · YKoustubhRao · opened 10 months ago · 17
where to get the kmeans_path = './path/to/hubert/kmeans.bin file? · #24 · furqan4545 · opened 10 months ago · 7
Deep network not converge · #23 · lixuyuan102 · closed 9 months ago · 14
Dtype Issues on Inference · #22 · nrocketmann · opened 10 months ago · 0
expose attention dropout · #21 · yzmyyff · closed 10 months ago · 0
logs the ODE info with level debug · #19 · yzmyyff · closed 10 months ago · 3
Pre-trained model weights · #18 · Hades32 · closed 9 months ago · 3
use align hard as target if we use aligner · #17 · manmay-nakhashi · closed 10 months ago · 7
fix null_cond dim · #16 · chenht2021 · closed 11 months ago · 0
For reference · #15 · chenht2021 · closed 10 months ago · 0
fix a mistake · #14 · chenht2021 · closed 11 months ago · 1
a mistake in rotary embbeding · #13 · chenht2021 · closed 11 months ago · 1
dataloader · #12 · yiwei0730 · closed 10 months ago · 0
Apply mask to cond · #11 · stevenhillis · closed 11 months ago · 1
suggestion for symmetric alibi implementation. · #10 · seastar105 · closed 11 months ago · 7
Impl time embedding · #9 · yzmyyff · closed 11 months ago · 4
input dim differs from model dim · #8 · yzmyyff · closed 11 months ago · 1
input dimension and model dimension can be different · #7 · yzmyyff · closed 11 months ago · 5
There might be a bug in the loss calculation · #6 · yzmyyff · closed 11 months ago · 1
integrate aligner for phoneme overampling · #5 · manmay-nakhashi · closed 11 months ago · 4
The usage code throws exception · #4 · yzmyyff · closed 11 months ago · 1
Samples for audio and steps to run the experiment? · #3 · RaiAmanRai · opened 11 months ago · 2
log mel func in torch · #2 · bryanhpchiang · closed 11 months ago · 11