SadTalker webui extension for automatic1111:
Source: https://github.com/Winfredy/SadTalker
Version: a810cbe1 (Tue Jun 6 16:09:05 2023)
Installed via automatic1111 extension tab
Description of the bug:
The extension crashes when the pose style slider is set to 46
To reproduce:
1) Load an image and an audio file
2) Use any settings (face model resolution, preprocess mode, still mode and batch size)
3) Set the Pose style slider to 46
The problem:
Disclaimer: I'm not a developer, so...
The config/auido2exp.yaml and config/auido2pose.yaml configuration files state that there are 46 classes (for pose style), but the slider goes from 0 to 46 inclusive, which makes 47 possible values. That is one more than the 46 classes allowed by the two configuration files, whose valid indices would be 0 to 45.
Ultimately, this seems to be what is causing the crash.
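To make the off-by-one concrete, here is a small sketch of the counting argument. The constants are just the numbers quoted in this report (46 classes, slider from 0 to 46); the variable names are mine and do not come from the SadTalker code.

```python
# Numbers taken from this report; the names are illustrative only.
NUM_CLASSES = 46                 # class count stated in the yaml configs
SLIDER_MIN, SLIDER_MAX = 0, 46   # slider bounds as described above

# The slider includes both ends, so it can return SLIDER_MAX itself.
slider_values = list(range(SLIDER_MIN, SLIDER_MAX + 1))

# A model with NUM_CLASSES classes accepts indices 0 .. NUM_CLASSES - 1.
valid_classes = set(range(NUM_CLASSES))

# Slider values that have no matching class.
invalid = [v for v in slider_values if v not in valid_classes]

print(len(slider_values), len(valid_classes), invalid)
```

Only the value 46 falls outside the valid range, which matches the single setting that triggers the crash.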
I could trace the error as follows:
0) In app.py: Assume pose_style slider returns: 46
1) gradio_demo.py (gradio_demo.test, on line 82) makes a call to audio_to_coeff (in test_audio2coeff.py), with pose_style=46:
coeff_path = self.audio_to_coeff.generate(batch, save_dir, pose_style)
2) In test_audio2coeff.py (line 84), it sets the batch['class'] variable to:
batch['class'] = torch.LongTensor([pose_style]).to(self.device)
Then, on line 85, it makes a call to audio2pose_model.test (in audio2pose.py):
results_dict_pose = self.audio2pose_model.test(batch)
3) In audio2pose_model.test, on line 73, it calls:
batch = self.netG.test(batch)
And this is where the error occurs: netG=CVAE(cfg) (line 20 in audio2pose.py) is built from the yaml config files, so it is set up for 46 classes. The net is then called for generation with a pose_style class in batch (46) that is out of range and thus doesn't correspond to the downloaded model.
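I have not checked exactly how the CVAE uses batch['class'], so the following is only a sketch of the kind of failure I suspect. It assumes the model keeps one row of parameters per class and looks that row up by index. The table, its size and the lookup_class helper are hypothetical stand-ins, not SadTalker code.

```python
NUM_CLASSES = 46   # class count stated in the yaml configs
LATENT_SIZE = 4    # arbitrary size, chosen only for this illustration

# Hypothetical stand-in for a per-class parameter table:
# one row per pose style, so rows 0 .. 45 exist and row 46 does not.
class_table = [[0.0] * LATENT_SIZE for _ in range(NUM_CLASSES)]

def lookup_class(pose_style):
    """Return the row for a pose style; raises IndexError if out of range."""
    return class_table[pose_style]

lookup_class(45)  # highest valid class, works

try:
    lookup_class(46)  # what the slider maximum would send
except IndexError as exc:
    print("pose_style=46 has no matching class:", exc)
```

If the real model does something similar with a tensor instead of a list, an index of 46 would fail in the same way, only with a different error message.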
Solution:
My first guess is that the maximum value for the pose style slider should simply be set to 45 (i.e. NUM_CLASSES-1) instead of 46.
It could be that it is just a typo in app.py.
Correction:
In app.py, on line 104:
pose_style = gr.Slider(minimum=0, maximum=45, step=1, label="Pose style", value=0)
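As a possible follow-up to the hard-coded 45, the slider maximum could be derived from the class count, and the value could be clamped before it reaches the model. This is only a sketch: pose_style_bounds and clamp_pose_style are hypothetical helpers that do not exist in SadTalker, and the class count is passed in by hand here rather than read from the yaml files.

```python
def pose_style_bounds(num_classes):
    """Return (minimum, maximum) slider values for a given class count."""
    if num_classes < 1:
        raise ValueError("num_classes must be at least 1")
    return 0, num_classes - 1

def clamp_pose_style(value, num_classes):
    """Force a slider value into the valid class range."""
    lowest, highest = pose_style_bounds(num_classes)
    return max(lowest, min(int(value), highest))

# 46 is the class count stated in the yaml configs.
minimum, maximum = pose_style_bounds(46)
print(minimum, maximum)          # bounds that could be given to the slider
print(clamp_pose_style(46, 46))  # an out-of-range value is pulled back to 45
```

Deriving the bound this way would keep the slider and the config in sync if the number of classes ever changes.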