Open lokvke opened 5 months ago
This looks like an index-out-of-bounds error. Can you provide more details? The code should already have converted the audio to 16 kHz and the video to 25 fps.
@lokvke, could you please attempt it using an audio file longer than 10 seconds? In my testing, it consistently fails when the provided audio is less than 8 seconds.
Hey @yerfor. I tried running with longer audio clips as well. For the same audio clip, I tried the full length (around 1 min 30 s) and a 59 s segment. Both failed with a similar error; only the index value mentioned in the message was different (but it stayed the same across multiple runs of the same clip). It did work for a sample that was around 40 s long. All samples were encoded to 16 kHz successfully, and as far as I can tell the error happens on the exact same line every time. Is there any other detail I can provide to help debug this issue?
Hi, I have the same problem. The same IndexError always appears, regardless of the length of the driving audio. How can I solve this problem?
In the inject_blink_to_lm68 function, when the generated video contains 676 frames, T=676. So when i=675 and j=1, idx=676, which is out of bounds. Here is my solution:
idx = i % (i + j)
(PS: the blinking result does not look very natural.)
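A note on the proposed fix, offered as a sketch rather than the repository's actual code: for non-negative integers with j ≥ 1, `i % (i + j)` simply evaluates to `i`, so it avoids the IndexError by re-blending frame i instead of moving on to the following frames, which might contribute to the unnatural blink. An alternative is to skip any frame index that falls past the end of the sequence. The function name `inject_blink_safe`, the `blink_interval` parameter, and the array shapes below are assumptions made for illustration; only the blend formula and the factor list are taken from this thread.

```python
import numpy as np

# Blink factors quoted in this thread (0 = eye open, 1 = eye fully closed).
BLINK_FACTORS = np.array([0.1, 0.5, 0.7, 1.0, 0.7, 0.5, 0.1])


def inject_blink_safe(lm68, closed_eye_lm68, blink_interval=75):
    """Hypothetical bounds-safe blink injection.

    lm68, closed_eye_lm68: arrays of shape [T, 68, 3] (assumed layout).
    blink_interval: frames between blink starts (assumed value, not from the repo).
    """
    lm68 = lm68.copy()  # leave the caller's array untouched
    num_frames = lm68.shape[0]
    for start in range(0, num_frames, blink_interval):
        for j, factor in enumerate(BLINK_FACTORS):
            idx = start + j
            if idx >= num_frames:
                break  # the blink runs past the last frame; drop the remainder
            # Same blend as the line in the traceback, restricted to eye landmarks 36..47.
            lm68[idx, 36:48] = (
                lm68[idx, 36:48] * (1 - factor)
                + closed_eye_lm68[idx, 36:48] * factor
            )
    return lm68


# A 676-frame sequence, the length that triggered the error in this issue.
frames = np.zeros((676, 68, 3))
closed = np.ones((676, 68, 3))
blended = inject_blink_safe(frames, closed)
print(blended.shape)  # (676, 68, 3)
```

With the assumed interval of 75, a blink starts at frame 675 and only its first factor is applied; the loop stops before reaching index 676.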
- Hi, thanks for your comment. I will include the modification you mentioned in the latest commit.
- As for the blinking results, the blink motion is controlled by the hard-coded blink_factor_lst = np.array([0.1, 0.5, 0.7, 1.0, 0.7, 0.5, 0.1]) # * 0.9 in the inject_blink_to_lm68 function. You could try different values to improve the naturalness of the eye blink.
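For anyone who wants to experiment with those values, one option is to generate the factors from a smooth curve instead of typing them by hand. The sketch below uses a raised-cosine shape, which is my own choice and not something the repository uses; `blink_curve`, `n_frames`, and `peak` are hypothetical names. Whether it actually looks more natural would have to be checked visually.

```python
import math


def blink_curve(n_frames=7, peak=1.0):
    """Raised-cosine blink factors: rise smoothly to `peak`, then fall back.

    n_frames: number of frames the blink lasts (7 matches the hard-coded list).
    peak: maximum eye closure, e.g. 0.9 to mirror the commented-out `* 0.9`.
    """
    if n_frames < 1:
        raise ValueError("n_frames must be at least 1")
    return [
        peak * 0.5 * (1.0 - math.cos(2.0 * math.pi * k / (n_frames + 1)))
        for k in range(1, n_frames + 1)
    ]


curve = blink_curve(7)
print([round(v, 2) for v in curve])
# roughly [0.15, 0.5, 0.85, 1.0, 0.85, 0.5, 0.15]
```

To try it, wrap the result in `np.array(...)` and assign it to blink_factor_lst. A longer `n_frames` gives a slower blink, and a lower `peak` gives a partial one.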
Is this issue fixed now?
| load 'model' from 'checkpoints/audio2motion_vae/model_ckpt_steps_400000.ckpt', strict=True
| WARN: checkpoints/motion2video_nerf/may_torso/lm3d_radnerf_torso.yaml not exist.
| load 'model' from 'checkpoints/motion2video_nerf/may_torso/model_ckpt_steps_250000.ckpt', strict=True
trainval: Smooth head trajectory (rotation and translation) with a window size of 7
/data/zssy-digital-human/projects/gpp/tasks/radnerfs/dataset_utils.py:263: UserWarning: To copy construct from a tensor, it is recommended to use sourceTensor.clone().detach() or sourceTensor.clone().detach().requires_grad_(True), rather than torch.tensor(sourceTensor).
  self.lm68s = torch.tensor(self.lm2ds[:, index_lm68_from_lm478, :])
Extracted wav file (16khz) from data/raw/val_wavs/8-27s.wav to data/raw/val_wavs/8-27s_16k.wav.
Loading the HuBERT Model...
/data/home/yaokj5/anaconda3/envs/geneface/lib/python3.9/site-packages/torch/_utils.py:776: UserWarning: TypedStorage is deprecated. It will be removed in the future and UntypedStorage will be the only storage class. This should only matter to you if you are using storages directly. To access UntypedStorage directly, use tensor.untyped_storage() instead of tensor.storage()
  return self.fget.__get__(instance, owner)()
Loading the Wav2Vec2 Processor...
Traceback (most recent call last):
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 542, in <module>
    GeneFace2Infer.example_run(inp)
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 490, in example_run
    infer_instance.infer_once(inp)
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 180, in infer_once
    out_name = self.forward_system(samples, inp)
  File "/data/home/yaokj5/anaconda3/envs/geneface/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 475, in forward_system
    self.forward_audio2secc(batch, inp)
  File "/data/home/yaokj5/anaconda3/envs/geneface/lib/python3.9/site-packages/torch/utils/_contextlib.py", line 115, in decorate_context
    return func(*args, **kwargs)
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 384, in forward_audio2secc
    cano_lm3d = inject_blink_to_lm68(cano_lm3d)
  File "/data/zssy-digital-human/projects/gpp/inference/genefacepp_infer.py", line 103, in inject_blink_to_lm68
    lm68[idx, 36:48] = lm68[idx, 36:48] * (1-blink_factor) + closed_eye_lm68[idx, 36:48] * blink_factor
IndexError: index 676 is out of bounds for dimension 0 with size 676