yerfor / GeneFace

GeneFace: Generalized and High-Fidelity 3D Talking Face Synthesis; ICLR 2023; Official code
MIT License

Errors in processing May.mp4 (Step 3): extract_3dmm.py, called from process_data.sh, fails to calculate 3DMM and throws several errors. #150

Open avia8abhi opened 1 year ago

avia8abhi commented 1 year ago

I am hitting errors in step 3 of GeneFace-1.1.0 when running the command "CUDA_VISIBLE_DEVICES=1 data_gen/nerf/process_data.sh $VIDEO_ID" from "process_target_person_video.md" to process the May.mp4 video. The script runs through step 8, but at step 9, which calculates 3DMM, it crashes with several errors.

yerfor commented 1 year ago

Hi, can you provide more details about the error so that I can help you out?

avia8abhi commented 1 year ago

It runs through step 8 (calculate audio features) with a few warnings and errors, but fails at step 9. Here's the exact error I get after running the "# 9. Calculate 3DMM" command given in process_data.sh.

(geneface) root@C.6305573:~/GeneFace-1.1.0$ python data_gen/nerf/extract_3dmm.py --video_id=$VIDEO_ID
loading the model from deep_3drecon/checkpoints/facerecon/epoch_20.pth
loading video ...
extracting 2D facial landmarks ...: 100%|██████████████████████████████████████████████| 6073/6073 [03:34<00:00, 28.35it/s]
start extracting 3DMM...: 0%| | 0/189 [00:00<?, ?it/s]
Traceback (most recent call last):
  File "/root/GeneFace-1.1.0/data_gen/nerf/extract_3dmm.py", line 112, in <module>
    process_video(video_fname, out_fname, skip_tmp=False)
  File "/root/GeneFace-1.1.0/data_gen/nerf/extract_3dmm.py", line 75, in process_video
    coeff, align_img = face_reconstructor.recon_coeff(batched_images, batched_lm5, return_image = True)
  File "/root/miniconda3/envs/geneface/lib/python3.9/site-packages/torch/autograd/grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "/root/GeneFace-1.1.0/deep_3drecon/reconstructor.py", line 52, in recon_coeff
    align_im, lm = self.preprocess_data(img, lm5, self.lm3d_std)
  File "/root/GeneFace-1.1.0/deep_3drecon/reconstructor.py", line 40, in preprocess_data
    _, im, lm, _ = align_img(Image.fromarray(convert_to_np(im)), convert_to_np(lm), convert_to_np(lm3d_std))
  File "/root/GeneFace-1.1.0/deep_3drecon/util/preprocess.py", line 205, in align_img
    trans_params = np.array([w0, h0, s, t[0], t[1]], dtype=np.float32)
ValueError: setting an array element with a sequence. The requested array has an inhomogeneous shape after 1 dimensions. The detected shape was (5,) + inhomogeneous part.

avia8abhi commented 1 year ago

I would appreciate your suggestions for getting past these errors, today if possible, since I have been unable to proceed for a week.

yerfor commented 1 year ago

Hi, I have not encountered this error. Could you please print w0, h0, s, t[0], and t[1] at the line that raises it:

File "/root/GeneFace-1.1.0/deep_3drecon/util/preprocess.py", line 205, in align_img
    trans_params = np.array([w0, h0, s, t[0], t[1]], dtype=np.float32)
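For anyone hitting the same ValueError: one plausible cause (an assumption, not confirmed in this thread) is that t[0] and t[1] are length-1 arrays rather than scalars, which newer NumPy releases refuse to mix with plain numbers in a single np.array call. Below is a minimal sketch of printing the values and flattening them before packing. The helper name build_trans_params and the sample values are hypothetical and not part of GeneFace.

```python
import numpy as np


def build_trans_params(w0, h0, s, t):
    """Pack alignment parameters into a flat float32 vector of length 5.

    Hypothetical helper: flattens t (and s) so that every element handed
    to np.array is a scalar, whatever shape the inputs arrived in.
    """
    t = np.asarray(t, dtype=np.float32).reshape(-1)  # (2, 1) -> (2,)
    s = float(np.asarray(s).reshape(-1)[0])          # scalar or length-1 array
    return np.array([w0, h0, s, t[0], t[1]], dtype=np.float32)


# Made-up values, chosen only to mimic a (2, 1)-shaped translation.
w0, h0, s = 512, 512, 0.75
t = np.array([[256.0], [240.0]])

# The diagnostic print requested above: show each value with its type and shape.
for name, value in [("w0", w0), ("h0", h0), ("s", s), ("t[0]", t[0]), ("t[1]", t[1])]:
    print(name, type(value).__name__, np.shape(value), value)

trans_params = build_trans_params(w0, h0, s, t)
print(trans_params.shape, trans_params.dtype)
```

If the printed shapes for t[0] and t[1] are (1,) instead of (), flattening as above (or pinning an older NumPy version) is one way around the error.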

avia8abhi commented 1 year ago

I was able to fix the error at trans_params = np.array([w0, h0, s, t[0], t[1]], dtype=np.float32). Now, in the next step, I am getting an error saying that aud_deepspeech.npy could not be found.

Here's the exact error:

(geneface) root@C.6305573:~/GeneFace-1.1.0$ python data_gen/nerf/binarizer.py --config=egs/datasets/videos/${VIDEO_ID}/lm3d_radnerf.yaml
| Unknow hparams: []
| Hparams chains: ['egs/egs_bases/radnerf/base.yaml', 'egs/egs_bases/radnerf/lm3d_radnerf.yaml', 'egs/datasets/videos/May/lm3d_radnerf.yaml']
| Hparams: accumulate_grad_batches: 1, ambient_out_dim: 2, amp: True, base_config: ['egs/egs_bases/radnerf/lm3d_radnerf.yaml'], binary_data_dir: data/binary/videos, bound: 1, camera_offset: [0, 0, 0], camera_scale: 4.0, clip_grad_norm: 0, clip_grad_value: 0, cond_out_dim: 64, cond_type: idexp_lm3d_normalized, cond_win_size: 1, cuda_ray: True, debug: False, density_thresh: 10, density_thresh_torso: 0.01, desired_resolution: 2048, dt_gamma: 0.00390625, eval_max_batches: 100, exp_name: , far: 0.9, finetune_lips: True, finetune_lips_start_iter: 200000, geo_feat_dim: 128, grid_interpolation_type: linear, grid_size: 128, grid_type: tiledgrid, gui_fovy: 21.24, gui_h: 512, gui_max_spp: 1, gui_radius: 3.35, gui_w: 512, hidden_dim_ambient: 128, hidden_dim_color: 128, hidden_dim_sigma: 128, individual_embedding_dim: 4, individual_embedding_num: 13000, infer: False, infer_audio_source_name: , infer_bg_img_fname: , infer_c2w_name: , infer_cond_name: , infer_lm3d_clamp_std: 2.5, infer_lm3d_lle_percent: 0.0, infer_lm3d_smooth_sigma: 0.0, infer_out_video_name: , infer_scale_factor: 1.0, infer_smo_std: 0.0, infer_smooth_camera_path: True, infer_smooth_camera_path_kernel_size: 7, lambda_ambient: 0.1, lambda_lpips_loss: 0.01, lambda_weights_entropy: 0.0001, load_ckpt: , load_imgs_to_memory: False, log2_hashmap_size: 16, lr: 0.0005, max_ray_batch: 4096, max_steps: 16, max_updates: 250000, min_near: 0.05, n_rays: 65536, near: 0.3, num_ckpt_keep: 1, num_layers_ambient: 3, num_layers_color: 2, num_layers_sigma: 3, num_sanity_val_steps: 2, num_steps: 16, num_valid_plots: 5, optimizer_adam_beta1: 0.9, optimizer_adam_beta2: 0.999, print_nan_grads: False, processed_data_dir: data/processed/videos, raw_data_dir: data/raw/videos, resume_from_checkpoint: 0, save_best: True, save_codes: ['tasks', 'modules', 'egs'], save_gt: True, scheduler: exponential, seed: 9999, smo_win_size: 5, smooth_lips: False, task_cls: tasks.radnerfs.radnerf.RADNeRFTask, tb_log_interval: 100, torso_head_aware: False, torso_individual_embedding_dim: 8, torso_shrink: 0.8, update_extra_interval: 16, upsample_steps: 0, use_window_cond: True, val_check_interval: 2000, valid_infer_interval: 10000, valid_monitor_key: val_loss, valid_monitor_mode: min, validate: False, video_id: May, warmup_updates: 0, weight_decay: 0, with_att: True, work_dir: ,
loading deepspeech ...
Traceback (most recent call last):
  File "/root/GeneFace-1.1.0/data_gen/nerf/binarizer.py", line 277, in <module>
    binarizer.parse(hparams['video_id'])
  File "/root/GeneFace-1.1.0/data_gen/nerf/binarizer.py", line 267, in parse
    ret = load_processed_data(processed_dir)
  File "/root/GeneFace-1.1.0/data_gen/nerf/binarizer.py", line 86, in load_processed_data
    deepspeech_features = np.load(deepspeech_npy_name)
  File "/root/miniconda3/envs/geneface/lib/python3.9/site-packages/numpy/lib/npyio.py", line 405, in load
    fid = stack.enter_context(open(os_fspath(file), "rb"))
FileNotFoundError: [Errno 2] No such file or directory: 'data/processed/videos/May/aud_deepspeech.npy'

yerfor commented 1 year ago

Hi, this seems to be caused by a failed download of the DeepSpeech checkpoint, so process.sh simply skipped the DeepSpeech feature extraction. You could rerun the DeepSpeech extraction step with the following commands; the code will then download the DeepSpeech checkpoint automatically.

conda activate geneface
CUDA_VISIBLE_DEVICES=0 python data_util/process.py --video_id=May --task=2
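After rerunning the extraction, it may be worth confirming that the feature file actually exists before starting the binarizer again. A minimal sketch of such a check follows; the helper name missing_features is hypothetical (not part of GeneFace), and the only file name taken from this thread is aud_deepspeech.npy from the FileNotFoundError above.

```python
from pathlib import Path


def missing_features(processed_dir, required=("aud_deepspeech.npy",)):
    """Return the names of required feature files absent from processed_dir.

    Hypothetical helper; extend `required` if your pipeline needs more files.
    """
    processed_dir = Path(processed_dir)
    return [name for name in required if not (processed_dir / name).is_file()]


# Directory taken from the error message above; adjust for your video_id.
missing = missing_features("data/processed/videos/May")
if missing:
    print("Missing feature files:", ", ".join(missing))
else:
    print("All required feature files are present.")
```

An empty list means the binarizer should at least get past the np.load call that failed above.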

Apologies for the late reply; I have been engaged in a new project. We plan to upload the code of GeneFace++ in August, and the reported bugs will be fixed then.

avia8abhi commented 1 year ago

No problem, I understand. Glad to hear that GeneFace++ will be out too! Step 3 now runs properly, but I am facing a few errors while executing the final command, bash scripts/infer_lm3d_radnerf.sh, in step 4. I will post the errors in a few hours. Many thanks for your support!

yerfor commented 1 year ago

Can you also post the error log here?