USTC3DV / FlashAvatar-code

[CVPR 2024] The official repo for FlashAvatar
MIT License
163 stars 21 forks

Any suggestions for better novel view synthesis results #32

Open why986 opened 2 months ago

why986 commented 2 months ago

I tested the model you provided, and found the performance of novel view synthesis unsatisfactory. As shown in the video below, I rotated the avatar from -60° to 60°.

[video: test]

I am wondering if you have any advice for improving this. Thank you for your help!

LSliu666 commented 1 month ago

Hello, may I ask how to drive the avatar from a new viewing angle?

LSliu666 commented 1 month ago

@why986

why986 commented 1 month ago

Hi @LSliu666, I modified the code in test.py.

        # Sweep the camera from -60° to +60° about the vertical (y) axis,
        # one step per test frame.
        if train_type == 1 and rotate_camera:
            angles = np.linspace(np.deg2rad(-60), np.deg2rad(60), range_up - range_down)
        for frame_id in tqdm(range(range_down, range_up)):
            image_name_mica = str(frame_id).zfill(5)  # obey MICA tracking naming
            image_name_ori = str(frame_id + frame_delta).zfill(5)
            ckpt_path = os.path.join(mica_ckpt_dir, image_name_mica + '.frame')
            payload = torch.load(ckpt_path)

            # FLAME expression and pose parameters from the MICA tracking result
            flame_params = payload['flame']
            exp_param = torch.as_tensor(flame_params['exp'])
            eyes_pose = torch.as_tensor(flame_params['eyes'])
            eyelids = torch.as_tensor(flame_params['eyelids'])
            jaw_pose = torch.as_tensor(flame_params['jaw'])

            # World-to-camera extrinsics; pre-multiply the rotation by a
            # y-axis rotation to orbit the camera around the avatar.
            opencv = payload['opencv']
            w2cR = opencv['R'][0]
            w2cT = opencv['t'][0]
            if train_type == 1 and rotate_camera:
                R_theta = rotation_matrix([0, 1, 0], angles[frame_id - range_down])
                w2cR = R_theta @ w2cR
            R = np.transpose(w2cR)  # R is stored transposed due to 'glm' in the CUDA code
            T = w2cT

You can have a try.
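For reference, the snippet above calls a `rotation_matrix(axis, angle)` helper that is not shown. A minimal sketch of such a helper, assuming it returns the 3x3 matrix for a rotation of `angle` radians about a unit `axis` (the name and signature are inferred from the call site, and the actual helper in the repo may differ), is Rodrigues' formula:

```python
import numpy as np

def rotation_matrix(axis, angle):
    """Return the 3x3 matrix rotating by `angle` radians about `axis`."""
    axis = np.asarray(axis, dtype=np.float64)
    axis = axis / np.linalg.norm(axis)  # normalize the rotation axis
    x, y, z = axis
    # Skew-symmetric cross-product matrix of the axis
    K = np.array([[0.0, -z, y],
                  [z, 0.0, -x],
                  [-y, x, 0.0]])
    # Rodrigues' formula: R = I + sin(a) K + (1 - cos(a)) K^2
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)
```

For example, `rotation_matrix([0, 1, 0], angles[i]) @ w2cR` pre-rotates the world-to-camera rotation about the vertical axis, which orbits the camera around the avatar.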

LSliu666 commented 1 month ago

Can the trained model perform other actions? @why986 Have the authors done any research on this?

xiangjun-xj commented 2 days ago

Yes, one way is to add more images of large views to the training set. (It is not actually a generative model and cannot synthesize large novel views from front-view inputs alone.) We have recently been studying this challenging task and have made some progress. We will release the paper soon.

why986 commented 2 days ago

> Yes, one way is to add more images of large views to the training set. (It is not actually a generative model and cannot synthesize large novel views from front-view inputs alone.) We have recently been studying this challenging task and have made some progress. We will release the paper soon.

Glad to see your progress! I am looking forward to your new paper.

LSliu666 commented 2 days ago

Hello, I would like to ask: suppose there are characters A and B, and actions 1 and 2, with models trained on A1, A2, B1, and B2 respectively. Can the trained A1 model drive the actions of A2, or can A1 drive B2? Can A1 even drive an untrained action A3, or C3? @xiangjun-xj