NetEase-GameAI / Face2FaceRHO

The Official PyTorch Implementation for Face2Face^ρ (ECCV 2022)
BSD 3-Clause "New" or "Revised" License

Face2Face^ρ: Official PyTorch Implementation

Environment

Training data

Our framework relies on a large video dataset containing many identities, such as VoxCeleb. For each video frame, the following data is required: the frame image, the landmark coordinates, the head pose coefficients, and the face mask.

The pre-processed data should be organized as follows (an example dataset containing two video sequences is provided in ./dataset/VoxCeleb):

   - dataset
       - <dataset_name>
           - list.txt                             ---list of all videos 
           - id10001#7w0IBEWc9Qw#000993#001143    ---video folder 1 (should be named as <person_id>#<video_id>)
               - img                              ---video frame
                   - 1.jpg
                   - 2.jpg
                   - ...
               - landmark                         ---landmark coordinates for each frame 
                   - 1.txt
                   - 2.txt
                   - ...
               - headpose                         ---head pose coefficients for each frame 
                   - 1.txt
                   - 2.txt
                   - ...
               - mask                             ---face mask for each frame 
                   - 1.png
                   - 2.png
                   - ...
           - id10009#AtavJVP4bCk#012568#012652    ---video folder 2
               ...
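
Before training, it can help to verify that a pre-processed dataset actually follows the layout above. The snippet below is a minimal sketch and is not part of the official code base. The helper names `check_video_folder` and `check_dataset` are hypothetical. It assumes that every sub-directory of `<dataset_name>` is a video folder, and that each frame `N` has one file in each of `img`, `landmark`, `headpose`, and `mask`. It does not read `list.txt` or inspect file contents, since their formats are not described here.

```python
from pathlib import Path

# Sub-folder name -> file extension, taken from the layout shown above.
EXPECTED = {"img": ".jpg", "landmark": ".txt", "headpose": ".txt", "mask": ".png"}


def check_video_folder(video_dir):
    """Return a list of human-readable problems found in one video folder."""
    video_dir = Path(video_dir)
    problems = []
    stems = {}
    for sub, ext in EXPECTED.items():
        sub_dir = video_dir / sub
        if not sub_dir.is_dir():
            problems.append(f"missing sub-folder: {sub}")
            continue
        # Collect frame indices (file names without extension) for this sub-folder.
        stems[sub] = {p.stem for p in sub_dir.iterdir() if p.suffix == ext}
    if "img" in stems:
        # Assumption: every frame image needs a matching landmark, headpose and mask file.
        for sub, found in stems.items():
            missing = stems["img"] - found
            if missing:
                problems.append(f"{sub}: no file for frame(s) {sorted(missing)}")
    return problems


def check_dataset(dataset_dir):
    """Check every sub-directory of <dataset_name>, treating each as a video folder."""
    dataset_dir = Path(dataset_dir)
    return {
        d.name: check_video_folder(d)
        for d in sorted(dataset_dir.iterdir())
        if d.is_dir()
    }
```

For example, `check_dataset("./dataset/VoxCeleb")` returns a dictionary that maps each video folder name to its list of problems. An empty list means the folder matches the assumed layout.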

Training

Testing

Note that the quality of the results may deteriorate when the DECA 3DMM fitting algorithm is used: our original 3DMM fitting algorithm is more stable and robust than DECA, and the 72 pre-configured keypoints on the FLAME mesh template differ slightly from our original configuration.