I have some questions about inverting the StyleGAN-V generator on the FaceForensics dataset.
In the case of a single image (one frame), projection works well.
However, when I tried to project a video (multiple frames at once),
I found that the projected video contains almost identical frames across all time steps.
Is this expected behavior?
To project a video (16 frames in my case), I changed some code in `src/scripts/project.py` as follows:
1. Adjust the time steps (0 to 16 frames). In line 59, change

   ```python
   ts = torch.zeros(num_videos, 1, device=device)
   ```

   to

   ```python
   ts = torch.arange(num_videos, 16, device=device)
   ```
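Spelled out as a minimal, self-contained sketch, what I intend here is one time step per frame for each video in the batch (the `[num_videos, num_frames]` shape is my assumption, extrapolated from the original `torch.zeros(num_videos, 1)` call):

```python
import torch

num_videos, num_frames = 1, 16  # assumed batch size and clip length
device = "cpu"

# One time step per frame, repeated for every video in the batch:
# shape [num_videos, num_frames], values 0..15.
ts = torch.arange(num_frames, device=device).unsqueeze(0).repeat(num_videos, 1)
print(ts.shape)  # torch.Size([1, 16])
```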
2. Make the motion code trainable (comment out line 110 and uncomment line 109).
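In effect, this should amount to something like the following sketch (the variable name `motion_code` and its shape are my assumptions, not the repo's exact code):

```python
import torch

# Hypothetical stand-in for the sampled motion code (name and shape assumed).
motion_code = torch.randn(1, 512)

# Detach from any prior graph and mark it trainable, so the optimizer
# updates it together with the w latent during projection.
motion_code = motion_code.detach().requires_grad_(True)

optimizer = torch.optim.Adam([motion_code], lr=0.1)
```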
3. Extract `target_features` of the real video per frame, and measure the distance between videos rather than between individual frames. For example, in line 140:

   ```python
   dist = (target_features - synth_features).square().sum()
   ```

   Here the batch dimension of `target_features` and `synth_features` holds the 16 frames of a single video, not different images as in the original code.
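As a self-contained sketch of the distance I compute (the feature shapes are my assumptions; in the real script the features come from the perceptual network):

```python
import torch

num_frames, feat_dim = 16, 128  # assumed: 16 frames, 128 feature channels

# Per-frame features of the real and the synthesized video; the batch
# dimension holds the 16 frames of ONE video, not unrelated images.
target_features = torch.randn(num_frames, feat_dim)
synth_features = torch.randn(num_frames, feat_dim)

# Squared L2 distance summed over frames and channels: one scalar
# loss per video instead of one loss per frame.
dist = (target_features - synth_features).square().sum()
print(dist.shape)  # torch.Size([])
```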
Thanks,