facebookresearch / multiface

Hosts the Multiface dataset, which is a multi-view dataset of multiple identities performing a sequence of facial expressions.

aligning meshes with images #33

Closed treder closed 1 year ago

treder commented 1 year ago

I want to bring the meshes into image space, but the results are still a bit off. As an example, I will show images from m--20190828--1318--002645310--GHS/images/EXP_free_face:

[Figure: meta_images2_only]

The following transformations are available:

The following figure shows the vertices:

[Figure: mesh12]

The next figure shows the operations on the vertices v after applying the intrinsics K and the extrinsics RT. Note that the vertex coordinates are homogeneous (a column of 1's is appended) and the matrices are extended/transposed accordingly.

[Figure: mesh345]

Now we can take the meshes from the last row and superimpose them on the images to see the match. The match is noticeably off, and the same holds for another example further down. So I have the following questions:

  1. Is the computation correct or am I missing something?
  2. Can you provide the correct formula?
  3. If you have a function that performs [vertices_in_world_space, K, RT] -> [vertices_in_image_space], do you mind sharing it?

Thanks!

[Figures: meta_images2, meta_mesh_and_images]

vexilligera commented 1 year ago

Hi, the problem in your case might be in the shifting step. KRT already projects 3D points into 2D image space.

You may refer to the code in the data loader:

transf = np.genfromtxt(
    "{}/{}/{}_transform.txt".format(self.meshpath, sentnum, frame)
)
R_f = transf[:3, :3]
t_f = transf[:3, 3]
campos = np.dot(R_f.T, self.campos[cam] - t_f).astype(np.float32)
view = campos / np.linalg.norm(campos)

extrin, intrin = self.krt[cam]["extrin"], self.krt[cam]["intrin"]
R_C = extrin[:3, :3]
t_C = extrin[:3, 3]
camrot = np.dot(R_C, R_f).astype(np.float32)
camt = np.dot(R_C, t_f) + t_C
camt = camt.astype(np.float32)

M = intrin @ np.hstack((camrot, camt[None].T))

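Then use the matrix M with the denormalized vertex coordinates in homogeneous coordinates, e.g.,

# vert_homo has shape (N, 4) and M has shape (3, 4)
projected = vert_homo @ M.T
# divide by the last column (depth); the slice -1: keeps the shapes broadcastable
projected_2d = projected / projected[:, -1:]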

This should give you 2D locations in the original image resolution (2048 x 1334), which looks like this:

[Figure: example result]
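
For reference, the steps in the data-loader snippet above can be wrapped into two small helpers. This is a minimal sketch, not code from the repository: the function names `compose_projection` and `project_vertices` are hypothetical, and the intrinsics, extrinsics, head transform and vertices are synthetic values chosen only to make the arithmetic easy to check.

```python
import numpy as np


def compose_projection(intrin, extrin, transf):
    """Build the 3x4 matrix M = K [R_C R_f | R_C t_f + t_C].

    intrin: (3, 3) camera intrinsics K
    extrin: (3, 4) camera extrinsics [R_C | t_C]
    transf: (3, 4) or (4, 4) per-frame head transform [R_f | t_f]
    """
    R_f, t_f = transf[:3, :3], transf[:3, 3]
    R_C, t_C = extrin[:3, :3], extrin[:3, 3]
    camrot = R_C @ R_f          # head rotation followed by camera rotation
    camt = R_C @ t_f + t_C      # head translation expressed in the camera frame
    return intrin @ np.hstack((camrot, camt[:, None]))


def project_vertices(verts, M):
    """Project (N, 3) vertices to (N, 2) pixel coordinates with a 3x4 matrix M."""
    vert_homo = np.concatenate((verts, np.ones((verts.shape[0], 1))), axis=1)
    projected = vert_homo @ M.T                   # (N, 3): [x * z, y * z, z]
    return projected[:, :2] / projected[:, 2:3]   # perspective divide by depth


# Synthetic inputs (assumed values, not taken from the dataset).
intrin = np.array([[1000.0, 0.0, 667.0],
                   [0.0, 1000.0, 1024.0],
                   [0.0, 0.0, 1.0]])
extrin = np.hstack((np.eye(3), np.array([[0.0], [0.0], [1000.0]])))
transf = np.hstack((np.eye(3), np.array([[10.0], [0.0], [0.0]])))
verts = np.array([[0.0, 0.0, 0.0],
                  [50.0, -20.0, 100.0]])

M = compose_projection(intrin, extrin, transf)
pixels = project_vertices(verts, M)
# With these inputs the first vertex projects to x = 677, y = 1024.
```

Because M already contains the intrinsics, no additional shift by the principal point is needed after the perspective divide.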

treder commented 1 year ago

Thanks for the quick reply. You are right, there must have been some error with the shift; your code works. I made a minimal reproducible example (for future reference) using your data loader:

import numpy as np
import matplotlib.pyplot as plt
from dataset import Dataset

# instantiate dataset and get sample
data_dir = 'PATH/TO/m--20190828--1318--002645310--GHS/'
multiface = Dataset(
    data_dir,
    krt_dir=data_dir + 'KRT',
    framelistpath=data_dir + 'frame_list_selected.txt',
)
sample = multiface[11]

Here, frame_list_selected.txt is my selection of frames, and the sampled frame is 003251.

# get verts and denormalize
M = sample['M']
verts = sample['aligned_verts']

verts *= multiface.vertstd
verts += multiface.vertmean.reshape((-1, 3))

# homogeneous coordinates and project
vert_homo = np.concatenate((verts, np.ones((verts.shape[0], 1))), axis=1)

projected = vert_homo @ M.T
projected_2d = projected / projected[:, -1:]

# plot
fig = plt.figure(figsize=(20, 12), dpi=100, facecolor='w', edgecolor='k')

plt.imshow(sample['photo'])
plt.plot(projected_2d[:,0], projected_2d[:,1], 'r.', markersize=1)
[Figure: projected vertices plotted in red over the photo]
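
A quick numerical check can complement the visual overlay when the alignment looks suspicious. The sketch below computes the fraction of projected vertices that land inside the image bounds. The helper name `fraction_in_image` is hypothetical, the points are synthetic, and treating 1334 as the width and 2048 as the height is an assumption about how the 2048 x 1334 resolution mentioned above is laid out.

```python
import numpy as np


def fraction_in_image(projected_2d, width, height):
    """Return the fraction of 2D points (x, y) that fall inside the image."""
    x, y = projected_2d[:, 0], projected_2d[:, 1]
    inside = (x >= 0) & (x < width) & (y >= 0) & (y < height)
    return float(inside.mean())


# Synthetic points: two inside the image, two outside.
pts = np.array([[10.0, 20.0],
                [1400.0, 100.0],   # x beyond the assumed width
                [-5.0, 50.0],      # negative x
                [600.0, 2000.0]])
frac = fraction_in_image(pts, width=1334, height=2048)
```

A value well below 1 for a face that is fully visible in the photo suggests a problem in the projection, such as an extra shift or a missing denormalization step.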