kyotovision-public / dynamic-3d-gaze-from-afar

In-the-wild data inference code #7

Closed iamshushu closed 2 years ago

iamshushu commented 2 years ago

Hi there! Thanks for your outstanding work; I am really interested in it. It appears that only demo code for the GAFA dataset is available. How can I apply the model to my own in-the-wild dataset? Do you have any plans to release the code? As I am a newbie in the AI field, I would be very grateful for some guidance. Thank you.

SomaNonaka commented 2 years ago

Unfortunately, we don't provide preprocessing code for other datasets. Please prepare the following on your own:

1. cropped body images,
2. a binary mask indicating the head position, and
3. the body center shift between frames.

Once these are prepared, you should be able to run our model the same way as in demo.ipynb with the snippets below.
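
As a rough illustration, YOUR_DATA could be assembled like this. The tensor shapes and the crop size here are assumptions based on the demo's layout, so please verify them against demo.ipynb and the GAFA dataloader:

import torch

# Hypothetical input layout (check demo.ipynb for the exact shapes):
#   image:     (n_frames, 3, H, W) cropped, normalized body images
#   head_mask: (n_frames, 1, H, W) binary mask marking the head region
#   body_dv:   (n_frames, 2) shift of the body-crop center between frames
n_frames, H, W = 7, 256, 192  # crop size is a placeholder

YOUR_DATA = {
    'image': torch.randn(n_frames, 3, H, W),      # replace with real crops
    'head_mask': torch.zeros(n_frames, 1, H, W),  # 1 inside the head box, 0 elsewhere
    'body_dv': torch.zeros(n_frames, 2),          # per-frame (dx, dy) shift
}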

import torch

# Build the model for 7-frame input sequences and load the GAFA-pretrained
# weights (GazeNet is defined in this repository; see demo.ipynb for the import).
model = GazeNet(n_frames=7)
model.load_state_dict(torch.load('./models/weights/gazenet_GAFA.pth', map_location='cpu')['state_dict'])
model.cuda()
model.eval()  # inference mode: use running batch-norm stats, disable dropout

# The three prepared inputs described above.
image, head_mask, body_dv = YOUR_DATA['image'], YOUR_DATA['head_mask'], YOUR_DATA['body_dv']

with torch.no_grad():
    # unsqueeze(0) adds the batch dimension the model expects.
    image = image.cuda().unsqueeze(0)
    head_mask = head_mask.cuda().unsqueeze(0)
    body_dv = body_dv.cuda().unsqueeze(0)
    gaze_res, head_res, body_res = model(image, head_mask, body_dv)

# [0] drops the batch dimension: 'direction' holds the per-frame gaze vectors,
# 'kappa' the corresponding confidence.
gaze_direction = gaze_res['direction'][0].cpu()
gaze_confidence = gaze_res['kappa'][0].cpu()
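
If it helps, here is a hedged post-processing sketch that continues the snippet above. The paper models gaze direction with a von Mises-Fisher distribution, so kappa should behave as a concentration (higher means more confident); the yaw/pitch conversion and the 1/sqrt(kappa) spread approximation below are standard formulas rather than code from this repo, and the axis convention is an assumption:

def to_yaw_pitch(direction):
    # direction: (n_frames, 3) unit gaze vectors; assumes a camera-style
    # frame with z pointing forward and y pointing down.
    x, y, z = direction.unbind(dim=-1)
    yaw = torch.atan2(x, z)
    pitch = torch.asin(y.clamp(-1.0, 1.0))
    return yaw, pitch

yaw, pitch = to_yaw_pitch(gaze_direction)

# For a von Mises-Fisher distribution with large kappa, the angular spread is
# roughly 1 / sqrt(kappa) radians, giving a per-frame uncertainty estimate.
angular_std = 1.0 / gaze_confidence.clamp(min=1e-6).sqrt()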