kyotovision-public / dynamic-3d-gaze-from-afar

In-the-wild data inference code #7

Closed iamshushu closed 2 years ago

iamshushu commented 2 years ago

Hi there! Thanks for your outstanding work; I am really interested in it. It appears that only demo code for the GAFA dataset is available. How can I apply the model to my own in-the-wild dataset? Do you have any plans to release the code? As I am a newbie in the AI field, I would be very grateful for some guidance. Thank you.

SomaNonaka commented 2 years ago

Unfortunately, we don't provide preprocessing code for other datasets. Please prepare the following on your own:

1. cropped body images,
2. a binary mask indicating the head position, and
3. the body center shift between frames.

Once these are prepared, you should be able to run our model the same way as in demo.ipynb with the snippets below.
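
As a rough illustration, YOUR_DATA could be assembled like this. The tensor shapes and the crop size here are assumptions based on the demo's layout, so please verify them against demo.ipynb and the GAFA dataloader:

import torch

# Hypothetical input layout (check demo.ipynb for the exact shapes):
#   image:     (n_frames, 3, H, W) cropped, normalized body images
#   head_mask: (n_frames, 1, H, W) binary mask marking the head region
#   body_dv:   (n_frames, 2) shift of the body-crop center between frames
n_frames, H, W = 7, 256, 192  # crop size is a placeholder

YOUR_DATA = {
    'image': torch.randn(n_frames, 3, H, W),      # replace with real crops
    'head_mask': torch.zeros(n_frames, 1, H, W),  # 1 inside the head box, 0 elsewhere
    'body_dv': torch.zeros(n_frames, 2),          # per-frame (dx, dy) shift
}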

import torch

# Build the model for 7-frame input sequences and load the GAFA-pretrained
# weights (GazeNet is defined in this repository; see demo.ipynb for the import).
model = GazeNet(n_frames=7)
model.load_state_dict(torch.load('./models/weights/gazenet_GAFA.pth', map_location='cpu')['state_dict'])
model.cuda()
model.eval()  # inference mode: use running batch-norm stats, disable dropout

# The three prepared inputs described above.
image, head_mask, body_dv = YOUR_DATA['image'], YOUR_DATA['head_mask'], YOUR_DATA['body_dv']

with torch.no_grad():
    # unsqueeze(0) adds the batch dimension the model expects.
    image = image.cuda().unsqueeze(0)
    head_mask = head_mask.cuda().unsqueeze(0)
    body_dv = body_dv.cuda().unsqueeze(0)
    gaze_res, head_res, body_res = model(image, head_mask, body_dv)

# [0] drops the batch dimension: 'direction' holds the per-frame gaze vectors,
# 'kappa' the corresponding confidence.
gaze_direction = gaze_res['direction'][0].cpu()
gaze_confidence = gaze_res['kappa'][0].cpu()
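
If it helps, here is a hedged post-processing sketch that continues the snippet above. The paper models gaze direction with a von Mises-Fisher distribution, so kappa should behave as a concentration (higher means more confident); the yaw/pitch conversion and the 1/sqrt(kappa) spread approximation below are standard formulas rather than code from this repo, and the axis convention is an assumption:

def to_yaw_pitch(direction):
    # direction: (n_frames, 3) unit gaze vectors; assumes a camera-style
    # frame with z pointing forward and y pointing down.
    x, y, z = direction.unbind(dim=-1)
    yaw = torch.atan2(x, z)
    pitch = torch.asin(y.clamp(-1.0, 1.0))
    return yaw, pitch

yaw, pitch = to_yaw_pitch(gaze_direction)

# For a von Mises-Fisher distribution with large kappa, the angular spread is
# roughly 1 / sqrt(kappa) radians, giving a per-frame uncertainty estimate.
angular_std = 1.0 / gaze_confidence.clamp(min=1e-6).sqrt()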