BolinLai / GLC

[BMVC2022, IJCV2023, Best Student Paper, Spotlight] Official codes for the paper "In the Eye of Transformer: Global-Local Correlation for Egocentric Gaze Estimation".

Bug in decoder.py: frames_idx list returned by decoder is wrong #9

Open ppalasek opened 2 months ago

ppalasek commented 2 months ago

Hi @BolinLai,

thank you for sharing your code!

I've been trying to train your model on a custom dataset, and I've noticed that the frame indices being returned by the decoder were not correct. As these are used to index into the ground truth gaze positions during training, this leads to training on wrong labels.

The line where the frame_idx list is created is here: https://github.com/BolinLai/GLC/blob/main/slowfast/datasets/decoder.py#L298C9-L298C19

I'm using PyAV version 12.0.0, which was giving me the following deprecation warning: "AVDeprecationWarning: Using frame.index is deprecated."

It might be caused by the use of seek in https://github.com/BolinLai/GLC/blob/main/slowfast/datasets/decoder.py#L93

See e.g. https://github.com/PyAV-Org/PyAV/issues/33

The workaround I did was to replace

frames_idx = torch.tensor([frame.index for frame in video_frames])

with

frames_idx = torch.tensor([int(frame.pts / timebase) for frame in video_frames])

However, I'm not sure this is a general solution to the problem.
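
For anyone who wants to check the arithmetic behind this workaround, here is a minimal standalone sketch. It assumes a constant frame rate and that `timebase` in the line above means "pts units per frame" (the issue text does not define it, so that is my reading). The function name `pts_to_frame_index` and the `start_pts` parameter are hypothetical and not part of GLC or PyAV. It rounds instead of truncating, because a pts that lands slightly below a multiple of the frame spacing would otherwise map to the previous frame.

```python
from fractions import Fraction


def pts_to_frame_index(pts, pts_per_frame, start_pts=0):
    """Map a presentation timestamp to a zero-based frame index.

    Hypothetical helper, assuming a constant frame rate:
      pts           -- presentation timestamp of the frame (stream time-base units)
      pts_per_frame -- pts units between consecutive frames (assumed meaning
                       of `timebase` in the workaround above)
      start_pts     -- pts of the first frame, if the stream does not start at 0
    """
    if pts_per_frame <= 0:
        raise ValueError("pts_per_frame must be positive")
    # Exact rational division avoids floating-point error; round() then picks
    # the nearest frame instead of truncating toward zero.
    return int(round(Fraction(pts - start_pts) / Fraction(pts_per_frame)))


# Example: frames spaced 512 pts units apart.
indices = [pts_to_frame_index(p, 512) for p in (0, 512, 1024, 1536)]

# A pts one unit short of a frame boundary: truncation and rounding disagree.
truncated = int(1023 / 512)
rounded = pts_to_frame_index(1023, 512)
```

Whether rounding is the right choice depends on how the video was encoded. For variable frame rate videos neither version gives a reliable index, so this is only a sanity check for the constant frame rate case.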

Posting here in case someone has the same issue.

Best, Petar