EducationalTestingService / Confero

Eye-tracking, Screen and Event Capturing System for Windows. A web application running on a separate PC allows for real-time monitoring of the user's actions.
GNU General Public License v3.0

SSA generation: missed video frames won't generate gaze data #105

Closed garyfeng closed 9 years ago

garyfeng commented 9 years ago

In some of our studies we ran into issues where the SSA is much shorter and more sporadic than the gaze data available in the HDF5 database. The problem has to do with how we construct the SSA file, which is by video frames. However, it appears that in some cases the video frames are irregular -- i.e., instead of 30 fps (or whatever we set it to) there may be minutes without a key frame. For example:

```
Dialogue:0,0:08:50.16,0:08:50.19,Default,,0000,0000,0000,,{\pos(3026,1182)\an5}+
Dialogue:0,0:08:50.19,0:13:53.36,Default,,0000,0000,0000,,{\pos(3082,1207)\an5}+
Dialogue:0,0:13:53.36,0:13:53.39,Default,,0000,0000,0000,,{\pos(2600,1025)\an5}+
Dialogue:0,0:13:53.39,0:13:53.43,Default,,0000,0000,0000,,{\pos(2131,844)\an5}+
Dialogue:0,0:13:53.43,0:13:53.46,Default,,0000,0000,0000,,{\pos(1658,662)\an5}+
```
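A quick way to confirm these irregular frames is to parse the Dialogue timestamps and flag entries whose duration exceeds the expected frame interval. A minimal Python sketch (the helper names are mine, not part of Confero):

```python
import re

# SSA timestamps look like H:MM:SS.cc (centiseconds).
TIME = r"(\d+):(\d{2}):(\d{2})\.(\d{2})"

def ssa_seconds(ts):
    """Convert an SSA timestamp like '0:08:50.16' to seconds."""
    h, m, s, cs = map(int, re.match(TIME, ts).groups())
    return h * 3600 + m * 60 + s + cs / 100.0

def find_gaps(dialogue_lines, max_gap=1.0):
    """Return (start, end, gap_seconds) for suspiciously long entries."""
    gaps = []
    for line in dialogue_lines:
        fields = line.split(",", 3)  # 'Dialogue:0', start, end, rest
        start, end = fields[1], fields[2]
        gap = ssa_seconds(end) - ssa_seconds(start)
        if gap > max_gap:
            gaps.append((start, end, gap))
    return gaps

lines = [
    "Dialogue:0,0:08:50.16,0:08:50.19,Default,,0000,0000,0000,,{}",
    "Dialogue:0,0:08:50.19,0:13:53.36,Default,,0000,0000,0000,,{}",
]
print(find_gaps(lines))  # only the second entry, spanning ~303 seconds
```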

In another case we had 40 minutes of eye-tracking data, but the last video frame detected by CV was at 24 minutes. The video plays in full in VLC, but no gaze is shown after the 24-minute mark because gaze data after the last frame never made it into the SSA.

Since SSA is actually frame-independent, we can simply generate the SSA entries based on the fixations, regardless of the video frames.
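A sketch of what fixation-driven generation could look like, assuming fixation records with start/end times (seconds from video onset) and an averaged gaze position in video pixels (the field names are hypothetical, not Confero's actual schema):

```python
def to_ssa_time(t):
    """Format seconds as an SSA timestamp (H:MM:SS.cc)."""
    h, rem = divmod(t, 3600)
    m, s = divmod(rem, 60)
    return "%d:%02d:%05.2f" % (h, m, s)

def fixation_to_dialogue(fix):
    """Build one SSA Dialogue line from a fixation record.

    `fix` is a hypothetical dict: 'start'/'end' in seconds,
    'x'/'y' the averaged gaze position in video pixel coordinates.
    """
    return ("Dialogue:0,%s,%s,Default,,0000,0000,0000,,"
            "{\\pos(%d,%d)\\an5}+" % (
                to_ssa_time(fix["start"]), to_ssa_time(fix["end"]),
                fix["x"], fix["y"]))

print(fixation_to_dialogue(
    {"start": 530.16, "end": 530.19, "x": 3026, "y": 1182}))
# → Dialogue:0,0:08:50.16,0:08:50.19,Default,,0000,0000,0000,,{\pos(3026,1182)\an5}+
```

Each fixation maps to exactly one Dialogue line, so the SSA duration tracks the eye-tracking session rather than the detected video frames.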

garyfeng commented 9 years ago

Two thoughts on bypassing the video frames (BTW, extracting the frames of a 40-minute video can take several minutes):

1). Use the fixation events: generate SSA entries directly from fixationStart and fixationEnd events. Clean and straightforward. We could even produce animations using ASS tags (http://docs.aegisub.org/3.1/ASS_Tags/), e.g., enlarging the gaze cursor with fixation duration.

2). Use the samples. We would probably need to compress them, say by averaging every 200 msec or so. But if we can do #1, why bother with this?
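For completeness, the sample-compression idea could be as simple as binning raw samples into fixed windows and averaging positions. The sample tuple layout (time in seconds, gaze_x, gaze_y) is an assumption here, not Confero's schema:

```python
def bin_samples(samples, win=0.2):
    """Average gaze samples in fixed windows (default 200 ms).

    Yields (bin_start, bin_end, mean_x, mean_y) per non-empty window.
    """
    bins = {}
    for t, x, y in samples:
        bins.setdefault(int(t // win), []).append((x, y))
    for k in sorted(bins):
        pts = bins[k]
        mx = sum(p[0] for p in pts) / len(pts)
        my = sum(p[1] for p in pts) / len(pts)
        yield (k * win, (k + 1) * win, mx, my)

samples = [(0.01, 100, 200), (0.05, 110, 210), (0.25, 300, 400)]
print(list(bin_samples(samples)))
# → [(0.0, 0.2, 105.0, 205.0), (0.2, 0.4, 300.0, 400.0)]
```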

garyfeng commented 9 years ago

@isolver, how do we get fixationStart and fixationEnd events aligned? I don't see an eventID or a fixationID in the database showing which fixationStart and fixationEnd events form a pair. I can sort by eventID or by timestamp, but the risk is that things get out of sync if there are intermediate events such as blinks or missing data.
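Absent a shared fixationID, one defensive approach is to sort events by time and match each start with the next end, discarding orphaned starts (e.g. when a dropout interrupts the pair). The (time, kind) tuples below are an illustrative layout, not the actual HDF5 schema:

```python
def pair_fixations(events):
    """Return (start_time, end_time) pairs; drop orphaned events.

    `events` is a list of (time, kind) tuples where kind is
    'FixationStart' or 'FixationEnd' (hypothetical labels).
    """
    pairs, pending = [], None
    for t, kind in sorted(events):
        if kind == "FixationStart":
            pending = t          # a second start discards the first
        elif kind == "FixationEnd" and pending is not None:
            pairs.append((pending, t))
            pending = None
    return pairs

events = [(1.0, "FixationStart"), (1.3, "FixationEnd"),
          (2.0, "FixationStart"),                    # orphaned start
          (2.5, "FixationStart"), (2.8, "FixationEnd")]
print(pair_fixations(events))  # → [(1.0, 1.3), (2.5, 2.8)]
```

The trade-off is that an orphaned fixation is silently dropped rather than guessed at, which seems safer for rendering gaze overlays.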

garyfeng commented 9 years ago

Done. By default the script uses FixationStart events to drive the gaze cursor. If you want to use samples instead, set USE_FIXATIONS = false. Under the hood we read from different data tables for fixations and samples.

The script is currently a hack; for example, I created fake frame_num and frame_time values. Not best practice, but it works for now. Will rewrite in the future.
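The fake frame metadata mentioned above could be synthesized along these lines: derive frame_num and frame_time from an event timestamp at a nominal frame rate instead of decoding the video. The function name and the 30 fps default are assumptions, not the script's actual code:

```python
def fake_frame(event_time, fps=30.0):
    """Map an event time (seconds) to a synthetic (frame_num, frame_time).

    Hypothetical stand-in for real frame detection: frame_num is the
    nearest frame index at the nominal fps, frame_time its timestamp.
    """
    frame_num = int(round(event_time * fps))
    frame_time = frame_num / fps
    return frame_num, frame_time

print(fake_frame(530.16))  # frame index near 530.16 s at nominal 30 fps
```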