georgmartius / vid.stab

Video stabilization library
http://public.hronopik.de/vid.stab/

Question: understanding the transform vector values? #62

Open ggdupont opened 6 years ago

ggdupont commented 6 years ago

Hi, I'm trying to process head-mounted eye-tracking video, where head movement changes the video's field of view but the eye-tracking coordinates are relative to that field of view. I intend to use video stabilization to ease object detection and tracking; the eye-tracking coordinates then need to be "stabilized" accordingly. I extracted the transform vectors but can't really make sense of the values:

Frame 2 (List 98 [(LM 0 -8 787 199 112 0.665356 2.596540),(LM -2 1 493 479 112 0.687232 1.848852),(LM -2 1 640 479 112 0.715156 1.064573...

It looks like these are polygon coordinates plus transformation vectors, but the meaning of the values is not clear. Is the format documented anywhere?

georgmartius commented 6 years ago

Hi,

The information in the transform file is the detected motion at different positions in the frame. The data type is LocalMotion; see https://github.com/georgmartius/vid.stab/blob/master/src/transformtype.h

The code for the serialization is here: https://github.com/georgmartius/vid.stab/blob/master/src/serialize.c
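For anyone who wants to inspect these entries from a script, here is a minimal Python sketch that splits one line of the new format into its `(LM ...)` entries. The field names (`vx`, `vy`, `fx`, `fy`, `fsize`, `contrast`, `match`) are an assumption about how the seven values map onto LocalMotion (motion vector, measurement field position and size, then two quality values); check them against transformtype.h and serialize.c before relying on them. `parse_frame_line` is a hypothetical helper, not part of vid.stab.

```python
import re
from typing import List, NamedTuple, Tuple


class LocalMotion(NamedTuple):
    # Field names are an ASSUMPTION about the LocalMotion layout;
    # verify against transformtype.h / serialize.c.
    vx: int          # assumed: detected shift in x
    vy: int          # assumed: detected shift in y
    fx: int          # assumed: x position of the measurement field
    fy: int          # assumed: y position of the measurement field
    fsize: int       # assumed: size of the measurement field
    contrast: float  # assumed: contrast of the field
    match: float     # assumed: match quality

# One complete "(LM a b c d e f g)" entry: five integers, then two decimals.
_LM_RE = re.compile(
    r"\(LM\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)\s+(-?\d+)"
    r"\s+(-?[\d.]+)\s+(-?[\d.]+)\)"
)
_FRAME_RE = re.compile(r"Frame\s+(\d+)")


def parse_frame_line(line: str) -> Tuple[int, List[LocalMotion]]:
    """Return (frame number, complete LM entries) for one 'Frame N (List ...' line."""
    frame_match = _FRAME_RE.search(line)
    if frame_match is None:
        raise ValueError("no 'Frame N' prefix found")
    motions = [
        LocalMotion(*(int(g) for g in m.groups()[:5]),
                    *(float(g) for g in m.groups()[5:]))
        for m in _LM_RE.finditer(line)
    ]
    return int(frame_match.group(1)), motions


# The (truncated) line quoted in the question; the cut-off third entry is skipped.
sample = ("Frame 2 (List 98 [(LM 0 -8 787 199 112 0.665356 2.596540),"
          "(LM -2 1 493 479 112 0.687232 1.848852),"
          "(LM -2 1 640 479 112 0.715156 1.064573...")
frame, motions = parse_frame_line(sample)
```

Entries that are cut off (no closing parenthesis) are ignored rather than guessed at, so a truncated line yields only its complete entries.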

However, if you already know the global transformation between subsequent frames (or relative to the first frame), you can feed them in directly using the "old" format, which is one line per frame. See https://github.com/georgmartius/vid.stab/blob/master/src/serialize.c#L190. There is some documentation of the format; it is very simple.

The plugin will try to read the new format and if it fails it will use the old one.

A short note on the transformations: you can specify whether the transformations you provide are global (relative to the first frame) or local (movement between subsequent frames) using the relative option. Let me know how things go.
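To make the global/local distinction concrete, here is a small Python sketch that turns per-frame (local) shifts into offsets relative to the first frame by summing them. It assumes pure translations only; rotation and zoom would need the full transforms composed instead, and the sign convention expected by the plugin is not confirmed in this thread. `relative_to_absolute` is a hypothetical helper name.

```python
from itertools import accumulate
from typing import List, Sequence, Tuple

Shift = Tuple[float, float]


def relative_to_absolute(shifts: Sequence[Shift]) -> List[Shift]:
    """Sum frame-to-frame (dx, dy) shifts into offsets relative to the first frame.

    ASSUMPTION: translation only; rotation and zoom are not handled here.
    """
    return list(accumulate(shifts, lambda a, b: (a[0] + b[0], a[1] + b[1])))


# Three frame-to-frame shifts and their running totals.
absolute = relative_to_absolute([(1, 0), (2, -1), (-1, 1)])
```

The same running totals could be applied to eye-tracking coordinates, but only after checking which direction the plugin considers positive.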

ForSerious commented 2 years ago

I have a movie that consistently shakes side to side by a few pixels throughout. I just want to check before I write a custom transform file incorrectly. The old format would be something like:

#this comment will be ignored
1,1,0,0,0
2,1,0,0,0

where the first number is the frame number, then x (left-right motion in pixels), then y (up-down motion in pixels), then rotation, then an unused extra? For the script I want to make, I only care about the frame number and the left-right motion. I'm asking about the old format because it's simpler, but if you can give me an example of how to do it in the new format, that's fine too.

ForSerious commented 2 years ago

I figured it out. The old format looks like this:

#Ignored comment
#Frame, x, y, alpha, (optional) zoom, extra
1 0.5 -0.5 0.0 0
2 1.0 1.0 0.0 0
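For a script along those lines, here is a minimal Python sketch that produces text in the old format as described in this thread: one space-separated line per frame with frame number, x, y, alpha and extra (zoom omitted), numbered from 1. `format_old_transforms` is a hypothetical helper, and the output has not been verified against the plugin itself.

```python
from typing import Iterable, Tuple


def format_old_transforms(shifts: Iterable[Tuple[float, float]]) -> str:
    """Build old-format transform text from per-frame (x, y) shifts.

    ASSUMPTION: layout follows the example in this thread
    ("frame x y alpha extra"), with alpha fixed at 0.0 and extra at 0.
    """
    lines = ["#Frame, x, y, alpha, (optional) zoom, extra"]
    for frame, (dx, dy) in enumerate(shifts, start=1):
        lines.append(f"{frame} {float(dx)} {float(dy)} 0.0 0")
    return "\n".join(lines) + "\n"


# Reproduces the two data lines from the example in this thread.
text = format_old_transforms([(0.5, -0.5), (1.0, 1.0)])
```

Writing `text` to a file gives something that can be passed as the transform input; for a purely side-to-side shake, the y value would simply stay at 0.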