Open cdjkim opened 6 years ago
Very relevant question! I'm trying to fine-tune the 'flow_imagenet' pre-trained model on a dataset of video clips, but it's unclear what these two channels refer to.
What exactly do `u` and `v` stand for in this case? They are mentioned in the OpenCV class reference for the OpticalFlowTVL1 implementation and in the paper about this implementation, which I skimmed through, but what they actually stand for is quite elusive. However, an OpenCV tutorial explains optical flow as follows:
It is 2D vector field where each vector is a displacement vector showing the movement of points from first frame to second.
And consequently it is literally defined as `u=dx/dt; v=dy/dt`.
Now we only have to assume that the optical flow implementation in OpenCV expects an image to have dimensions (x,y,rgb) and not the other way round.
Since the flow input has two channels, in accordance with the above, I would expect it to be the output of DualTVL1OpticalFlow_create(), hence `u` and `v`, right?
edit: I just re-read the instructions for this repo, and they are indeed quite confusing, as the authors talk about "only taking the first two dimensions". However, they also talk about "pixel values" and truncate them to lie in [-20, 20]; in my opinion, truncating in this way would not make sense if they were talking about RGB colors. I also found other people saying they achieved comparable results by using DualTVL1OpticalFlow_create() on greyscale images and then truncating, which is in line with the instructions.
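To make that reading concrete, here is a minimal sketch of the pipeline as I understand it, not the authors' actual preprocessing code. The helper names (`compute_tvl1_flow`, `clip_flow`, `stack_flow`) are hypothetical, and the `cv2.optflow` module path assumes OpenCV 4.x with the contrib package installed. In OpenCV's flow convention, channel 0 is the horizontal displacement (`u`) and channel 1 is the vertical displacement (`v`).

```python
import numpy as np


def compute_tvl1_flow(prev_gray, next_gray):
    """Return an (H, W, 2) float32 flow field for two greyscale uint8 frames.

    flow[..., 0] is u (displacement along x), flow[..., 1] is v (along y).
    """
    # Imported lazily so the NumPy-only helpers below work without OpenCV.
    # Assumption: OpenCV 4.x with opencv-contrib-python; older releases
    # expose the same factory under a different name.
    import cv2

    tvl1 = cv2.optflow.DualTVL1OpticalFlow_create()
    return tvl1.calc(prev_gray, next_gray, None)


def clip_flow(flow, bound=20.0):
    """Truncate flow values to [-bound, bound], as the instructions describe."""
    return np.clip(flow, -bound, bound)


def stack_flow(flows):
    """Stack per-frame (H, W, 2) flows into a (1, num_frames, H, W, 2) array."""
    return np.stack(flows, axis=0)[np.newaxis]


if __name__ == "__main__":
    # Synthetic flow fields stand in for real TV-L1 output here.
    fake_flows = [
        np.full((224, 224, 2), 35.0, dtype=np.float32),
        np.full((224, 224, 2), -3.0, dtype=np.float32),
    ]
    batch = stack_flow([clip_flow(f) for f in fake_flows])
    print(batch.shape, batch.min(), batch.max())
```

The truncation only makes sense on raw displacement values, which is why I read the two channels as `u` and `v` rather than as color channels. Cropping to 224x224 is left out of the sketch.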
So to summarize: I think the optical flow is not turned into an RGB image.
Hi, I have a question regarding the explanation of the optical flow used. The git page states,
We only use the first two output dimensions, and apply the same cropping as for RGB. The provided .npy file thus has shape (1, num_frames, 224, 224, 2)
However, I was wondering what this refers to exactly. Is this the stack of `u` and `v`, the output of TVL1? (If that is the case, in what order?) Or do you convert the flow into an RGB image and use just the R and G channels? This was a little unclear to me, thanks.