chaoyuaw / pytorch-coviar

Compressed Video Action Recognition
https://www.cs.utexas.edu/~cywu/projects/coviar/
GNU Lesser General Public License v2.1

Decoding a video frame, given the previous frame, current motion vectors, and current residual image. #37

Closed itoen220 closed 5 years ago

itoen220 commented 5 years ago

Assume we have a target position pos_target=t, a reference frame at pos_target=t-1, and the motion vectors and residual image for pos_target=t. However, assume we don't have the original video file.

Given these constraints, I would like to reconstruct the frame at pos_target=t, as described in Equation 1 of your paper.

So far, I've tried decoding the frame at pos_target=t by: (1) creating a reference frame, which is just a copy of the t-1 frame; (2) performing motion compensation by copying 16x16 pixel blocks from the t-1 frame into the reference frame, according to the motion vectors; (3) adding the residual image to the motion-compensated reference frame.
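The three steps above can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the repository's decoder: the function names `expand_block_mvs` and `reconstruct_frame`, the array layouts, and the sign convention (pixel (y, x) of frame t is predicted from (y - dy, x - dx) of frame t-1) are all hypothetical and would need to be checked against how the motion vectors were actually exported. It also assumes integer-pixel motion and a fixed 16x16 block grid, whereas real codecs may use sub-pixel vectors and variable block sizes, which is one possible source of visible artifacts.

```python
import numpy as np


def expand_block_mvs(block_mvs, height, width, block=16):
    """Expand one (dx, dy) vector per block into a per-pixel field.

    block_mvs: (ceil(H/block), ceil(W/block), 2) integer array (assumed layout).
    Returns an (height, width, 2) integer array.
    """
    per_pixel = np.repeat(np.repeat(block_mvs, block, axis=0), block, axis=1)
    return per_pixel[:height, :width]


def reconstruct_frame(prev_frame, motion_vectors, residual):
    """Rebuild frame t from frame t-1, per-pixel motion vectors, and a residual.

    prev_frame:     (H, W, 3) uint8, decoded frame at t-1
    motion_vectors: (H, W, 2) integers, [..., 0] = dx and [..., 1] = dy
                    (assumed convention: source pixel is (y - dy, x - dx))
    residual:       (H, W, 3) signed integers, frame t minus its prediction
    """
    h, w = prev_frame.shape[:2]
    ys, xs = np.mgrid[0:h, 0:w]
    mv = motion_vectors.astype(np.int64)

    # Source coordinates in frame t-1, clamped to the image borders.
    src_x = np.clip(xs - mv[..., 0], 0, w - 1)
    src_y = np.clip(ys - mv[..., 1], 0, h - 1)

    # Always read from the untouched t-1 frame, never from the buffer
    # that is being filled, so earlier copies cannot corrupt later ones.
    predicted = prev_frame[src_y, src_x].astype(np.int32)

    # Add the residual in a wide signed type, then clamp to [0, 255].
    # Adding directly in uint8 would wrap around instead of saturating.
    recon = predicted + residual.astype(np.int32)
    return np.clip(recon, 0, 255).astype(np.uint8)
```

Two details in this sketch are worth comparing against a hand-written version: the residual has to be kept signed (storing it as uint8, or adding it to a uint8 frame, wraps around silently), and the sign of the motion vectors depends on the tool that exported them.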

This is the reference frame at pos_target=2: image

This is the result after step (1), for pos_target=3: image

This is the result after step (2), for pos_target=3: image

The final result seems to have some compression artifacts, so I guess I'm not reconstructing the frame correctly. Is there a better way to do this (particularly, using ffmpeg)? Thanks!

ffmpbgrnn commented 5 years ago

Hi @itoen220, do you have any updates on better reconstruction?

AlexSte803 commented 4 years ago

@itoen220 Did you solve it? I have the same problem.

shencuifeng commented 2 years ago

@itoen220 Did you solve it? I have the same problem.

Did you reconstruct it?