jy0205 / LaVIT

LaVIT: Empower the Large Language Model to Understand and Generate Visual Content

Request for Alternative to mvextractor Incompatible with CentOS #23

Closed: patrick-tssn closed this issue 6 hours ago

patrick-tssn commented 2 months ago

I appreciate your excellent work. I've run into an issue with mvextractor; it's not compatible with CentOS (details). Could you recommend an alternative tool for flow extraction that supports CentOS? Thank you for your assistance.

jy0205 commented 2 months ago

Oops, we don't have a good alternative for extracting motion vectors on CentOS either.

xizaoqu commented 2 months ago

Hi, can you provide some scripts to visualize and verify the extracted motion vectors? In my attempts, the extracted motion vectors are quite noisy, which may affect the results.

patrick-tssn commented 2 months ago

Hi, I hope this helps: see the flow_to_image function in https://github.com/bigai-nlco/LSTP-Chat/blob/main/demo/utils/builder_utils.py
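For a quick look without extra dependencies, here is a minimal sketch of one common way to color-code motion: direction is mapped to hue and magnitude to brightness. It is not the linked flow_to_image function, just a simpler HSV mapping. The function name `motion_vectors_to_colors` is hypothetical, and the code assumes each row uses the 10-column layout (source, w, h, src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale).

```python
import colorsys
import math


def motion_vectors_to_colors(motion_vectors, max_magnitude=None):
    """Map each motion vector row to (dst_x, dst_y, w, h, (r, g, b)).

    Assumed row layout (hypothetical helper, not part of any library):
    source, w, h, src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale
    """
    displacements = []
    for row in motion_vectors:
        _source, w, h, src_x, src_y, dst_x, dst_y = row[:7]
        # Displacement is taken as destination minus source, in pixels.
        displacements.append((dst_x, dst_y, w, h, dst_x - src_x, dst_y - src_y))

    if not displacements:
        return []

    # Normalize by the largest displacement unless a fixed scale is given.
    if max_magnitude is None:
        max_magnitude = max(math.hypot(d[4], d[5]) for d in displacements)
    max_magnitude = max(max_magnitude, 1e-6)  # avoid division by zero

    blocks = []
    for dst_x, dst_y, w, h, dx, dy in displacements:
        hue = (math.atan2(dy, dx) / (2 * math.pi)) % 1.0      # direction
        value = min(math.hypot(dx, dy) / max_magnitude, 1.0)  # magnitude
        r, g, b = colorsys.hsv_to_rgb(hue, 1.0, value)
        color = (round(r * 255), round(g * 255), round(b * 255))
        blocks.append((dst_x, dst_y, w, h, color))
    return blocks
```

Each returned tuple can then be painted onto a blank image as a w x h block at the stored position. Static blocks come out black, so noisy vectors show up as isolated colored blocks in otherwise dark regions.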

xizaoqu commented 1 month ago

Thanks~

liyz15 commented 9 hours ago

Here is an alternative to VideoCap, implemented with PyAV:

import av
import numpy as np


class VideoCap:
    def __init__(self):
        self.container = None
        self._frames = None  # persistent frame generator, created in open()

    def open(self, video_path):
        try:
            self.container = av.open(video_path)
            stream = self.container.streams.video[0]
            codec_context = stream.codec_context
            # Ask the decoder to export motion vectors as frame side data.
            codec_context.export_mvs = True
            self._frames = self._iter_frames()
            return True
        except Exception:
            return False

    def _process_motion_vectors(self, mvs):
        # One row per motion vector, 10 columns. Frames without motion
        # vectors (e.g. I-frames) give an empty (0, 10) array.
        if mvs is None:
            return np.empty((0, 10), dtype=int)
        return np.array([[
            mv.source, mv.w, mv.h, mv.src_x, mv.src_y,
            mv.dst_x, mv.dst_y, mv.motion_x, mv.motion_y, mv.motion_scale
        ] for mv in mvs], dtype=int).reshape(-1, 10)

    def _iter_frames(self):
        for packet in self.container.demux(video=0):
            for video_frame in packet.decode():
                frame_rgb = video_frame.to_rgb().to_ndarray()
                # Reverse the channels to BGR to match the original cv2 impl.
                frame_bgr = frame_rgb[:, :, ::-1]

                motion_vectors_raw = video_frame.side_data.get('MOTION_VECTORS')
                motion_vectors = self._process_motion_vectors(motion_vectors_raw)

                frame_type = video_frame.pict_type
                # pts can be missing on some frames; report None in that case.
                if video_frame.pts is None:
                    timestamp = None
                else:
                    timestamp = float(video_frame.pts * video_frame.time_base)

                yield frame_bgr, motion_vectors, frame_type, timestamp

    def read(self):
        # Reuse one generator across calls so no decoded frame is skipped.
        if self._frames is None:
            return False, None, None, None, None
        try:
            frame_bgr, motion_vectors, frame_type, timestamp = next(self._frames)
            return True, frame_bgr, motion_vectors, frame_type, timestamp
        except StopIteration:
            return False, None, None, None, None
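To verify the rows this produces, a minimal sanity check could look like the sketch below. It relies on the relation documented for FFmpeg's AVMotionVector, `src = dst + motion / motion_scale`, and assumes the 10-column row layout used above. The function name `check_motion_vectors` and the one-pixel tolerance (to absorb integer rounding of sub-pixel motion) are my own choices, not part of PyAV or mvextractor.

```python
def check_motion_vectors(motion_vectors, tolerance=1.0):
    """Return the indices of rows that look inconsistent.

    Assumed row layout:
    source, w, h, src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale
    A row is flagged when motion_scale is zero or when src differs from
    dst + motion / motion_scale by more than `tolerance` pixels.
    """
    bad_rows = []
    for index, row in enumerate(motion_vectors):
        src_x, src_y, dst_x, dst_y, motion_x, motion_y, motion_scale = row[3:10]
        if motion_scale == 0:
            bad_rows.append(index)
            continue
        expected_src_x = dst_x + motion_x / motion_scale
        expected_src_y = dst_y + motion_y / motion_scale
        if (abs(expected_src_x - src_x) > tolerance
                or abs(expected_src_y - src_y) > tolerance):
            bad_rows.append(index)
    return bad_rows
```

Calling it on the `motion_vectors` array returned by `read()` should give an empty list for well-formed rows. A non-empty result suggests the column order or the extraction itself is off, which is worth ruling out before attributing noise to the video content.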

patrick-tssn commented 6 hours ago

Thank you, @liyz15, for your solution!