pytorch / vision

Datasets, Transforms and Models specific to Computer Vision
https://pytorch.org/vision
BSD 3-Clause "New" or "Revised" License

Video reader segfaults on certain videos. Here's a partial-reproduction script #6802

Open jdsgomes opened 1 year ago

jdsgomes commented 1 year ago

Follow-up on the following observation by @vedantroy:

This segfaults on certain videos. Here's a partial-reproduction script (the videos are stored in a pandas dataframe):

import pandas as pd

# df2 = pd.read_pickle("df2.pkl")
df = pd.read_pickle("df.pkl")
# print the # of rows in the df
print(len(df))
# print the keys in the df
print(df.keys())
# print the first row in the df
print(df.iloc[0])

# print the type of the 1st value in the 1st row
first_vid = df.iloc[0, 0]  # purely positional lookup: row 0, column 0
print(f"Length: {len(first_vid)}")
print(f"Type: {type(first_vid)}")

# write first_vid to a file
with open("test.mp4", "wb") as f:
    f.write(first_vid)

import itertools
import copy

import torch
from torchvision.io import VideoReader
import torchvision

def clip_from_start(buf: bytes, expected_frames: int):
    # import av
    # import io
    # buffer = io.BytesIO(buf)
    # container = av.open(buffer)
    # i = 0 
    # for frame in container.decode(video=0):
    #     print(type(frame))
    #     i += 1
    #     print(i)
    #     pass

    tensor = torch.frombuffer(buf, dtype=torch.uint8)
    tensor = copy.deepcopy(tensor)
    # torchvision.io.read_video()
    rdr = VideoReader(tensor)
    sampled_frames = list(itertools.islice(iter(rdr), expected_frames))
    if len(sampled_frames) != expected_frames:
        return None
    data = []
    for frame in sampled_frames:
        data.append(frame["data"])
    return torch.stack(data, dim=0)

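clip = clip_from_start(first_vid, 2)
# clip_from_start returns None when fewer frames than requested were decoded
print(clip.shape if clip is not None else "fewer than 2 frames decoded")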

I'm working on getting approval to share a public copy of the data.

In the meantime, the error is:

test.py:41: UserWarning: The given buffer is not writable, and PyTorch does not support non-writable tensors. This means you can write to the underlying (supposedly non-writable) buffer using the tensor. You may want to copy the buffer to protect its data or make it writable before converting it to a tensor. This type of warning will be suppressed for the rest of this program. (Triggered internally at ../torch/csrc/utils/tensor_new.cpp:1563.)
  tensor = torch.frombuffer(buf, dtype=torch.uint8)
malloc(): corrupted top size
Aborted (core dumped)
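
The UserWarning above is separate from the abort: it appears because `bytes` objects are read-only, and the `copy.deepcopy` call in the script only runs after `torch.frombuffer` has already wrapped the read-only buffer. Below is a minimal sketch, assuming the goal is simply to hand PyTorch a writable buffer. `writable_copy` is a hypothetical helper name, not something from the original script, and whether this changes anything about the crash is unknown.

```python
def writable_copy(buf: bytes) -> bytearray:
    """Return a writable copy of a read-only bytes object."""
    return bytearray(buf)


sample = b"\x00\x01\x02\x03"  # stand-in for the MP4 bytes from the dataframe
copied = writable_copy(sample)

# bytes exposes a read-only buffer; bytearray exposes a writable one.
print(memoryview(sample).readonly)  # True
print(memoryview(copied).readonly)  # False

# Assumed usage in the script (requires torch, not executed here):
# tensor = torch.frombuffer(writable_copy(buf), dtype=torch.uint8)
```

Copying before the `torch.frombuffer` call, rather than deep-copying the tensor afterwards, is what addresses the condition the warning describes.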

Originally posted by @vedantroy in https://github.com/pytorch/vision/issues/6771#issuecomment-1283644752

vedantroy commented 1 year ago

I'll try to get a public copy of the data soon, since I know this script is pretty useless without it. I think the video file is valid: the VS Code MP4 previewer can play it, and ffprobe works on it.
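
For anyone who wants to repeat the ffprobe check, here is a rough sketch of how it could be scripted. It assumes `ffprobe` is on the PATH and that `test.mp4` is the file written by the script above; the helper names `build_ffprobe_cmd` and `probe` are made up for illustration.

```python
import json
import os
import shutil
import subprocess


def build_ffprobe_cmd(path: str) -> list[str]:
    """Build an ffprobe command that prints container and stream info as JSON."""
    return [
        "ffprobe",
        "-v", "error",      # only report errors on stderr
        "-show_format",     # container-level metadata
        "-show_streams",    # per-stream metadata (codec, frame count, ...)
        "-of", "json",      # machine-readable output
        path,
    ]


def probe(path: str) -> dict:
    """Run ffprobe on `path` and return the parsed JSON output."""
    result = subprocess.run(
        build_ffprobe_cmd(path), capture_output=True, text=True, check=True
    )
    return json.loads(result.stdout)


# Only run the probe when both ffprobe and the test file are available.
if shutil.which("ffprobe") and os.path.exists("test.mp4"):
    info = probe("test.mp4")
    for stream in info.get("streams", []):
        print(stream.get("codec_type"), stream.get("codec_name"), stream.get("nb_frames"))
```

A clean ffprobe result only shows that FFmpeg's own tools can parse the file; it does not by itself rule out a problem in how torchvision reads it from memory.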

I suspect it's some FFmpeg/torchvision edge case. Specifically, the video file is a short clip from a longer video, and I did the clipping with FFmpeg.
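
One way to look for differences between a clipped file and one that decodes fine is to compare their top-level MP4 box layout. The sketch below is a stdlib-only illustration that walks the ISO base media box headers in a byte buffer. Nothing in this thread confirms that the container layout is involved, so treat it as a diagnostic aid under that assumption; `list_top_level_boxes` and `make_box` are hypothetical helper names.

```python
import struct


def list_top_level_boxes(data: bytes) -> list[tuple[str, int]]:
    """Return (type, size) for each top-level ISO BMFF box in `data`."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A 64-bit "largesize" follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # The box extends to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed header; stop rather than loop forever
        boxes.append((box_type.decode("latin-1"), size))
        offset += size
    return boxes


def make_box(box_type: bytes, payload: bytes) -> bytes:
    """Build a synthetic box (8-byte header + payload) for demonstration."""
    return struct.pack(">I4s", 8 + len(payload), box_type) + payload


sample = make_box(b"ftyp", b"isom") + make_box(b"mdat", b"\x00" * 4) + make_box(b"moov", b"")
print(list_top_level_boxes(sample))
# [('ftyp', 12), ('mdat', 12), ('moov', 8)]
```

Running `list_top_level_boxes(first_vid)` on the bytes from the dataframe, and on a video that does not crash, would show whether the two files differ in box order or sizes.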