amazon-science / long-short-term-transformer

[NeurIPS 2021 Spotlight] Official implementation of Long Short-Term Transformer for Online Action Detection
Apache License 2.0

How to do optical flow data preprocessing before sending it to the BN-Inception net? #12

Closed: Prot-debug closed this issue 2 years ago

Prot-debug commented 2 years ago

@xumingze0308 Hi Xu,

I followed the transforms at https://github.com/yjxiong/action-detection/blob/master/transforms.py that you mentioned in other issues, but I found this data preprocessing very inefficient: it processes the .jpg files one frame at a time, and repeatedly calling PIL.Image.open() easily leads to memory leaks. I don't know where I am going wrong; my code is as follows.

import gc

import torch
import torchvision
from PIL import Image

# GroupScale, GroupRandomCrop, Stack, ToTorchFormatTensor, and GroupNormalize
# come from https://github.com/yjxiong/action-detection/blob/master/transforms.py
from transforms import (GroupScale, GroupRandomCrop, Stack,
                        ToTorchFormatTensor, GroupNormalize)

def transforms_img(img_list):

    trans = torchvision.transforms.Compose([
        GroupScale(256),
        GroupRandomCrop(224),
        Stack(),
        ToTorchFormatTensor(),
        GroupNormalize(
            mean=[128],
            std=[128]
        )]
    )

    stack_img = None
    for i, img_dir in enumerate(img_list):
        with open(img_dir, 'rb') as open_file:
            # convert() decodes the image data before the file handle is closed.
            img = Image.open(open_file).convert('L')
        rst = trans([img])
        if i == 0:
            stack_img = rst
        else:
            # Concatenating inside the loop re-copies the accumulated tensor
            # on every iteration.
            stack_img = torch.cat((stack_img, rst), dim=0)
    gc.collect()
    return stack_img
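For comparison, a minimal sketch (mine, not from the repository) that keeps the same trans pipeline but collects the per-frame tensors in a list and concatenates them once at the end, which avoids re-copying the growing tensor on every iteration; transforms_img_listed is a hypothetical name.

import torch
from PIL import Image

def transforms_img_listed(img_list, trans):
    """Apply the Compose pipeline `trans` defined above to each flow frame."""
    tensors = []
    for img_dir in img_list:
        with open(img_dir, 'rb') as open_file:
            # Decode inside the with-block so the file handle is released
            # immediately and PIL does not keep files open across frames.
            img = Image.open(open_file).convert('L')
        tensors.append(trans([img]))
    # A single concatenation copies each frame only once.
    return torch.cat(tensors, dim=0)

The Group* transforms also accept a whole list of frames, so passing all frames of a clip to trans in one call is another option if the same crop should be applied to every frame.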
xumingze0308 commented 2 years ago

You could directly use the features and targets provided here: https://github.com/zhaoyue-zephyrus/TeSTra. Please check our latest README for details.
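For anyone following this suggestion, here is a minimal sketch of loading such pre-extracted features and targets, assuming per-video .npy files in a layout like the one described in the LSTR data preparation instructions; data/THUMOS, the subfolder names, and the video id below are placeholders to adapt to the actual download.

import os.path as osp

import numpy as np

# Placeholder layout: one .npy per video for flow features and per-frame
# targets; adjust the paths to the downloaded data.
data_root = 'data/THUMOS'
video_name = 'video_validation_0000051'  # placeholder video id

flow_feat = np.load(osp.join(data_root, 'flow_kinetics_bninception',
                             video_name + '.npy'))  # (num_frames, feat_dim)
target = np.load(osp.join(data_root, 'target_perframe',
                          video_name + '.npy'))     # (num_frames, num_classes)
print(flow_feat.shape, target.shape)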