wjf5203 / VNext

Next-generation video instance recognition framework on top of Detectron2, supporting InstMove (CVPR 2023), SeqFormer (ECCV Oral), and IDOL (ECCV Oral)
Apache License 2.0
603 stars, 53 forks

Use so much memory during inference #27

f414158949 closed 2 years ago

f414158949 commented 2 years ago

When I try to run inference on a long-video dataset (about 200 frames per video) on a machine with 256 GB of memory, the process crashes with this error: DefaultCPUAllocator: can't allocate memory: you tried to allocate 522746265600 bytes. Error code 12 (Cannot allocate memory)

Is there a way to generate the result JSON file with less memory?

f414158949 commented 2 years ago

Traceback (most recent call last):
  File "projects/IDOL/train_net.py", line 193, in <module>
    launch(
  File "VNext-main/detectron2/engine/launch.py", line 82, in launch
    main_func(args)
  File "projects/IDOL/train_net.py", line 180, in main
    res = Trainer.test(cfg, model)
  File "VNext-main/detectron2/engine/defaults.py", line 617, in test
    results_i = inference_on_dataset(model, data_loader, evaluator)
  File "VNext-main/detectron2/evaluation/evaluator.py", line 158, in inference_on_dataset
    outputs = model(inputs)
  File "/home/fll/anaconda3/envs/vnext/lib/python3.8/site-packages/torch/nn/modules/module.py", line 1110, in _call_impl
    return forward_call(input, **kwargs)
  File "VNext-main/projects/IDOL/idol/idol.py", line 291, in forward
    video_output = self.inference(output, idol_tracker, (height, width), images.image_sizes[0]) # (height, width) is resized size, images.image_sizes[0] is original size
  File "VNext-main/projects/IDOL/idol/idol.py", line 459, in inference
    pred_masks = F.interpolate(pred_masks, size=(ori_size[0], ori_size[1]), mode='nearest')
  File "/home/fll/anaconda3/envs/vnext/lib/python3.8/site-packages/torch/nn/functional.py", line 3891, in interpolate
    return torch._C._nn.upsample_nearest2d(input, output_size, scale_factors)
RuntimeError: [enforce fail at alloc_cpu.cpp:73] . DefaultCPUAllocator: can't allocate memory: you tried to allocate 334349107200 bytes. Error code 12 (Cannot allocate memory)
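
For scale, the failed request of 334349107200 bytes is roughly 311 GiB, more than the machine's 256G. Below is a rough sketch of how a dense mask tensor reaches that order of magnitude, assuming float32 masks. The function names and the instance count, frame count, and resolution are hypothetical and are not taken from the log:

```python
def dense_mask_bytes(num_instances, num_frames, height, width, bytes_per_elem=4):
    """Bytes for one dense (instances, frames, H, W) mask tensor (float32 by default)."""
    return num_instances * num_frames * height * width * bytes_per_elem


def per_frame_mask_bytes(num_instances, height, width, bytes_per_elem=4):
    """Bytes needed if only one frame's masks are held at full resolution at a time."""
    return num_instances * height * width * bytes_per_elem


# Size of the allocation reported in the traceback, converted to GiB.
reported_bytes = 334349107200
reported_gib = reported_bytes / 1024**3
print(f"reported allocation: {reported_gib:.1f} GiB")

# Hypothetical workload: 100 tracked instances, 200 frames, 720x1280 output.
all_frames = dense_mask_bytes(100, 200, 720, 1280)
one_frame = per_frame_mask_bytes(100, 720, 1280)
print(f"all frames at once: {all_frames / 1e9:.1f} GB")
print(f"one frame at a time: {one_frame / 1e9:.2f} GB")
```

Under these assumed numbers, resizing one frame at a time would need 1/200 of the dense allocation at its peak. Whether IDOL's inference can be restructured this way is not confirmed by this thread.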

wjf5203 commented 2 years ago

Hi, thanks for your attention.

The cause is that OVIS videos are very long. For convenience, when an object disappeared we added all-zero masks for the corresponding frames to keep the mask sequence complete, which wasted a lot of memory.

We have fixed this bug, and the code can now run inference on OVIS on a machine with less than 200 GB of memory.
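
The zero-mask padding described above can be illustrated with a small sketch. It assumes a track is a mapping from frame index to a binary mask and uses nested lists as a stand-in for tensors. `pad_with_zero_masks`, `keep_sparse`, and `stored_pixels` are hypothetical names, and this is not necessarily how the fix in the repository is implemented:

```python
from typing import Dict, List, Optional

Mask = List[List[int]]  # binary H x W mask; nested lists stand in for a tensor


def pad_with_zero_masks(track: Dict[int, Mask], num_frames: int,
                        height: int, width: int) -> List[Mask]:
    """Dense layout: one mask per frame, all-zero where the object is absent."""
    dense = []
    for t in range(num_frames):
        if t in track:
            dense.append(track[t])
        else:
            dense.append([[0] * width for _ in range(height)])
    return dense


def keep_sparse(track: Dict[int, Mask], num_frames: int) -> List[Optional[Mask]]:
    """Sparse layout: None marks frames where the object is absent."""
    return [track.get(t) for t in range(num_frames)]


def stored_pixels(sequence) -> int:
    """Count the mask pixels that are actually held in memory."""
    return sum(len(row) for mask in sequence if mask is not None for row in mask)


# Hypothetical track: the object is visible on 2 of 200 frames, masks are 4 x 6.
height, width, num_frames = 4, 6, 200
track = {2: [[1] * width for _ in range(height)],
         3: [[1] * width for _ in range(height)]}

dense = pad_with_zero_masks(track, num_frames, height, width)
sparse = keep_sparse(track, num_frames)
print(stored_pixels(dense))   # 4800
print(stored_pixels(sparse))  # 48
```

Both layouts keep one entry per frame, so frame indices stay aligned; only the sparse one avoids allocating masks for frames where the object is absent.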