analyze_visual problem with empty tensors

GeorgeTouros commented 3 years ago

Hello! I am trying to extract features from this video

I am using the following command: python analyze_visual.py -f ../data/V236_915000__0.mp4 and I am getting the following result:

Using: cpu
Using cache found in /home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Using cache found in /home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Began processing video : ../data/V236_915000__0.mp4
FPS      = 23.976023976023978
Duration = 40.04 - 00:00:40.03
[W NNPACK.cpp:80] Could not initialize NNPACK! Reason: Unsupported hardware.
Traceback (most recent call last):
  File "analyze_visual.py", line 476, in <module>
    main(sys.argv)
  File "analyze_visual.py", line 454, in main
    save_results)
  File "analyze_visual.py", line 303, in process_video
    objects = generic_model.detect(frame, 0.1)
  File "/home/zappatistas20/PycharmProjects/multimodal_movie_analysis/analyze_visual/object_detection/generic_model.py", line 132, in detect
    results = self.utils.decode_results(detections_batch)
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/hubconf.py", line 298, in decode_results
    results = encoder.decode_batch(ploc, plabel, criteria=0.5, max_output=20)
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/PyTorch/Detection/SSD/src/utils.py", line 154, in decode_batch
    output.append(self.decode_single(bbox, prob, criteria, max_output))
  File "/home/zappatistas20/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub/PyTorch/Detection/SSD/src/utils.py", line 197, in decode_single
    bboxes_out, labels_out, scores_out = torch.cat(bboxes_out, dim=0), \
RuntimeError: There were no tensor arguments to this function (e.g., you passed an empty list of Tensors), but no fallback function is registered for schema aten::_cat.  This usually means that this function requires a non-empty list of Tensors.  Available functions are [CPU, QuantizedCPU, BackendSelect, Named, AutogradOther, AutogradCPU, AutogradCUDA, AutogradXLA, AutogradPrivateUse1, AutogradPrivateUse2, AutogradPrivateUse3, Tracer, Autocast, Batched, VmapMode].

CPU: registered at /pytorch/build/aten/src/ATen/CPUType.cpp:2127 [kernel]
QuantizedCPU: registered at /pytorch/build/aten/src/ATen/QuantizedCPUType.cpp:297 [kernel]
BackendSelect: fallthrough registered at /pytorch/aten/src/ATen/core/BackendSelectFallbackKernel.cpp:3 [backend fallback]
Named: registered at /pytorch/aten/src/ATen/core/NamedRegistrations.cpp:7 [backend fallback]
AutogradOther: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradCPU: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradCUDA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradXLA: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse1: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse2: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
AutogradPrivateUse3: registered at /pytorch/torch/csrc/autograd/generated/VariableType_2.cpp:8078 [autograd kernel]
Tracer: registered at /pytorch/torch/csrc/autograd/generated/TraceType_2.cpp:9654 [kernel]
Autocast: registered at /pytorch/aten/src/ATen/autocast_mode.cpp:258 [kernel]
Batched: registered at /pytorch/aten/src/ATen/BatchingRegistrations.cpp:511 [backend fallback]
VmapMode: fallthrough registered at /pytorch/aten/src/ATen/VmapModeRegistrations.cpp:33 [backend fallback]

Up to a point in the video, the process runs very smoothly. The same happens with a few other videos in my collection as well.

My guess is that there's an empty frame hidden somewhere in the file, and as a result an empty list of tensors reaches the torch.cat(bboxes_out, dim=0) call in decode_single. What would be the best way to try/catch this and return 0 or NaN, or just skip the frame completely, so that the process completes?
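One possible shape for the try/catch asked about here, as a minimal sketch only: it assumes the failure always surfaces as the RuntimeError shown in the traceback, and that an empty detection list is an acceptable result for such a frame. The names `safe_detect` and `fake_detect` are hypothetical and not part of the repository; `fake_detect` stands in for `generic_model.detect` so the sketch runs without the model.

```python
from typing import Any, Callable, List


def safe_detect(detect_fn: Callable[[Any, float], List[Any]],
                frame: Any,
                threshold: float = 0.1) -> List[Any]:
    """Run a detector on one frame; return [] if it raises RuntimeError."""
    try:
        return detect_fn(frame, threshold)
    except RuntimeError as err:
        # The traceback above ends in a RuntimeError raised by torch.cat
        # when it receives an empty list of tensors. Treat that frame as
        # having no detections instead of aborting the whole video.
        print(f"Skipping frame, detector raised: {err}")
        return []


def fake_detect(frame: Any, threshold: float) -> List[Any]:
    """Hypothetical stand-in for generic_model.detect, used for illustration."""
    if not frame:
        raise RuntimeError("There were no tensor arguments to this function")
    return [("object", threshold)]
```

In process_video, the call `generic_model.detect(frame, 0.1)` could then become `safe_detect(generic_model.detect, frame, 0.1)`. Note that catching RuntimeError this broadly would also hide unrelated failures, so checking the error message may be worth considering.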

Thanks!

tyiannak commented 3 years ago

Never seen that. @lobracost, can you check as well?

tyiannak commented 3 years ago

I've checked and it works:

analyze_visual|master⚡ ⇒ python3 analyze_visual.py -f ../V236_9150000.mp4
Using: cpu
Using cache found in /Users/tyiannak/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Using cache found in /Users/tyiannak/.cache/torch/hub/NVIDIA_DeepLearningExamples_torchhub
Began processing video : ../V236_915000__0.mp4
FPS = 23.976023976023978
Duration = 40.04 - 00:00:40.03
Finished processing on video :../V236_9150000.mp4
processing time: 00:03:09.44
processing ratio 1.1 fps
processing ratio time 473 %
Actual shape of feature matrix without object features: (200, 52)
Shape of features' stats found: (208,)
Number of shot changes: 19
Shape of feature matrix including object features (after smoothing object confidences): (198, 88)
Shape of feature stats vector including object features (after smoothing object confidences): (244,)

pakoromilas commented 3 years ago

@GeorgeTouros This is a known problem with NVIDIA's hub repo. I managed to fix it by editing a file from the repo. The patch is applied only once, the first time the code is downloaded. Specifically, these are the lines I wrote: https://github.com/tyiannak/multimodal_movie_analysis/blob/cd91680daf675fa80f767885231951f9ec0943f7/analyze_visual/object_detection/generic_model.py#L52-L105

I have tried it many times in the past and concluded that it works well for empty frames (I tried empty videos too). So I suggest that you remove the NVIDIA code (i.e. erase the contents of the directory /home/zappatistas20/.cache/torch/) and run the script again. That way, NVIDIA's code will be downloaded again and fixed by our script.
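The cache removal suggested here can also be done from Python. This is a minimal sketch using only the standard library; `clear_hub_cache` and `DEFAULT_CACHE` are hypothetical names, and the default path assumes the cache lives under the home directory as in the logs above (it may differ if TORCH_HOME or XDG_CACHE_HOME is set).

```python
import shutil
from pathlib import Path


def clear_hub_cache(cache_dir: Path) -> bool:
    """Delete cache_dir and everything below it; return True if it existed."""
    if not cache_dir.is_dir():
        return False
    shutil.rmtree(cache_dir)
    return True


# Assumed location, taken from the "Using cache found in ..." log lines.
DEFAULT_CACHE = (
    Path.home() / ".cache" / "torch" / "hub"
    / "NVIDIA_DeepLearningExamples_torchhub"
)
```

Calling `clear_hub_cache(DEFAULT_CACHE)` removes only the cached NVIDIA repo rather than the whole torch cache directory; the next run of analyze_visual.py should then download it again.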

Tell me if this works.

GeorgeTouros commented 3 years ago

@lobracost do I have to pull again from master before I do this check?

pakoromilas commented 3 years ago

> @lobracost do I have to pull again from master before I do this check?

No, I think it's fine if you don't pull.

GeorgeTouros commented 3 years ago

Thanks @lobracost, it works fine after following your instructions. This NVIDIA model has been quite a pleasant experience to work with :stuck_out_tongue_closed_eyes:

tyiannak / multimodal_movie_analysis

analyze_visual problem with empty tensors #30