facebookresearch / detectron2

Detectron2 is a platform for object detection, segmentation and other visual recognition tasks.
https://detectron2.readthedocs.io/en/latest/
Apache License 2.0

Trying to create tensor with negative dimension Error in RPN #1306

Closed sunshuofeng closed 4 years ago

sunshuofeng commented 4 years ago

In the _get_ground_truth method of my custom RRPN outputs code, this error occurs while computing the IoU:

Traceback (most recent call last):
  File "main.py", line 72, in <module>
    loss_dict = model(data)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/detectron2/detectron2/modeling/meta_arch/rcnn.py", line 124, in forward
    proposals, proposal_losses = self.proposal_generator(images, features, gt_instances)
  File "/root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/nn/modules/module.py", line 532, in __call__
    result = self.forward(*input, **kwargs)
  File "/root/yaogan/custom/kaggle_rpn.py", line 172, in forward
    losses = outputs.losses()
  File "/root/detectron2/detectron2/modeling/proposal_generator/rpn_outputs.py", line 333, in losses
    gt_objectness_logits, gt_anchor_deltas = self._get_ground_truth()
  File "/root/yaogan/custom/kaggle_rpn.py", line 109, in _get_ground_truth
    match_quality_matrix = retry_if_cuda_oom(my_pairwise_iou)(gt_boxes_i, anchors_i)
  File "/root/detectron2/detectron2/utils/memory.py", line 72, in wrapped
    return func(*args, **kwargs)
  File "/root/yaogan/custom/kaggle_rpn.py", line 30, in my_pairwise_iou
    return pairwise_iou_rotated(boxes1.tensor, boxes2.tensor)
  File "/root/detectron2/detectron2/layers/rotated_boxes.py", line 23, in pairwise_iou_rotated
    return _C.box_iou_rotated(boxes1, boxes2)
RuntimeError: Trying to create tensor with negative dimension -1678118656: [-1678118656] (check_size_nonnegative at /pytorch/aten/src/ATen/native/TensorFactories.h:64)
frame #0: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x33 (0x7f5222222193 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libc10.so)
frame #1: at::native::empty_cuda(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) + 0xb61 (0x7f5147dd5971 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #2: <unknown function> + 0x455b8d8 (0x7f51466a58d8 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #3: <unknown function> + 0x1eedc47 (0x7f5144037c47 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #4: <unknown function> + 0x3ead8a5 (0x7f5145ff78a5 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #5: <unknown function> + 0x1eedc47 (0x7f5144037c47 in /root/anaconda3/envs/python367/lib/python3.6/site-packages/torch/lib/libtorch.so)
frame #6: at::Tensor c10::KernelFunction::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const + 0xd1 (0x7f520ad0c9ed in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #7: at::Tensor c10::Dispatcher::doCallUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::DispatchTable const&, c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > > const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&)#1}::operator()(ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&) const + 0x101 (0x7f520ad0a8f1 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #8: std::result_of<at::Tensor c10::Dispatcher::doCallUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::DispatchTable const&, c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > > const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&)#1} (ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&)>::type c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > >::read<at::Tensor c10::Dispatcher::doCallUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::DispatchTable const&, c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > > const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&)#1}>(at::Tensor c10::Dispatcher::doCallUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::DispatchTable const&, 
c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > > const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > const&)#1}&&) const + 0x128 (0x7f520ad0cc4c in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #9: at::Tensor c10::Dispatcher::doCallUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::DispatchTable const&, c10::LeftRight<ska::flat_hash_map<c10::TensorTypeId, c10::KernelFunction, std::hash<c10::TensorTypeId>, std::equal_to<c10::TensorTypeId>, std::allocator<std::pair<c10::TensorTypeId, c10::KernelFunction> > > > const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const + 0x8d (0x7f520ad0a9a1 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #10: at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}::operator()(c10::DispatchTable const&) const + 0xa7 (0x7f520ad084a9 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #11: std::result_of<at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1} (c10::DispatchTable const&)>::type c10::LeftRight<c10::DispatchTable>::read<at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}>(at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}&&) const + 0x128 (0x7f520ad0cdd0 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #12: c10::guts::infer_function_traits<at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}>::type::return_type c10::impl::OperatorEntry::readDispatchTable<at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}>(at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const::{lambda(c10::DispatchTable const&)#1}&&) const + 0x49 (0x7f520ad0aa13 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #13: at::Tensor c10::Dispatcher::callUnboxedOnly<at::Tensor, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat> >(c10::OperatorHandle const&, c10::ArrayRef<long>, c10::TensorOptions const&, c10::optional<c10::MemoryFormat>) const + 0x99 (0x7f520ad08565 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #14: <unknown function> + 0xb2dd9 (0x7f520ad15dd9 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #15: detectron2::box_iou_rotated_cuda(at::Tensor const&, at::Tensor const&) + 0x370 (0x7f520ad161da in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #16: detectron2::box_iou_rotated(at::Tensor const&, at::Tensor const&) + 0x65 (0x7f520acbccf5 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #17: <unknown function> + 0x5df31 (0x7f520acc0f31 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)
frame #18: <unknown function> + 0x66530 (0x7f520acc9530 in /root/detectron2/detectron2/_C.cpython-36m-x86_64-linux-gnu.so)

Instructions To Reproduce the Issue:

I saved the two sets of boxes that triggered the error. Since the anchor file was too large, I uploaded it to Kaggle, where you can download the data: kaggle file

You can use this code to load it:

import pickle

# Load the pickled ground-truth boxes and anchors that triggered the error.
with open('gt_box.txt', 'rb') as f:
    gt = pickle.load(f)
with open('anchor.txt', 'rb') as f:
    anchor = pickle.load(f)

From these data, I found that the target boxes are relatively small and overlap one another. Could the problem be caused by the small, overlapping target boxes?

Environment:

PyTorch built with:

GCC 7.3
Intel(R) Math Kernel Library Version 2019.0.4 Product Build 20190411 for Intel(R) 64 architecture applications
Intel(R) MKL-DNN v0.21.1 (Git Hash 7d2fd500bc78936d1d648ca713b901012f470dbc)
OpenMP 201511 (a.k.a. OpenMP 4.5)
NNPACK is enabled
CUDA Runtime 10.0
NVCC architecture flags: -gencode;arch=compute_37,code=sm_37;-gencode;arch=compute_50,code=sm_50;-gencode;arch=compute_60,code=sm_60;-gencode;arch=compute_61,code=sm_61;-gencode;arch=compute_70,code=sm_70;-gencode;arch=compute_75,code=sm_75;-gencode;arch=compute_37,code=compute_37
CuDNN 7.6.3
Magma 2.5.1
Build settings: BLAS=MKL, BUILD_NAMEDTENSOR=OFF, BUILD_TYPE=Release, CXX_FLAGS= -Wno-deprecated -fvisibility-inlines-hidden -fopenmp -DUSE_FBGEMM -DUSE_QNNPACK -DUSE_PYTORCH_QNNPACK -O2 -fPIC -Wno-narrowing -Wall -Wextra -Wno-missing-field-initializers -Wno-type-limits -Wno-array-bounds -Wno-unknown-pragmas -Wno-sign-compare -Wno-unused-parameter -Wno-unused-variable -Wno-unused-function -Wno-unused-result -Wno-strict-overflow -Wno-strict-aliasing -Wno-error=deprecated-declarations -Wno-stringop-overflow -Wno-error=pedantic -Wno-error=redundant-decls -Wno-error=old-style-cast -fdiagnostics-color=always -faligned-new -Wno-unused-but-set-variable -Wno-maybe-uninitialized -fno-math-errno -fno-trapping-math -Wno-stringop-overflow, DISABLE_NUMA=1, PERF_WITH_AVX=1, PERF_WITH_AVX2=1, PERF_WITH_AVX512=1, USE_CUDA=ON, USE_EXCEPTION_PTR=1, USE_GFLAGS=OFF, USE_GLOG=OFF, USE_MKL=ON, USE_MKLDNN=ON, USE_MPI=OFF, USE_NCCL=ON, USE_NNPACK=ON, USE_OPENMP=ON, USE_STATIC_DISPATCH=OFF,

ppwwyyxx commented 4 years ago

There are so many anchors that the number of elements in the matching matrix exceeds the int32 range. We should throw a better error in this case.
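To illustrate the overflow described above, here is a minimal sketch in plain Python. It assumes the element count (number of ground-truth boxes times number of anchors) is held in a signed 32-bit integer somewhere in the IoU kernel. The helper names `wrap_int32`, `fits_int32`, and `anchor_chunks` are hypothetical and not part of detectron2, and 2616848640 is only one count consistent with the reported -1678118656, not necessarily the real one.

```python
INT32_MAX = 2**31 - 1


def wrap_int32(n: int) -> int:
    """Reinterpret an integer as a signed 32-bit value (two's complement)."""
    n &= 0xFFFFFFFF
    return n - 2**32 if n > INT32_MAX else n


def fits_int32(num_gt: int, num_anchors: int) -> bool:
    """Return True if a num_gt x num_anchors matrix has at most INT32_MAX elements."""
    return num_gt * num_anchors <= INT32_MAX


def anchor_chunks(num_gt: int, num_anchors: int, limit: int = INT32_MAX):
    """Yield (start, end) anchor ranges such that num_gt * (end - start) <= limit.

    A possible workaround (an assumption, not the library's fix): compute the
    IoU for each anchor range separately and concatenate the results.
    """
    if num_gt <= 0 or num_anchors <= 0:
        return
    step = limit // num_gt
    if step == 0:
        raise ValueError("num_gt alone exceeds the element limit")
    for start in range(0, num_anchors, step):
        yield start, min(start + step, num_anchors)


# One element count that wraps to the value in the error message.
print(wrap_int32(2616848640))

# Hypothetical sizes: 10 boxes and 25 anchors with an artificial limit of 100.
print(list(anchor_chunks(10, 25, limit=100)))
```

A check such as `fits_int32(len(gt), len(anchor))` before calling the IoU function would turn the confusing negative-dimension message into an explicit error, assuming the loaded objects support `len()`.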