jbwang1997 / OBBDetection

OBBDetection is an oriented object detection library, which is based on MMdetection.
Apache License 2.0

convex_ext #24

Closed mmoghadam11 closed 3 years ago

mmoghadam11 commented 3 years ago

Hi, thanks for the implementation. I'm using a fork of the original ReDet and need to add Poly IoU Loss, but I don't know how to build the extension. In a Colab cell I put this:

 #!/usr/bin/env python
import os
import subprocess
import time
# from setuptools import find_packages, setup

import torch
from torch.utils.cpp_extension import (BuildExtension, CppExtension,
                                       CUDAExtension)

 def make_cuda_ext(name, module, sources, sources_cuda=[]):

    define_macros = []
    extra_compile_args = {'cxx': []}

    if torch.cuda.is_available() or os.getenv('FORCE_CUDA', '0') == '1':
        define_macros += [('WITH_CUDA', None)]
        extension = CUDAExtension
        extra_compile_args['nvcc'] = [
            '-D__CUDA_NO_HALF_OPERATORS__',
            '-D__CUDA_NO_HALF_CONVERSIONS__',
            '-D__CUDA_NO_HALF2_OPERATORS__',
        ]
        sources += sources_cuda
    else:
        print(f'Compiling {name} without CUDA')
        extension = CppExtension
        # raise EnvironmentError('CUDA is required to compile MMDetection!')

    return extension(
        name=f'{module}.{name}',
        sources=[os.path.join(*module.split('.'), p) for p in sources],
        define_macros=define_macros,
        extra_compile_args=extra_compile_args)

  setup(
      ext_modules=[
                      make_cuda_ext(
                        name='convex_ext',
                        module='mmdet.ops.convex',
                        sources=[
                            'src/convex_cpu.cpp',
                            'src/convex_ext.cpp'
                        ],
                        sources_cuda=['src/convex_cuda.cu']),
               ],
        cmdclass={'build_ext': BuildExtension},
        zip_safe=False
        )

and got this error:

File "<ipython-input-44-a399c1bfcad4>", line 38
    setup(
          ^
IndentationError: unindent does not match any outer indentation level

I want to add it separately. What should I do?
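For what it's worth, the IndentationError seems to come from the cell itself rather than from the extension: `setup(` is indented by two spaces right after a function whose body uses four, so Python cannot match it to any open block. Moving `setup(` to column 0 should clear that error. Note also that the `from setuptools import find_packages, setup` line is commented out in the cell, so `setup` would then be undefined until that import is restored. A minimal sketch reproducing the problem with a stand-in function (`make_ext` and `check` are hypothetical placeholders, not part of the repo):

```python
# Two tiny "cells": the same statement after a function, with and without
# a stray two-space indent. make_ext is a placeholder, not a real API.
BAD_CELL = (
    "def make_ext():\n"
    "    return 'ext'\n"
    "\n"
    "  result = make_ext()\n"   # 2-space indent matches no open block
)
GOOD_CELL = (
    "def make_ext():\n"
    "    return 'ext'\n"
    "\n"
    "result = make_ext()\n"     # top-level statement at column 0
)


def check(source):
    """Return None if the source compiles, else the IndentationError message."""
    try:
        compile(source, "<cell>", "exec")
        return None
    except IndentationError as err:
        return err.msg


bad_msg = check(BAD_CELL)
good_msg = check(GOOD_CELL)
print("bad cell :", bad_msg)
print("good cell:", good_msg)
```

The same check applies to the real cell: every top-level statement (`import`, `def`, `setup(`) has to start at column 0.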

mmoghadam11 commented 3 years ago

The original setup is a compile.sh:

#!/usr/bin/env bash

PYTHON=${PYTHON:-"python3"}

echo "Building roi align op..."
cd mmdet/ops/roi_align
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building roi pool op..."
cd ../roi_pool
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building roi align rotated op..."
cd ../roi_align_rotated
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building riroi align op..."
cd ../riroi_align
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building ps roi align rotated op..."
cd ../psroi_align_rotated
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building nms op..."
cd ../nms
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building dcn..."
cd ../dcn
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building sigmoid focal loss op..."
cd ../sigmoid_focal_loss
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building masked conv op..."
cd ../masked_conv
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building poly_nms op..."
cd ../poly_nms
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

echo "Building cpu_nms..."
cd ../../core/bbox
$PYTHON setup_linux.py build_ext --inplace

How can I add it to this? Please help.

jbwang1997 commented 3 years ago

If the boxes whose IoUs you want to calculate are rotated boxes, I recommend directly using the original implementation here.

The PolyIoULoss is an extension of the original RotatedIoULoss. In the case of rotated boxes, they are the same.
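To illustrate why the two coincide for rotated boxes: a rotated box is just a four-vertex convex polygon, so any polygon routine applied to its corners sees the same geometry. The sketch below is not the repo's code. It assumes a `(cx, cy, w, h, theta)` parameterization with `theta` in radians, and `rbox_to_corners` and `polygon_area` are hypothetical helper names.

```python
import math


def rbox_to_corners(cx, cy, w, h, theta):
    """Corners of a rotated box in counter-clockwise order (theta in radians)."""
    c, s = math.cos(theta), math.sin(theta)
    half = [(-w / 2, -h / 2), (w / 2, -h / 2), (w / 2, h / 2), (-w / 2, h / 2)]
    # Rotate each corner offset by theta, then shift it to the box centre.
    return [(cx + x * c - y * s, cy + x * s + y * c) for x, y in half]


def polygon_area(pts):
    """Shoelace formula for a simple polygon given as ordered vertices."""
    total = 0.0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


corners = rbox_to_corners(cx=10.0, cy=5.0, w=4.0, h=2.0, theta=math.pi / 6)
area = polygon_area(corners)
print(area, 4.0 * 2.0)  # polygon area vs. w * h of the rotated box
```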

mmoghadam11 commented 3 years ago

But I want to use it in something like your repo and the mmdetection repo. Please take a look.

How can I add it in setup.py?

jbwang1997 commented 3 years ago

Hi, I think you can imitate the setup.py in your repo's roialign. You just need to replace the .cpp and .cu files. Then, add the setup command in the compile.sh.
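As a rough illustration of "add the setup command", the per-op steps in compile.sh (enter the op directory, remove `build`, run `setup.py build_ext --inplace`) can be sketched as a Python loop with `convex` appended to the list. This is only a sketch mirroring the script quoted above, not something shipped with either repo. `build_ops`, its `dry_run` flag and the `mmdet/ops` default are assumptions, and the cpu_nms step under core/bbox is left out.

```python
import shutil
import subprocess
import sys
from pathlib import Path

# Op directories taken from the compile.sh above, plus the new convex op.
OPS = (
    "roi_align", "roi_pool", "roi_align_rotated", "riroi_align",
    "psroi_align_rotated", "nms", "dcn", "sigmoid_focal_loss",
    "masked_conv", "poly_nms", "convex",
)


def build_ops(ops_root="mmdet/ops", ops=OPS, dry_run=True):
    """Return the planned (directory, command) pairs; run them unless dry_run."""
    commands = []
    for op in ops:
        op_dir = Path(ops_root) / op
        cmd = [sys.executable, "setup.py", "build_ext", "--inplace"]
        commands.append((str(op_dir), cmd))
        if dry_run:
            continue
        # Same effect as: if [ -d "build" ]; then rm -r build; fi
        shutil.rmtree(op_dir / "build", ignore_errors=True)
        subprocess.run(cmd, cwd=op_dir, check=True)
    return commands


planned = build_ops()  # dry run: nothing is deleted or compiled
for directory, cmd in planned:
    print(directory, " ".join(cmd[1:]))
```

Calling `build_ops(dry_run=False)` from the repo root would actually run the builds; each op directory still needs its own setup.py.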

mmoghadam11 commented 3 years ago

I made a setup.py like this:

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='riroi_align_cuda',
    ext_modules=[
        CUDAExtension('convex_ext', [
            'src/convex_cpu.cpp',
            'src/convex_ext.cpp',
        ],
        sources_cuda=['src/convex_cuda.cu']
        ),
    ],
    cmdclass={'build_ext': BuildExtension})

and added it to compile.sh like this:

echo "Building convex op..."
cd ../convex
if [ -d "build" ]; then
    rm -r build
fi
$PYTHON setup.py build_ext --inplace

Is this correct?

mmoghadam11 commented 3 years ago

I got this error:

File "/content/ReDet/mmdet/ops/convex/convex_wrapper.py", line 12, in forward
    idx = convex_ext.convex_sort(pts, masks, circular)
RuntimeError: sort_vert is not compiled with GPU support (convex_sort at src/convex_ext.cpp:19)
frame #0: std::function<std::string ()>::operator()() const + 0x11 (0x7f3ea1d42441 in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #1: c10::Error::Error(c10::SourceLocation, std::string const&) + 0x2a (0x7f3ea1d41d7a in /usr/local/lib/python3.7/dist-packages/torch/lib/libc10.so)
frame #2: convex_sort(at::Tensor const&, at::Tensor const&, bool) + 0x2a4 (0x7f3e9c6311b4 in /content/ReDet/mmdet/ops/convex/convex_ext.cpython-37m-x86_64-linux-gnu.so)
frame #3: <unknown function> + 0xf5c5 (0x7f3e9c6355c5 in /content/ReDet/mmdet/ops/convex/convex_ext.cpython-37m-x86_64-linux-gnu.so)
frame #4: <unknown function> + 0x12f11 (0x7f3e9c638f11 in /content/ReDet/mmdet/ops/convex/convex_ext.cpython-37m-x86_64-linux-gnu.so)
<omitting python frames>
frame #9: THPFunction_apply(_object*, _object*) + 0x6b1 (0x7f3ee12c5301 in /usr/local/lib/python3.7/dist-packages/torch/lib/libtorch_python.so)

jbwang1997 commented 3 years ago

The name='riroi_align_cuda' should be changed to name='convex_ext'. The .cu files should also go in the CUDAExtension source list, and sources_cuda should be deleted.

mmoghadam11 commented 3 years ago

Thanks for answering. I changed it, and the build produces three files:

  • convex/build/temp.linux-x86_64-3.7/src/convex_cpu.o
  • convex/build/temp.linux-x86_64-3.7/src/convex_cuda.o
  • convex/build/temp.linux-x86_64-3.7/src/convex_ext.o

After that I started training. Training starts and downloads the pretrained resnet50, then gives this error:

File "/content/ReDet/mmdet/models/losses/obb/poly_iou_loss.py", line 20, in convex_areas
    index = convex_sort(pts, masks)
  File "/content/ReDet/mmdet/ops/convex/convex_wrapper.py", line 23, in convex_sort
    return convex_sort_func(pts, masks, circular)
  File "/content/ReDet/mmdet/ops/convex/convex_wrapper.py", line 11, in forward
    idx = convex_ext.convex_sort(pts, masks, circular)
RuntimeError: sort_vert is not compiled with GPU support

jbwang1997 commented 3 years ago

I read the code in nms and found that the CUDA and CPU code need to be set up separately. In convex_ext.cpp, I wrote the CUDA and CPU code in one function, so I think you need to separate the code.

mmoghadam11 commented 3 years ago

Thanks for your tips, dear @jbwang1997, but I don't know enough about CUDA and C++ code ): Would you do it for me?

This is convex_ext.cpp:

#include <ATen/ATen.h>
#include <torch/extension.h>

#ifdef WITH_CUDA
at::Tensor convex_sort_cuda(
    const at::Tensor& pts, const at::Tensor& masks, const bool circular);
#endif

at::Tensor convex_sort_cpu(
    const at::Tensor& pts, const at::Tensor& masks, const bool circular);

at::Tensor convex_sort(
    const at::Tensor& pts, const at::Tensor& masks, const bool circular) {
  if (pts.device().is_cuda()) {
#ifdef WITH_CUDA
    return convex_sort_cuda(pts, masks, circular);
#else
    AT_ERROR("sort_vert is not compiled with GPU support");
#endif
  }
  return convex_sort_cpu(pts, masks, circular);
}

PYBIND11_MODULE(TORCH_EXTENSION_NAME, m) {
  m.def("convex_sort", &convex_sort, "select the convex points and sort them");
}

And here is nms_setup.py:

import os.path as osp
from setuptools import setup, Extension

import numpy as np
from Cython.Build import cythonize
from Cython.Distutils import build_ext
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

ext_args = dict(
    include_dirs=[np.get_include()],
    language='c++',
    extra_compile_args={
        'cc': ['-Wno-unused-function', '-Wno-write-strings'],
        'nvcc': ['-c', '--compiler-options', '-fPIC'],
    },
)

extensions = [
    Extension('soft_nms_cpu', ['src/soft_nms_cpu.pyx'], **ext_args),
]

def customize_compiler_for_nvcc(self):
    """inject deep into distutils to customize how the dispatch
    to cc/nvcc works.
    If you subclass UnixCCompiler, it's not trivial to get your subclass
    injected in, and still have the right customizations (i.e.
    distutils.sysconfig.customize_compiler) run on it. So instead of going
    the OO route, I have this. Note, it's kind of like a weird functional
    subclassing going on."""

    # tell the compiler it can process .cu
    self.src_extensions.append('.cu')

    # save references to the default compiler_so and _compile methods
    default_compiler_so = self.compiler_so
    super = self._compile

    # now redefine the _compile method. This gets executed for each
    # object but distutils doesn't have the ability to change compilers
    # based on source extension: we add it.
    def _compile(obj, src, ext, cc_args, extra_postargs, pp_opts):
        if osp.splitext(src)[1] == '.cu':
            # use the cuda for .cu files
            self.set_executable('compiler_so', 'nvcc')
            # use only a subset of the extra_postargs, which are 1-1 translated
            # from the extra_compile_args in the Extension class
            postargs = extra_postargs['nvcc']
        else:
            postargs = extra_postargs['cc']

        super(obj, src, ext, cc_args, postargs, pp_opts)
        # reset the default compiler_so, which we might have changed for cuda
        self.compiler_so = default_compiler_so

    # inject our redefined _compile method into the class
    self._compile = _compile

class custom_build_ext(build_ext):

    def build_extensions(self):
        customize_compiler_for_nvcc(self.compiler)
        build_ext.build_extensions(self)

setup(
    name='soft_nms',
    cmdclass={'build_ext': custom_build_ext},
    ext_modules=cythonize(extensions),
)

setup(
    name='nms_cuda',
    ext_modules=[
        CUDAExtension('nms_cuda', [
            'src/nms_cuda.cpp',
            'src/nms_kernel.cu',
        ]),
        CUDAExtension('nms_cpu', [
            'src/nms_cpu.cpp',
        ]),
    ],
    cmdclass={'build_ext': BuildExtension})

Our setup.py for convex_ext is:

from setuptools import setup
from torch.utils.cpp_extension import BuildExtension, CUDAExtension

setup(
    name='convex_ext',
    ext_modules=[
        CUDAExtension('convex_ext', [
            'src/convex_cpu.cpp',
            'src/convex_ext.cpp',
            'src/convex_cuda.cu',
        ]
        # sources_cuda=['src/convex_cuda.cu']
        ),
    ],
    cmdclass={'build_ext': BuildExtension})

So you are saying we must separate the function in convex_ext.cpp, but I don't know how. Could you separate them here, please?
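One detail stands out when comparing the listings in this thread: convex_ext.cpp only calls convex_sort_cuda inside `#ifdef WITH_CUDA`, and the make_cuda_ext helper in the first comment adds `('WITH_CUDA', None)` to define_macros, while the standalone setup.py above passes no macros at all. If that missing macro is what triggers "sort_vert is not compiled with GPU support" (an assumption, not something confirmed in this thread), a setup.py that defines it could be sketched as follows. `convex_ext_config` and `build` are hypothetical helper names.

```python
def convex_ext_config(with_cuda=True):
    """Arguments for the convex_ext extension (hypothetical helper)."""
    sources = ['src/convex_cpu.cpp', 'src/convex_ext.cpp']
    define_macros = []
    if with_cuda:
        # convex_ext.cpp guards the GPU branch with #ifdef WITH_CUDA,
        # so the macro has to be defined when the .cu file is compiled in.
        sources.append('src/convex_cuda.cu')
        define_macros.append(('WITH_CUDA', None))
    return dict(name='convex_ext', sources=sources,
                define_macros=define_macros)


def build():
    """Build the extension. Imports are local so the helper above can be
    inspected on a machine without torch installed."""
    import torch
    from setuptools import setup
    from torch.utils.cpp_extension import (BuildExtension, CppExtension,
                                           CUDAExtension)

    with_cuda = torch.cuda.is_available()
    ext_cls = CUDAExtension if with_cuda else CppExtension
    setup(
        name='convex_ext',
        ext_modules=[ext_cls(**convex_ext_config(with_cuda))],
        cmdclass={'build_ext': BuildExtension})


cfg = convex_ext_config(with_cuda=True)
print(cfg['sources'])
print(cfg['define_macros'])
```

To try it as mmdet/ops/convex/setup.py, call `build()` at the bottom of the file and run `$PYTHON setup.py build_ext --inplace` as in compile.sh. With this approach the CPU and CUDA paths can stay in one function in convex_ext.cpp.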

mmoghadam11 commented 3 years ago

I commented out the CPU function in convex_ext.cpp and it works; training starts. But partway through an epoch I got this error and training stopped.

/usr/local/lib/python3.7/dist-packages/mmcv-0.2.13-py3.7-linux-x86_64.egg/mmcv/runner/hooks/optimizer.py:13: FutureWarning: Non-finite norm encountered in torch.nn.utils.clip_grad_norm_; continuing anyway. Note that the default behavior will change in a future release to error out if a non-finite total norm is encountered. At that point, setting error_if_nonfinite=false will be required to retain the old behavior.
  filter(lambda p: p.requires_grad, params), **self.grad_clip)
Traceback (most recent call last):
  File "tools/train.py", line 95, in <module>
    main()
  File "tools/train.py", line 91, in main
    logger=logger)
  File "/content/ReDet/mmdet/apis/train.py", line 61, in train_detector
    _non_dist_train(model, dataset, cfg, validate=validate)
  File "/content/ReDet/mmdet/apis/train.py", line 197, in _non_dist_train
    runner.run(data_loaders, cfg.workflow, cfg.total_epochs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv-0.2.13-py3.7-linux-x86_64.egg/mmcv/runner/runner.py", line 358, in run
    epoch_runner(data_loaders[i], **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/mmcv-0.2.13-py3.7-linux-x86_64.egg/mmcv/runner/runner.py", line 264, in train
    self.model, data_batch, train_mode=True, **kwargs)
  File "/content/ReDet/mmdet/apis/train.py", line 39, in batch_processor
    losses = model(**data)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/parallel/data_parallel.py", line 166, in forward
    return self.module(*inputs[0], **kwargs[0])
  File "/usr/local/lib/python3.7/dist-packages/torch/nn/modules/module.py", line 1051, in _call_impl
    return forward_call(*input, **kwargs)
  File "/content/ReDet/mmdet/models/detectors/base_new.py", line 95, in forward
    return self.forward_train(img, img_meta, **kwargs)
  File "/content/ReDet/mmdet/models/detectors/RoITransformer.py", line 217, in forward_train
    gt_labels[i])
  File "/content/ReDet/mmdet/core/bbox/assigners/max_iou_assigner_rbbox.py", line 73, in assign
    raise ValueError('No gt or bboxes')
ValueError: No gt or bboxes

It doesn't happen with other configs.

jbwang1997 commented 3 years ago

This error is raised in the assigner. It is not a PolyIoULoss error.

It seems you fed an image without ground truth into the model.
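One way to check for this before training is to scan the annotations for samples with no boxes. The sketch below does not use ReDet's dataset classes: the list-of-dicts layout with `filename` and `bboxes` keys and the `split_by_ground_truth` helper are assumptions made for illustration only.

```python
def split_by_ground_truth(annotations):
    """Split samples into those with and without ground-truth boxes."""
    with_gt, without_gt = [], []
    for sample in annotations:
        target = with_gt if len(sample["bboxes"]) > 0 else without_gt
        target.append(sample)
    return with_gt, without_gt


# Made-up samples: one image with a single 8-value polygon box, one empty.
samples = [
    {"filename": "P0001.png", "bboxes": [[10, 10, 50, 10, 50, 40, 10, 40]]},
    {"filename": "P0002.png", "bboxes": []},
]
kept, dropped = split_by_ground_truth(samples)
print("kept:", [s["filename"] for s in kept])
print("dropped:", [s["filename"] for s in dropped])
```

Training on `kept` only, or fixing the annotations of the images listed in `dropped`, should avoid handing the assigner an image with no ground truth.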