grimoire / amirstan_plugin

Useful TensorRT plugins for PyTorch and MMDetection model conversion.
MIT License

compatibility with mmdetection-to-tensorrt and deepstream #37

Open xarauzo opened 2 years ago

xarauzo commented 2 years ago

Hi, I am using MMDetection to create a detection model (mmdet 2.25.1, mmcv-full 1.6.1). I now want to convert it to TensorRT to use it on DeepStream.

I have been trying with version 0.5.0 of amirstan_plugin, mmdetection-to-tensorrt and torch2trt_dynamic, and also with the latest versions of each, but I have not managed to make it work. Am I missing something regarding compatibility?

I am running it in a Docker container on a Jetson Xavier. When running the conversion, I get a MemoryError (similar to mmdetection-to-tensorrt issue 38). I tried the suggested solution (commenting out the torch.save line and using --save-engine=true). If I do that, I get the .engine file, but when running the DeepStream pipeline I get this error:

python3: /amirstan_plugin/src/plugin/common/serialize.hpp:49: static void {anonymous}::Serializer<T, typename std::enable_if<((std::is_arithmetic<_Tp>::value || std::is_enum<_Tp>::value) || std::is_pod<_Tp>::value)>::type>::deserialize(const void**, size_t*, T*) [with T = int; size_t = long unsigned int]: Assertion `*buffer_size >= sizeof(T)' failed.

Can I get some help with this, please?

grimoire commented 2 years ago

The log indicates that the buffer size is smaller than a scalar. There must be something wrong when saving the engine. Could you provide more detail?
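
In the meantime, one way to narrow it down: if the save step corrupted the file, deserialization should also fail outside DeepStream. Here is a minimal standalone check (a sketch; the plugin library path and the "output.engine" filename are assumptions, adjust them to your setup):

  import ctypes

  import tensorrt as trt

  # The custom plugins must be loaded before deserialization, otherwise
  # TensorRT cannot find the creators for the amirstan layers in the engine.
  ctypes.CDLL('/amirstan_plugin/build/lib/libamirstan_plugin.so')  # assumed path

  logger = trt.Logger(trt.Logger.INFO)
  trt.init_libnvinfer_plugins(logger, '')

  with open('output.engine', 'rb') as f:  # assumed filename
      engine_bytes = f.read()

  engine = trt.Runtime(logger).deserialize_cuda_engine(engine_bytes)
  print('deserialized OK' if engine is not None else 'deserialization failed')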

xarauzo commented 2 years ago

I am using mmdet2trt to convert the model to a ".engine" file. To save the model, I first tried running the mmdet2trt app as-is, where "torch.save(trt_model.state_dict(), args.output)" saves the model. However, when doing so, I get the following:

/root/space/mmdetection/mmdet/models/dense_heads/anchor_head.py:123: UserWarning: DeprecationWarning: anchor_generator is deprecated, please use "prior_generator" instead
  warnings.warn('DeprecationWarning: anchor_generator is deprecated, '
/usr/local/lib/python3.6/dist-packages/torch/nn/functional.py:718: UserWarning: Named tensors and all their associated APIs are an experimental feature and subject to change. Please do not use them for anything important until they are released as stable. (Triggered internally at  /media/nvidia/NVME/pytorch/pytorch-v1.9.0/c10/core/TensorImpl.h:1156.)
  return torch.max_pool2d(input, kernel_size, stride, padding, dilation, ceil_mode)
/root/space/mmdetection/mmdet/core/anchor/anchor_generator.py:370: UserWarning: ``single_level_grid_anchors`` would be deprecated soon. Please use ``single_level_grid_priors`` 
  '``single_level_grid_anchors`` would be deprecated soon. '
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor
Warning: Encountered known unsupported method torch.Tensor.new_tensor

And when the conversion finishes, it fails with the following error:

Traceback (most recent call last):
  File "/usr/local/bin/mmdet2trt", line 33, in <module>
    sys.exit(load_entry_point('mmdet2trt', 'console_scripts', 'mmdet2trt')())
  File "/root/space/mmdetection-to-tensorrt/mmdet2trt/mmdet2trt.py", line 339, in main
    torch.save(trt_model.state_dict(), args.output)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 379, in save
    _save(obj, opened_zipfile, pickle_module, pickle_protocol)
  File "/usr/local/lib/python3.6/dist-packages/torch/serialization.py", line 484, in _save
    pickler.dump(obj)
MemoryError

I then tried commenting out that line and using "--save-engine=true" instead (the MemoryError presumably comes from pickling the whole state_dict, serialized engine included, on the memory-constrained Jetson). The save-engine path writes the raw engine bytes with the following code:

  if args.save_engine:
      logger.info('Saving TRT model engine to: {}'.format(
          Path(args.output).with_suffix('.engine')))
      with open(Path(args.output).with_suffix('.engine'), 'wb') as f:
          f.write(trt_model.state_dict()['engine'])

This results in no error when the conversion finishes, and the "output.engine" file is, at first sight, saved correctly.

Then I run my DeepStream pipeline (which I have verified to work correctly with other TRT models) using the "output.engine" file. When doing so, I get the error I mentioned in the issue.

If there is anything else I can check to help you debug this, just let me know. Thanks for the help.

Version: amirstan_plugin -> main: git clone --depth=1 https://github.com/grimoire/amirstan_plugin.git

grimoire commented 2 years ago

It seems that this error is caused by ops serialization. What model are you using?
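
If it is the ops serialization, one plausible cause is that the plugin library the DeepStream process loads is a different build from the one used during conversion, so the serialized plugin fields no longer line up. A sketch to compare the registered plugin creators in both containers (the library path is an assumption):

  import ctypes

  import tensorrt as trt

  ctypes.CDLL('/amirstan_plugin/build/lib/libamirstan_plugin.so')  # assumed path
  logger = trt.Logger(trt.Logger.INFO)
  trt.init_libnvinfer_plugins(logger, '')

  # Diff this output between the conversion container and the DeepStream
  # container; any name or version that differs points to a mismatched build.
  for creator in trt.get_plugin_registry().plugin_creator_list:
      print(f'{creator.name} (version {creator.plugin_version})')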

xarauzo commented 2 years ago

I am using MMDetection with the CascadeRCNN model and a ResNeXt backbone.

The training is done on an NVIDIA GeForce RTX 3090, with the following versions: mmdet 2.25.1, mmcv-full 1.6.1.

The conversion is then done (or attempted) on the Jetson Xavier NX, with the same versions. For this I tried both amirstan_plugin v0.5.0 and the latest version.
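
For what it's worth, here is a small sketch to dump the relevant versions on both machines so they can be compared side by side (standard module names, nothing project-specific assumed):

  import mmcv
  import mmdet
  import tensorrt
  import torch

  print('torch    :', torch.__version__)
  print('mmcv     :', mmcv.__version__)
  print('mmdet    :', mmdet.__version__)
  print('tensorrt :', tensorrt.__version__)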

Thanks for your help. If you need any further information, let me know.