NVIDIA-AI-IOT / tensorrt_plugin_generator

A simple tool that can generate TensorRT plugin code quickly.
MIT License
221 stars · 35 forks

Debugging the Custom Plugins #2

Open jeethesh-pai opened 1 year ago

jeethesh-pai commented 1 year ago

Thank you so much for this wonderful repository. It helped me create a custom plugin library in the most efficient way. Is there a way to test the enqueue function? There seem to be some bugs in my kernel calculations.

Thanks in advance, Jeethesh

zerollzeng commented 1 year ago

Do you mean you hit an error during enqueue? If so, it is probably caused by the kernel implementation.

zerollzeng commented 1 year ago

I would suggest debugging it with cuda-gdb.

jeethesh-pai commented 1 year ago

Thank you @zerollzeng, it is due to the kernel implementation. I was able to debug with cuda-gdb, but the enqueue function's inputs are the pointers void const *const *inputs and void *const *outputs, and I am not able to figure out how the arrays are passed to the enqueue function. What is the size of each array?

zerollzeng commented 1 year ago

I am not able to figure out how the arrays are passed to the enqueue function. What is the size of each array?

They are the inputs of your plugin, so it depends on how you configure it.

jeethesh-pai commented 1 year ago

Hi @zerollzeng

If the inputs of my plugin are configured like this:

MSDeformAttentionPlugin:
  attributes:
    im2col_step:
      datatype: int32
  inputs:
    tpg_input_0: # value
      shape: -1x18259x8x32
    tpg_input_1: # value_spatial_shapes
      shape: 4x2
    tpg_input_2: # value_level_start_index
      shape: 4
    tpg_input_3: # sampling_locations
      shape: -1x-1x8x4x4x2
    tpg_input_4: # attention_weights
      shape: -1x-1x8x4x4
  outputs:
    tpg_output_0:
      shape: -1x-1x256
  plugin_type: IPluginV2DynamicExt
  support_format_combination:
    - "float32+int32+int32+float32+float32+float32"

will the inputs be passed in the same layout as declared here, or will they be flattened?

e.g. should I access the tpg_input_1 like

const float* var1 = reinterpret_cast<const float *>(inputs[0]);
and then access individual elements like var1[0][18256][31] ?

Thanks

zerollzeng commented 1 year ago

e.g. should I access the tpg_input_1 like

const float* var1 = reinterpret_cast<const float *>(inputs[0]);
and then access individual elements like var1[0][18256][31] ?

Yes.

zerollzeng commented 1 year ago

@jeethesh-pai You can pull the latest change; I made the default input/output data format linear.

jeethesh-pai commented 1 year ago

I just cloned the new version. Can you tell me how you debug this library file? Right now I am using

cuda-gdb --args trtexec_debug --loadEngine "someEnginefile.engine" --workspace 16000 --plugins <path to libPlugin.so file>

But I get the value of var1[0][18256][31], var1[2], or anything else as 0. Is this because the trtexec application feeds random or zero values during inference, or am I accessing the plugin variable incorrectly? Is there a way to debug this shared library without attaching to the trtexec application? Can I supply a known input, which would make debugging more deterministic for me?

Thanks a lot for your help, Jeethesh

zerollzeng commented 1 year ago

trtexec uses random inputs. There is also a --loadInputs option that lets you load a real input from a binary file.

jeethesh-pai commented 1 year ago

Thanks. I will try this and reach out

jeethesh-pai commented 1 year ago

I tried the method mentioned here, but the inputs variable still reads 0 everywhere.

qixuxiang commented 1 year ago

I tried the MultiScaleDeformableAttn plugin from TensorRT 8 on TensorRT 7. I can convert the ONNX model to a TRT engine via trtexec, but the output of the MultiScaleDeformableAttn plugin is all zeros, just like here.

18290888765 commented 8 months ago

I tried the MultiScaleDeformableAttn plugin from TensorRT 8 on TensorRT 7. I can convert the ONNX model to a TRT engine via trtexec, but the output of the MultiScaleDeformableAttn plugin is all zeros, just like here.

Did you ever solve this problem? I am running into the same issue.