NVIDIA-AI-IOT / CUDA-PointPillars

A project demonstrating how to use CUDA-PointPillars to process point cloud data from lidar.
Apache License 2.0

Mismatch in number of detections: TensorRT ONNX inference vs. PyTorch .pth file #83

Open Allamrahul opened 1 year ago

Allamrahul commented 1 year ago

Dataset: I am using a custom dataset with .npy point cloud files and annotations. I followed all the steps required for custom dataset preparation, and I get great results with PyTorch: 90% mAP on my eval set.

However, once I convert the .pth file to ONNX format using exporter.py, I see fewer detections for almost every point cloud in my eval dataset when running TensorRT inference with the C++ script than I get from PyTorch with the .pth file.

For the export, the script uses exporter.py and simplifier_onnx.py. However, both are hardcoded for the 3 classes of the KITTI dataset, and I have just one class to detect. I therefore referred to the following commits to make the ONNX export work: https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/pull/77/commits. After this, I was able to export, but then I ran into the following issue: https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/issues/82. I resolved it by tinkering with the export script, as described in the following comment: https://github.com/NVIDIA-AI-IOT/CUDA-PointPillars/pull/77#issuecomment-1424700901. After this, my TensorRT ONNX detections were at least a subset of what I was seeing with PyTorch, but not the whole set. There is a clear delta between the TensorRT ONNX results and the PyTorch .pth results in the majority of my eval set, as the following table shows:

Bounding box delta comparison: PyTorch .pth vs. TensorRT ONNX

File | PyTorch .pth | TensorRT C++ using .onnx file | Delta
--- | --- | --- | ---
000000.npy | tensor([[9.6498, 1.1609, 1.9397, 0.2856, 0.4898, 2.8947, 6.2814], [24.8358, 1.3459, 2.5912, 0.2332, 0.4984, 3.0438, 6.2827], [24.9936, -10.4810, 3.2429, 0.2568, 0.4702, 3.1647, 6.2816], [9.8542, -10.6894, 2.1888, 0.4316, 0.4553, 2.7412, 6.2486]], device='cuda:0') | 24.8358 1.34592 2.59117 0.23324 0.498444 3.04383 6.28266 0 0.46325 ; 24.9936 -10.481 3.24294 0.256755 0.47017 3.16474 6.28156 0 0.445165 ; 9.8573 -10.6925 2.17166 0.433223 0.452724 2.7258 6.24912 0 0.445157 | 1
000001.npy | tensor([[9.6501, 1.1778, 1.8507, 0.2533, 0.4935, 2.7208, 6.2741], [24.9947, -10.4883, 3.0557, 0.2706, 0.4838, 3.0915, 6.2594], [24.8404, 1.3479, 2.6033, 0.2287, 0.4947, 3.0391, 6.2825], [9.8570, -10.6883, 2.1663, 0.4322, 0.4521, 2.7124, 6.2346]], device='cuda:0') | 9.65337 1.1817 1.80798 0.248034 0.493837 2.66008 6.27361 0 ; 24.9947 -10.4883 3.05572 0.270619 0.483843 3.09145 6.25942 0 0.670895 ; 24.8404 1.34787 2.60326 0.228719 0.494724 3.03909 6.28252 0 0.459299 ; 9.8545 -10.6925 2.1472 0.438129 0.448904 2.7132 6.23376 0 0.424986 | 0
000002.npy | tensor([[9.6042, 1.1503, 2.0593, 0.2839, 0.4955, 2.9902, 6.3128], [24.7882, 1.3638, 2.6522, 0.2538, 0.5039, 3.1623, 6.2903], [9.7436, -10.6760, 2.1350, 0.3712, 0.4578, 2.6609, 6.2507], [24.9494, -10.5134, 3.2150, 0.2888, 0.4944, 3.3462, 6.2143]], device='cuda:0') | 9.74478 -10.6817 2.1041 0.374984 0.453993 2.63108 6.25019 0 0.532783 ; 24.9494 -10.5134 3.21504 0.288844 0.494413 3.34624 6.21432 0 0.515557 ; 0.309276 -10.6853 2.08503 0.458935 0.413923 3.13058 6.09365 0 0.412784 | 1
000003.npy | tensor([[9.5610, -10.4589, 2.1206, 0.4139, 0.4505, 2.7193, 6.2802], [24.3758, 1.7272, 2.6000, 0.2396, 0.4966, 3.0571, 6.1985], [24.7097, -10.1406, 3.0566, 0.2619, 0.4718, 3.0835, 6.2728], [9.2311, 1.3354, 1.8251, 0.2543, 0.4891, 2.7015, 6.2441], [8.9262, 7.8720, 2.1033, 0.3872, 0.4424, 2.7067, 6.3819]], device='cuda:0') | 9.56115 -10.4598 2.09798 0.418282 0.448642 2.68469 6.27597 0 0.735731 ; 24.3758 1.72724 2.59998 0.239596 0.496643 3.05714 6.19854 0 0.629267 ; 24.7097 -10.1406 3.0566 0.26186 0.471776 3.08349 6.27275 0 0.585723 ; 9.21606 1.33047 1.82858 0.254299 0.490583 2.66956 6.23728 0 0.471899 | 1
000004.npy | tensor([[6.4732, 2.6481, 1.7006, 0.2879, 0.4678, 2.6444, 6.3118], [21.4290, 4.8774, 2.5937, 0.2325, 0.5022, 3.1258, 6.4040], [23.1383, -6.8599, 2.7714, 0.2839, 0.4960, 3.0160, 6.3080], [8.1175, -8.9831, 2.2486, 0.3856, 0.4450, 2.7676, 6.3550]], device='cuda:0') | 23.1383 -6.85986 2.77142 0.283893 0.495966 3.01596 6.30801 0 0.580739 ; 8.11463 -8.9818 2.12152 0.396575 0.436063 2.65015 6.35895 0 0.429396 | 2
000005.npy | tensor([[5.5251, 2.7731, 1.6679, 0.3284, 0.4662, 2.6940, 6.2788], [20.4834, 5.0487, 2.5489, 0.2769, 0.5241, 3.1817, 6.4027], [7.3220, -8.8810, 2.1011, 0.4506, 0.4281, 2.6641, 6.3688], [22.2850, -6.6383, 2.6867, 0.2744, 0.4986, 3.0367, 6.3119]], device='cuda:0') | 7.32207 -8.88152 2.0861 0.445914 0.430497 2.6552 6.36896 0 0.696223 | 3
000006.npy | tensor([[18.0280, 4.9469, 2.4509, 0.3035, 0.5205, 3.1520, 6.3221], [19.8413, -6.7181, 2.7475, 0.3097, 0.5246, 3.2910, 6.3001], [3.1871, 2.6373, 1.7287, 0.4621, 0.4224, 2.9021, 6.3156], [4.8621, -8.9172, 1.8402, 0.4540, 0.3952, 2.5332, 6.3420], [32.0742, 7.1384, 3.3039, 0.2361, 0.4806, 3.3647, 6.4108], [21.2824, 12.1162, 3.6256, 0.2676, 0.4659, 3.5638, 6.5643], [0.6082, 4.4304, 1.8762, 0.4470, 0.4348, 3.4172, 6.2065]], device='cuda:0') | 4.85492 -8.92965 1.819 0.460386 0.396642 2.5153 6.34298 0 0.494817 | 6
000007.npy | tensor([[18.2038, -6.8837, 2.5308, 0.3099, 0.5277, 3.1208, 6.3168], [16.5025, 4.7925, 2.3577, 0.3065, 0.5248, 3.0787, 6.3005], [1.5735, 2.6487, 1.6249, 0.5034, 0.4109, 2.6605, 6.3160], [2.2250, 2.7058, 1.8312, 0.4703, 0.4060, 3.0384, 6.3380], [3.2350, -8.9478, 1.8462, 0.4438, 0.4085, 2.5771, 6.3109], [19.7396, 11.9755, 3.2925, 0.2890, 0.5000, 3.6453, 6.5671], [3.5311, 2.8095, 2.3147, 0.4571, 0.4455, 4.2559, 6.3274], [30.5054, 6.8140, 3.3753, 0.2804, 0.5016, 3.6093, 6.2777]], device='cuda:0') | 18.2057 -6.88499 2.4907 0.307031 0.527094 3.07328 6.31815 0 0.636754 ; 16.502 4.79033 2.33373 0.299566 0.523598 3.0561 6.3044 0 0.532995 ; 1.56738 2.64373 1.68283 0.506594 0.412098 2.66617 6.31967 0 0.51762 ; 3.22002 -8.95614 1.8366 0.449459 0.409571 2.56386 6.3068 0 0.431358 ; 2.2279 2.70934 1.85016 0.464891 0.40516 3.07841 6.33425 0 0.391239 ; 19.7397 11.9755 3.29258 0.28902 0.499917 3.64496 6.56848 0 0.381675 | 2
000008.npy | tensor([[8.7021, -7.9169, 2.6375, 0.3647, 0.4888, 3.5404, 6.2655], [7.7196, 3.7774, 2.3025, 0.4060, 0.4704, 3.2993, 6.2707], [22.8483, -6.6640, 3.5341, 0.3350, 0.5277, 4.1040, 6.3141], [21.7832, 5.1120, 2.8534, 0.2781, 0.5178, 3.2145, 6.1912], [3.2359, -8.4495, 2.0291, 0.4187, 0.4105, 3.2451, 6.2915]], device='cuda:0') | 8.70127 -7.92042 2.62612 0.36539 0.486129 3.51703 6.26476 0 0.864963 ; 7.6994 3.79393 2.24546 0.40736 0.469539 3.21603 6.25044 0 0.73586 ; 22.8483 -6.66398 3.53411 0.335008 0.527745 4.10398 6.31413 0 0.605781 ; 21.7832 5.11193 2.85462 0.278421 0.517415 3.21271 6.21329 0 0.508611 | 1
000009.npy | tensor([[19.5711, 4.7877, 2.6956, 0.3077, 0.5412, 3.3734, 6.2451], [6.3672, -8.0972, 2.7778, 0.4181, 0.4778, 4.1039, 6.2421], [5.4901, 3.6080, 2.3323, 0.4340, 0.4502, 3.7175, 6.2740], [20.3728, -7.0433, 3.3803, 0.3514, 0.5351, 4.1972, 6.3070], [26.6330, 11.8861, 3.9950, 0.3089, 0.5019, 4.1503, 6.6127]], device='cuda:0') | 5.47306 3.61103 2.394 0.432978 0.453338 3.80027 6.32163 0 0.714706 ; 19.5717 4.78751 2.71062 0.308163 0.539413 3.36241 6.27686 0 0.621834 ; 6.35329 -8.10289 2.76789 0.422266 0.47866 4.13415 6.24032 0 0.606208 | 2
000010.npy | tensor([[18.3196, 4.6323, 3.2815, 0.3700, 0.5370, 4.5950, 6.3164], [5.0913, -8.1561, 2.6470, 0.4329, 0.4667, 4.0704, 6.2747], [19.1831, -7.1906, 3.3499, 0.3578, 0.5279, 4.2080, 6.3127], [2.5482, 4.3696, 1.6065, 0.4281, 0.3918, 2.8003, 6.2634]], device='cuda:0') | 5.08485 -8.16716 2.64149 0.431825 0.466464 4.03816 6.27571 0 0.731938 ; 19.1846 -7.19002 3.2872 0.352221 0.529464 4.08496 6.31286 0 0.591408 | 2
000011.npy | tensor([[15.3577, -7.3005, 3.0413, 0.3812, 0.5104, 4.2909, 6.3159], [0.6093, 3.4074, 1.9033, 0.5056, 0.4306, 3.3583, 6.1790], [14.5397, 4.4909, 3.0513, 0.3723, 0.5222, 4.3821, 6.2383], [30.4700, -6.2796, 4.0225, 0.2914, 0.4843, 3.8403, 6.3179], [29.6795, 5.5980, 4.0535, 0.2816, 0.4877, 3.9741, 6.2869]], device='cuda:0') | 0.594493 3.41456 2.11992 0.502219 0.441799 3.74912 6.17387 0 0.828488 ; 15.3587 -7.29961 2.99875 0.375657 0.512654 4.18005 6.31556 0 0.798267 ; 30.47 -6.27963 4.02255 0.29143 0.484331 3.84032 6.31788 0 0.434042 | 2
000012.npy | tensor([[11.2944, 4.3980, 3.0133, 0.3911, 0.5198, 4.6365, 6.2670], [26.4576, 5.3648, 3.6263, 0.3002, 0.5062, 3.8833, 6.3176], [12.0963, -7.3715, 3.0630, 0.3846, 0.5122, 4.3017, 6.2922], [8.1463, -12.5014, 2.9129, 0.3691, 0.4980, 3.9686, 6.1562], [27.1433, -6.4810, 3.9175, 0.3048, 0.5110, 3.9699, 6.3372], [18.4373, 11.4960, 3.7129, 0.3159, 0.4918, 4.2750, 6.4670]], device='cuda:0') | 8.14566 -12.506 2.84502 0.364298 0.498799 3.84938 6.15557 0 0.378752 ; 12.0904 -7.37816 2.90519 0.378811 0.516209 4.00902 6.29017 0 0.376648 | 4
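
To see which boxes are actually missing rather than only counting them, each row of the table can be compared programmatically. Below is a minimal sketch using the values from the 000000.npy row. It assumes the first two numbers of every box are the x and y center coordinates; the helper name `unmatched_boxes` and the 0.5 matching radius are hypothetical choices for illustration, not part of the repository.

```python
import math


def unmatched_boxes(reference, candidate, max_center_dist=0.5):
    """Return reference boxes with no candidate box within max_center_dist.

    Assumption: index 0 and 1 of each box are the x and y center coordinates.
    Each candidate box can be matched to at most one reference box.
    """
    remaining = list(candidate)
    missing = []
    for ref in reference:
        best, best_dist = None, max_center_dist
        for cand in remaining:
            dist = math.hypot(ref[0] - cand[0], ref[1] - cand[1])
            if dist <= best_dist:
                best, best_dist = cand, dist
        if best is None:
            missing.append(ref)
        else:
            remaining.remove(best)
    return missing


# Values copied from the 000000.npy row of the table above.
pytorch_boxes = [
    (9.6498, 1.1609, 1.9397, 0.2856, 0.4898, 2.8947, 6.2814),
    (24.8358, 1.3459, 2.5912, 0.2332, 0.4984, 3.0438, 6.2827),
    (24.9936, -10.4810, 3.2429, 0.2568, 0.4702, 3.1647, 6.2816),
    (9.8542, -10.6894, 2.1888, 0.4316, 0.4553, 2.7412, 6.2486),
]
tensorrt_boxes = [
    (24.8358, 1.34592, 2.59117, 0.23324, 0.498444, 3.04383, 6.28266),
    (24.9936, -10.481, 3.24294, 0.256755, 0.47017, 3.16474, 6.28156),
    (9.8573, -10.6925, 2.17166, 0.433223, 0.452724, 2.7258, 6.24912),
]

missing = unmatched_boxes(pytorch_boxes, tensorrt_boxes)
print(len(missing), missing)
```

For this row, the only unmatched PyTorch box is the one centered near (9.65, 1.16), which agrees with the delta of 1 in the table.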

Please let me know if you know something that could help me.

Allamrahul commented 1 year ago

I see the same behavior with the KITTI dataset as well, as follows: [image] Can anyone confirm whether this is expected behavior, or is this not supposed to happen?

KwangjinChoi commented 1 year ago

Hello, can you tell me how much the 3D detection performance drops?

Allamrahul commented 1 year ago

Hi, as shown in my initial comment, the delta between PyTorch .pth and TensorRT inference is as large as 6 for 000006.npy. I have about 30 evaluation point clouds, and I see this drop in 90% of them. Is there anything I can do to avoid this?
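
For reference, the drop can be summarized from the Delta column of the table in the first comment. This is a small sketch over the 13 rows shown there only, not the full set of about 30 evaluation point clouds; the variable names are illustrative.

```python
# Delta column copied from the table in the first comment (13 rows shown).
deltas = {
    "000000.npy": 1, "000001.npy": 0, "000002.npy": 1, "000003.npy": 1,
    "000004.npy": 2, "000005.npy": 3, "000006.npy": 6, "000007.npy": 2,
    "000008.npy": 1, "000009.npy": 2, "000010.npy": 2, "000011.npy": 2,
    "000012.npy": 4,
}

affected = [name for name, delta in deltas.items() if delta > 0]
worst_file = max(deltas, key=deltas.get)
total_missing = sum(deltas.values())
affected_share = len(affected) / len(deltas)

print(f"{len(affected)} of {len(deltas)} files affected ({affected_share:.0%})")
print(f"worst file: {worst_file} (delta {deltas[worst_file]})")
print(f"total missing detections: {total_missing}")
```

On these rows, 12 of the 13 files lose at least one detection, which is in line with the roughly 90% figure for the full eval set.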

wangxj2014 commented 1 year ago

I also encountered the same problem. Is there any way to solve this problem?

Dreamdreams8 commented 4 months ago

The same problem. Has anyone solved it?