ceccocats / tkDNN

Deep neural network library and toolkit to do high performance inference on NVIDIA Jetson platforms
GNU General Public License v2.0
718 stars 209 forks

Error when converting mobilenetV2ssd trained on my own custom dataset to TensorRT #224

Open SokPhanith opened 3 years ago

SokPhanith commented 3 years ago

My model has 4 classes, and I trained it following this source: https://github.com/mive93/pytorch-ssd. A simple test on a Jetson Nano ran at around 12 fps. It output two folders, layers and debug, which I copied to the build folder. I then set classes = 5 in tests/mobilent/mobilenetv2ssd/mobilenetv2ssd.cpp and added my classes to src/MobilenetDetection.cpp like this:

  else if(classes == 5){
    const char *classesnames[] = { "a","b","c","d"};
    classesNames = std::vector<std::string>(classesnames, std::end(classesnames));
  }

Then I built again. When I run ./test_mobilenetv2ssd, I get this error:

New NETWORK (tkDNN v0.5, CUDNN v8)
Reading weights: I=3 O=32 KERNEL=3x3x1
Reading weights: I=1 O=32 KERNEL=3x3x1
Reading weights: I=32 O=16 KERNEL=1x1x1
Reading weights: I=16 O=96 KERNEL=1x1x1
Reading weights: I=1 O=96 KERNEL=3x3x1
Reading weights: I=96 O=24 KERNEL=1x1x1
Reading weights: I=24 O=144 KERNEL=1x1x1
Reading weights: I=1 O=144 KERNEL=3x3x1
Reading weights: I=144 O=24 KERNEL=1x1x1
Reading weights: I=24 O=144 KERNEL=1x1x1
Reading weights: I=1 O=144 KERNEL=3x3x1
Reading weights: I=144 O=32 KERNEL=1x1x1
Reading weights: I=32 O=192 KERNEL=1x1x1
Reading weights: I=1 O=192 KERNEL=3x3x1
Reading weights: I=192 O=32 KERNEL=1x1x1
Reading weights: I=32 O=192 KERNEL=1x1x1
Reading weights: I=1 O=192 KERNEL=3x3x1
Reading weights: I=192 O=32 KERNEL=1x1x1
Reading weights: I=32 O=192 KERNEL=1x1x1
Reading weights: I=1 O=192 KERNEL=3x3x1
Reading weights: I=192 O=64 KERNEL=1x1x1
Reading weights: I=64 O=384 KERNEL=1x1x1
Reading weights: I=1 O=384 KERNEL=3x3x1
Reading weights: I=384 O=64 KERNEL=1x1x1
Reading weights: I=64 O=384 KERNEL=1x1x1
Reading weights: I=1 O=384 KERNEL=3x3x1
Reading weights: I=384 O=64 KERNEL=1x1x1
Reading weights: I=64 O=384 KERNEL=1x1x1
Reading weights: I=1 O=384 KERNEL=3x3x1
Reading weights: I=384 O=64 KERNEL=1x1x1
Reading weights: I=64 O=384 KERNEL=1x1x1
Reading weights: I=1 O=384 KERNEL=3x3x1
Reading weights: I=384 O=96 KERNEL=1x1x1
Reading weights: I=96 O=576 KERNEL=1x1x1
Reading weights: I=1 O=576 KERNEL=3x3x1
Reading weights: I=576 O=96 KERNEL=1x1x1
Reading weights: I=96 O=576 KERNEL=1x1x1
Reading weights: I=1 O=576 KERNEL=3x3x1
Reading weights: I=576 O=96 KERNEL=1x1x1
Reading weights: I=96 O=576 KERNEL=1x1x1
Reading weights: I=1 O=576 KERNEL=3x3x1
Reading weights: I=576 O=160 KERNEL=1x1x1
Reading weights: I=160 O=960 KERNEL=1x1x1
Reading weights: I=1 O=960 KERNEL=3x3x1
Reading weights: I=960 O=160 KERNEL=1x1x1
Reading weights: I=160 O=960 KERNEL=1x1x1
Reading weights: I=1 O=960 KERNEL=3x3x1
Reading weights: I=960 O=160 KERNEL=1x1x1
Reading weights: I=160 O=960 KERNEL=1x1x1
Reading weights: I=1 O=960 KERNEL=3x3x1
Reading weights: I=960 O=320 KERNEL=1x1x1
Reading weights: I=320 O=1280 KERNEL=1x1x1
Reading weights: I=1280 O=256 KERNEL=1x1x1
Reading weights: I=1 O=256 KERNEL=3x3x1
Reading weights: I=256 O=512 KERNEL=1x1x1
Reading weights: I=512 O=128 KERNEL=1x1x1
Reading weights: I=1 O=128 KERNEL=3x3x1
Reading weights: I=128 O=256 KERNEL=1x1x1
Reading weights: I=256 O=128 KERNEL=1x1x1
Reading weights: I=1 O=128 KERNEL=3x3x1
Reading weights: I=128 O=256 KERNEL=1x1x1
Reading weights: I=256 O=64 KERNEL=1x1x1
Reading weights: I=1 O=64 KERNEL=3x3x1
Reading weights: I=64 O=64 KERNEL=1x1x1
Reading weights: I=1 O=576 KERNEL=3x3x1
Reading weights: I=576 O=126 KERNEL=1x1x1
Error reading file mobilenetv2ssd/layers/classification_headers-0-3.bin with n of float: 72576 seek: 0 size: 290304
/home/phanith/tkDNN/src/utils.cpp:58 Aborting...

daynauth commented 3 years ago

Go to line 322 of mobilenetv2ssd.cpp, where you will find this block of code:

// classification header 0
  tk::dnn::Layer *header_0[1] = {&relu_14_1};
  tk::dnn::Route rout_ch_0(&net, header_0, 1);
  tk::dnn::Conv2d ch_0_conv1(&net, 576, 3, 3, 1, 1, 1, 1, classification_header0[0], true, false, 576, true);
  tk::dnn::Activation ch_relu_0_1(&net, CUDNN_ACTIVATION_CLIPPED_RELU, 6);
  tk::dnn::Conv2d ch_0_conv2(&net, 126, 1, 1, 1, 1, 0, 0, classification_header0[1], false);
  tk::dnn::Layer *conf0[1] = {&ch_0_conv2};

Change it to

// classification header 0
  tk::dnn::Layer *header_0[1] = {&relu_14_1};
  tk::dnn::Route rout_ch_0(&net, header_0, 1);
  tk::dnn::Conv2d ch_0_conv1(&net, 576, 3, 3, 1, 1, 1, 1, classification_header0[0], true, false, 576, true);
  tk::dnn::Activation ch_relu_0_1(&net, CUDNN_ACTIVATION_CLIPPED_RELU, 6);
  tk::dnn::Conv2d ch_0_conv2(&net, 6 * classes, 1, 1, 1, 1, 0, 0, classification_header0[1], false); //change 126 to 6 x classes
  tk::dnn::Layer *conf0[1] = {&ch_0_conv2};

Do that for the next 4 blocks
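The numbers in this fix can be checked against the error message. What follows is a minimal sketch, assuming the 6 in 6 * classes is the number of anchor boxes per feature-map cell and that the hard-coded 126 came from a 21-class model (6 x 21 = 126). The helpers header_outputs and conv_weight_count are hypothetical names used for illustration; they are not tkDNN functions.

```cpp
#include <cstddef>

// Hypothetical helpers for illustration only; not part of tkDNN.

// Output channels of one SSD classification header:
// one score per class for every anchor box (assumed: 6 anchors per cell).
constexpr std::size_t header_outputs(std::size_t anchors, std::size_t classes) {
    return anchors * classes;
}

// Number of weight floats in a convolution, bias not counted.
constexpr std::size_t conv_weight_count(std::size_t in_channels,
                                        std::size_t out_channels,
                                        std::size_t kernel_h,
                                        std::size_t kernel_w) {
    return in_channels * out_channels * kernel_h * kernel_w;
}

// The hard-coded 126 equals 6 anchors x 21 classes.
static_assert(header_outputs(6, 21) == 126, "126 = 6 x 21");

// 576 inputs x 126 outputs x 1x1 kernel gives the "n of float: 72576"
// reported in the error message.
static_assert(conv_weight_count(576, 126, 1, 1) == 72576, "matches the log");

// With classes = 5 the same layer has 30 outputs and 17280 weights.
static_assert(conv_weight_count(576, header_outputs(6, 5), 1, 1) == 17280,
              "5-class header");
```

Under these assumptions, a 5-class export holds 576 x 30 = 17280 weights for this layer, which is presumably why the attempt to read 72576 floats from classification_headers-0-3.bin failed before the change.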

SokPhanith commented 3 years ago

Thank you, @daynauth. I can now convert it to a TensorRT engine, but the detection results are not good. The results look like this:

fp32

| [ 0 ]: 1.5204 1.54345 | [ 2 ]: 1.87475 1.71696 | [ 3 ]: -1.00988 -0.880602 | [ 4 ]: 1.60527 1.77853 | [ 5 ]: -0.931939 -0.720109 | [ 6 ]: 1.71499 1.59624 | [ 7 ]: -0.768118 -0.959083 | [ 8 ]: 1.04617 1.27626 | [ 9 ]: -1.49756 -1.43087 | Wrongs: 10 ~0.02

| [ 0 ]: -0.229185 0.0469791 | [ 1 ]: -0.0839813 0.0822884 | [ 3 ]: 0.011675 -0.0248753 | [ 4 ]: -0.475023 -0.607493 | [ 5 ]: 0.951046 0.775346 | [ 6 ]: 0.272973 -0.0831814 | [ 7 ]: -0.949614 -0.753534 | [ 8 ]: -0.63458 -0.810912 | [ 9 ]: -0.128321 -0.210743 | Wrongs: 23 ~0.02

TRT vs correct

| OK ~0.02 | OK ~0.02

CUDNN vs TRT

| [ 0 ]: 1.5204 1.54342 | [ 2 ]: 1.87475 1.717 | [ 3 ]: -1.00988 -0.880599 | [ 4 ]: 1.60527 1.77856 | [ 5 ]: -0.931939 -0.720109 | [ 6 ]: 1.71499 1.59627 | [ 7 ]: -0.768118 -0.959077 | [ 8 ]: 1.04617 1.27621 | [ 9 ]: -1.49756 -1.4309 | Wrongs: 10 ~0.02

| [ 0 ]: -0.229185 0.0469934 | [ 1 ]: -0.0839813 0.0822947 | [ 3 ]: 0.011675 -0.0249243 | [ 4 ]: -0.475023 -0.607436 | [ 5 ]: 0.951046 0.775358 | [ 6 ]: 0.272973 -0.08317 | [ 7 ]: -0.949614 -0.753506 | [ 8 ]: -0.63458 -0.810901 | [ 9 ]: -0.128321 -0.21073 | Wrongs: 23 ~0.02

Confidence CUDNN
0.968103 0.983429 0.976511 0.964841 0.972451 0.958528 0.946463 0.975324 0.960998 0.960323 0.943285 0.959839 0.960271 0.972455 0.963304 0.954913 0.949077 0.962999 0.963978 0.977672 0.964372 0.95889 0.956088 0.965915 0.962634 0.974817 0.963763 0.956963 0.952936 0.96475 0.962536 0.976747 0.964135 0.958629 0.954504 0.964761 0.962134 0.975759 0.963472 0.958036 0.953169 0.964874 0.963055 0.976195 0.96415 0.958045 0.953692 0.964858 0.963428 0.976473 0.964418 0.958382 0.953735 0.965465 0.964098 0.976559 0.964723 0.958736 0.954005 0.965941 0.965202 0.976882 0.965352 0.959038
Locations CUDNN
0.596325 -0.21843 -0.331537 -0.595714 0.314037 0.10484 0.128811 -0.422746 0.322055 -0.107347 -0.413898 0.408575 0.435567 -0.170629 0.313 0.108882 -0.343173 0.352839 -0.515181 -0.120012 -0.331135 0.287089 -0.187266 -0.472478 0.0271642 0.0118162 -0.415585 -0.567555 -0.100072 0.364114 0.15811 -0.484276 0.0284767 0.0145648 0.117571 0.284106 0.281017 -0.191696 -0.196008 -0.131639 -0.420112 0.308882 -0.382679 -0.0109391 0.041557 0.314878 0.0364166 -0.611701 0.137263 -0.126949 -0.34802 -0.47309 -0.225359 0.307527 0.28922 -0.405273 0.137618 -0.142535 0.0690397 0.387937 0.231679 -0.288088 -0.0269699 -0.0478161

Confidence tensorRT
0.709769 0.773645 0.707555 0.723442 0.751323 0.655424 0.65149 0.725766 0.645195 0.683386 0.702866 0.627929 0.65181 0.691365 0.652505 0.668489 0.67884 0.654254 0.681326 0.696739 0.681607 0.704441 0.696581 0.688985 0.700019 0.700529 0.680472 0.71342 0.686722 0.684848 0.713161 0.710266 0.686678 0.725065 0.700927 0.688647 0.718197 0.716066 0.693542 0.731514 0.71132 0.695148 0.712736 0.716982 0.69969 0.727322 0.708541 0.694716 0.680874 0.694499 0.669404 0.707404 0.678 0.666091 0.663432 0.694448 0.671 0.68643 0.683284 0.65518 0.676395 0.6988 0.671485 0.687283
Locations tensorRT
0.54878 0.712039 0.948913 -1.65793 0.228692 0.0525467 -0.227251 0.294692 1.32709 0.482004 1.51285 -0.289476 -0.0345718 0.467677 1.00579 -0.108315 -0.00655457 0.431376 0.502219 0.105728 1.52518 1.41894 -0.385961 0.446796 1.17333 2.37723 1.13329 -0.782319 -0.0317853 0.834018 -1.15161 0.139532 0.906155 0.866462 2.06094 -1.07161 -1.04482 0.842816 -0.799282 -0.223865 0.168965 0.409411 1.10863 -0.806674 2.84671 -1.00001 -0.512979 0.820814 0.912113 1.44146 0.689703 -1.13382 -0.0690976 -0.338278 -0.370342 -0.0790475 -0.127042 -0.113491 0.916889 -0.861763 -0.162219 0.133995 -0.611345 -0.349006

CUDNN vs TRT

| [ 0 ]: 0.968103 0.709769 | [ 1 ]: 0.983429 0.773645 | [ 2 ]: 0.976511 0.707555 | [ 3 ]: 0.964841 0.723442 | [ 4 ]: 0.972451 0.751323 | [ 5 ]: 0.958528 0.655424 | [ 6 ]: 0.946463 0.65149 | [ 7 ]: 0.975324 0.725766 | [ 8 ]: 0.960998 0.645195 | Wrongs: 5988 ~0.02

| [ 0 ]: 0.596325 0.54878 | [ 1 ]: -0.21843 0.712039 | [ 2 ]: -0.331537 0.948913 | [ 3 ]: -0.595714 -1.65793 | [ 4 ]: 0.314037 0.228692 | [ 5 ]: 0.10484 0.0525467 | [ 6 ]: 0.128811 -0.227251 | [ 7 ]: -0.422746 0.294692 | [ 8 ]: 0.322055 1.32709 | Wrongs: 11711 ~0.02

fp16

==== RESNET CHECK RESULTS ===

CUDNN vs correct

| [ 0 ]: 1.5204 1.54345 | [ 2 ]: 1.87475 1.71696 | [ 3 ]: -1.00988 -0.880602 | [ 4 ]: 1.60527 1.77853 | [ 5 ]: -0.931939 -0.720109 | [ 6 ]: 1.71499 1.59624 | [ 7 ]: -0.768118 -0.959083 | [ 8 ]: 1.04617 1.27626 | [ 9 ]: -1.49756 -1.43087 | Wrongs: 10 ~0.02

| [ 0 ]: -0.229185 0.0469791 | [ 1 ]: -0.0839813 0.0822884 | [ 3 ]: 0.011675 -0.0248753 | [ 4 ]: -0.475023 -0.607493 | [ 5 ]: 0.951046 0.775346 | [ 6 ]: 0.272973 -0.0831814 | [ 7 ]: -0.949614 -0.753534 | [ 8 ]: -0.63458 -0.810912 | [ 9 ]: -0.128321 -0.210743 | Wrongs: 23 ~0.02

TRT vs correct

| [ 0 ]: 0 1.54345 | [ 1 ]: 0 -1.09417 | [ 2 ]: 0 1.71696 | [ 3 ]: 0 -0.880602 | [ 4 ]: 0 1.77853 | [ 5 ]: 0 -0.720109 | [ 6 ]: 0 1.59624 | [ 7 ]: 0 -0.959083 | [ 8 ]: 0 1.27626 | Wrongs: 12 ~0.02

| [ 0 ]: 0 0.0469791 | [ 1 ]: 0 0.0822884 | [ 2 ]: 0 0.871726 | [ 3 ]: 0 -0.0248753 | [ 4 ]: 0 -0.607493 | [ 5 ]: 0 0.775346 | [ 6 ]: 0 -0.0831814 | [ 7 ]: 0 -0.753534 | [ 8 ]: 0 -0.810912 | Wrongs: 24 ~0.02

CUDNN vs TRT

| [ 0 ]: 1.5204 0 | [ 1 ]: -1.07767 0 | [ 2 ]: 1.87475 0 | [ 3 ]: -1.00988 0 | [ 4 ]: 1.60527 0 | [ 5 ]: -0.931939 0 | [ 6 ]: 1.71499 0 | [ 7 ]: -0.768118 0 | [ 8 ]: 1.04617 0 | Wrongs: 12 ~0.02

| [ 0 ]: -0.229185 0 | [ 1 ]: -0.0839813 0 | [ 2 ]: 0.890979 0 | [ 4 ]: -0.475023 0 | [ 5 ]: 0.951046 0 | [ 6 ]: 0.272973 0 | [ 7 ]: -0.949614 0 | [ 8 ]: -0.63458 0 | [ 9 ]: -0.128321 0 | Wrongs: 23 ~0.02

Confidence CUDNN
0.968103 0.983429 0.976511 0.964841 0.972451 0.958528 0.946463 0.975324 0.960998 0.960323 0.943285 0.959839 0.960271 0.972455 0.963304 0.954913 0.949077 0.962999 0.963978 0.977672 0.964372 0.95889 0.956088 0.965915 0.962634 0.974817 0.963763 0.956963 0.952936 0.96475 0.962536 0.976747 0.964135 0.958629 0.954504 0.964761 0.962134 0.975759 0.963472 0.958036 0.953169 0.964874 0.963055 0.976195 0.96415 0.958045 0.953692 0.964858 0.963428 0.976473 0.964418 0.958382 0.953735 0.965465 0.964098 0.976559 0.964723 0.958736 0.954005 0.965941 0.965202 0.976882 0.965352 0.959038
Locations CUDNN
0.596325 -0.21843 -0.331537 -0.595714 0.314037 0.10484 0.128811 -0.422746 0.322055 -0.107347 -0.413898 0.408575 0.435567 -0.170629 0.313 0.108882 -0.343173 0.352839 -0.515181 -0.120012 -0.331135 0.287089 -0.187266 -0.472478 0.0271642 0.0118162 -0.415585 -0.567555 -0.100072 0.364114 0.15811 -0.484276 0.0284767 0.0145648 0.117571 0.284106 0.281017 -0.191696 -0.196008 -0.131639 -0.420112 0.308882 -0.382679 -0.0109391 0.041557 0.314878 0.0364166 -0.611701 0.137263 -0.126949 -0.34802 -0.47309 -0.225359 0.307527 0.28922 -0.405273 0.137618 -0.142535 0.0690397 0.387937 0.231679 -0.288088 -0.0269699 -0.0478161

Confidence tensorRT
nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5 0.5 0.5 nan 0.5 0.5 0.5
Locations tensorRT
0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0

CUDNN vs TRT

| [ 0 ]: 0.968103 nan | [ 1 ]: 0.983429 0.5 | [ 2 ]: 0.976511 0.5 | [ 3 ]: 0.964841 0.5 | [ 4 ]: 0.972451 0.5 | [ 5 ]: 0.958528 0.5 | [ 6 ]: 0.946463 nan | [ 7 ]: 0.975324 0.5 | [ 8 ]: 0.960998 0.5 | Wrongs: 6000 ~0.02

| [ 0 ]: 0.596325 0 | [ 1 ]: -0.21843 0 | [ 2 ]: -0.331537 0 | [ 3 ]: -0.595714 0 | [ 4 ]: 0.314037 0 | [ 5 ]: 0.10484 0 | [ 6 ]: 0.128811 0 | [ 7 ]: -0.422746 0 | [ 8 ]: 0.322055 0 | Wrongs: 11352 ~0.02

When I run the demo with a high threshold, ./demo mobilenetv2ssd_fp32.rt cnc_drive.mp4 m 1 1 1 0.5 or ./demo mobilenetv2ssd_fp16.rt cnc_drive.mp4 m 1 1 1 0.5, there are a lot of bounding boxes on my detections, both on the video and live from a webcam.

(Screenshot from 2021-05-08 22-08-48)
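The "Wrongs: N ~0.02" lines in the logs above can be read as a count of output elements that differ between two runs by more than a tolerance. The following is a minimal sketch of such a comparison, assuming "~0.02" denotes an absolute tolerance. The function count_mismatches is a hypothetical helper written for illustration and is not tkDNN's own check routine. NaN values, like those in the fp16 confidence output, are counted as mismatches explicitly, because a plain greater-than comparison against NaN is always false.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper for illustration only; not part of tkDNN.
// Counts positions where two output vectors differ by more than `tolerance`
// (assumed here to be an absolute tolerance), treating NaN as a mismatch.
std::size_t count_mismatches(const std::vector<float>& reference,
                             const std::vector<float>& candidate,
                             float tolerance = 0.02f) {
    const std::size_t common = std::min(reference.size(), candidate.size());
    // Elements present in only one of the vectors count as mismatches.
    std::size_t wrongs = std::max(reference.size(), candidate.size()) - common;
    for (std::size_t i = 0; i < common; ++i) {
        const float diff = std::fabs(reference[i] - candidate[i]);
        // NaN compares false with everything, so test for it explicitly.
        if (std::isnan(diff) || diff > tolerance) {
            ++wrongs;
        }
    }
    return wrongs;
}
```

Called with the first three fp16 confidence values from the log, {0.968103, 0.983429, 0.976511} for CUDNN and {nan, 0.5, 0.5} for TensorRT, every position counts as a mismatch.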