Closed perseusdg closed 3 years ago
Hi,
nice work!
I have tested on an NVIDIA Xavier and everything seems to be working nicely.
Can you also run the tests on Windows?
You need to download the COCO test dataset (for INT8 calibration):
bash scripts/download_validation.sh COCO
Then run this test script:
bash ./scripts/test_all_tests.sh
with line 45 uncommented to check all the inference precisions:
modes=( 1 2 3 ) # FP32, FP16 and INT8
With FP16 and INT8, if you get a TENSORRT ERROR it means that the result is not strictly the same as the ground truth, which is normal at lower precision.
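To make the "not strictly the same" part concrete, here is a minimal sketch of an element-wise tolerance check. It assumes the `~0.02` printed next to `Wrongs:` in the test output is an absolute tolerance; `countMismatches` and its parameters are hypothetical names for illustration, not tkDNN's actual implementation.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <cstdio>
#include <vector>

// Hypothetical helper (not tkDNN's real check): counts the elements of
// `result` that differ from `reference` by more than `tolerance`, and
// prints the first `maxPrinted` offending pairs.
std::size_t countMismatches(const std::vector<float>& result,
                            const std::vector<float>& reference,
                            float tolerance = 0.02f,
                            std::size_t maxPrinted = 9) {
    std::size_t wrongs = 0;
    // Compare only the range both buffers have in common.
    const std::size_t n = std::min(result.size(), reference.size());
    for (std::size_t i = 0; i < n; ++i) {
        if (std::fabs(result[i] - reference[i]) > tolerance) {
            if (wrongs < maxPrinted)
                std::printf("| [ %zu ]: %g %g\n", i, result[i], reference[i]);
            ++wrongs;
        }
    }
    return wrongs;
}
```

With a check like this, a lower-precision run can report many "wrong" elements even though each one is only slightly off, so the count alone does not tell you whether the detections are still usable.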
I ran test_mobilenetv2ssd on both the master branch and the pull request. I believe the fatal_error is a result of the differences between the ground truth and TRT, and between TRT and cuDNN. I have attached the result below.
====== CUDNN inference ======
Data dim: 1 3 300 300 1
Data dim: 1 3000 1 4 1
===== TENSORRT inference ====
Data dim: 1 3 300 300 1
Data dim: 1 3000 1 4 1
==== RESNET CHECK RESULTS ===
CUDNN vs correct
| OK ~0.02
| OK ~0.02
TRT vs correct
| [ 0 ]: 1.13652 0.26275
| [ 1 ]: -0.723999 -0.604409
| [ 2 ]: -2.61438 -2.63829
| [ 3 ]: 0.786822 1.02715
| [ 4 ]: -2.5647 -2.2925
| [ 5 ]: -0.331258 -0.129529
| [ 6 ]: -0.678444 -0.582798
| [ 7 ]: 0.362422 0.243937
| [ 8 ]: 5.70219 5.92325
| Wrongs: 125 ~0.02
| [ 0 ]: 0.285774 0.212167
| [ 2 ]: -0.58573 -0.744845
| [ 3 ]: -0.320327 -0.396461
| [ 4 ]: 0.276414 0.207184
| [ 6 ]: -0.805042 -0.957715
| [ 7 ]: -0.55411 -0.630006
| [ 8 ]: 0.263108 0.195007
| [ 10 ]: -0.837736 -1.00465
| [ 11 ]: 1.39087 1.32298
| Wrongs: 18 ~0.02
CUDNN vs TRT
| [ 0 ]: 0.26275 1.13652
| [ 1 ]: -0.604407 -0.723999
| [ 2 ]: -2.63829 -2.61438
| [ 3 ]: 1.02715 0.786822
| [ 4 ]: -2.2925 -2.5647
| [ 5 ]: -0.129528 -0.331258
| [ 6 ]: -0.582797 -0.678444
| [ 7 ]: 0.243937 0.362422
| [ 8 ]: 5.92325 5.70219
| Wrongs: 125 ~0.02
| [ 0 ]: 0.212167 0.285774
| [ 2 ]: -0.744845 -0.58573
| [ 3 ]: -0.396461 -0.320327
| [ 4 ]: 0.207184 0.276414
| [ 6 ]: -0.957715 -0.805042
| [ 7 ]: -0.630006 -0.55411
| [ 8 ]: 0.195007 0.263108
| [ 10 ]: -1.00465 -0.837736
| [ 11 ]: 1.32298 1.39087
| Wrongs: 18 ~0.02
---------------------------------------------------
Confidence CUDNN
0.977479 0.986367 0.97799 0.971273 0.974232 0.966501 0.971578 0.978977 0.974027 0.966054 0.970724 0.965549 0.971925 0.977542 0.9728 0.96913 0.968801 0.970229 0.973139 0.977897 0.973712 0.972364 0.969004 0.973687 0.97271 0.977654 0.973859 0.972518 0.969997 0.974779 0.971348 0.97871 0.974254 0.969607 0.970592 0.973098 0.968411 0.974884 0.970043 0.965388 0.96499 0.969276 0.972695 0.979673 0.974703 0.970112 0.970413 0.972816 0.97409 0.98164 0.974896 0.972365 0.970575 0.973586 0.972991 0.979948 0.974937 0.970246 0.970642 0.972096 0.967579 0.974887 0.970754 0.964808
Locations CUDNN
0.895468 1.02586 -3.26465 -1.48142 0.571279 0.503577 -1.00306 -0.205038 1.10684 0.168612 -6.14164 -1.78341 0.977233 1.70472 -3.54303 -2.92774 1.22893 0.0161056 -6.67807 -1.96537 0.632564 1.90412 -3.56912 -3.54311 0.355613 1.31305 -1.31578 -1.24236 0.740068 0.9646 -0.640562 -0.77711 0.988481 0.883645 -2.83893 -0.885611 -0.0565745 1.88677 -3.79773 -3.65738 1.09986 0.707278 -3.49885 -0.932392 -0.522438 1.9762 -3.57368 -4.03056 0.246989 1.50757 -0.316974 -1.59678 0.309205 1.23061 -0.0445376 -1.21936 0.477111 1.56142 -0.727183 -0.659404 0.112813 1.7245 -2.30625 -4.02627
---------------------------------------------------
Confidence tensorRT
0.975285 0.985188 0.976798 0.967267 0.97316 0.962462 0.970714 0.97868 0.97349 0.964915 0.970325 0.964382 0.970723 0.9759 0.971986 0.966283 0.968603 0.96712 0.973128 0.977473 0.974416 0.970923 0.970486 0.971926 0.97122 0.97548 0.972259 0.969485 0.968933 0.971035 0.971303 0.977997 0.973513 0.968757 0.96987 0.970581 0.971378 0.977987 0.972374 0.968555 0.96753 0.970285 0.976098 0.983278 0.977583 0.973809 0.973688 0.975522 0.977194 0.984369 0.977863 0.975222 0.97441 0.976219 0.976756 0.982267 0.977701 0.973338 0.974006 0.974096 0.97184 0.978243 0.974404 0.968246
Locations tensorRT
0.777498 1.04938 -3.02812 -1.38129 0.356407 0.499822 -1.15978 -0.138548 0.885423 0.423159 -5.3693 -1.81432 0.86543 1.63186 -3.15014 -2.73491 0.963911 0.195935 -5.65607 -1.83503 0.617409 1.79146 -3.1489 -3.38831 0.181724 1.29949 -1.74323 -1.32329 0.553508 0.916589 -0.672828 -0.59451 0.897496 0.805475 -3.33593 -1.02396 -0.33048 1.88618 -3.68143 -3.40749 1.01278 0.548814 -3.77655 -1.00644 -0.707318 2.00011 -3.4136 -3.77147 0.336892 1.38483 -0.523918 -1.35259 0.304683 1.10269 -0.145261 -0.930751 0.495511 1.32969 -1.04086 -0.517759 0.196518 1.75169 -2.70762 -3.82106
---------------------------------------------------
CUDNN vs TRT
| [ 342 ]: 0.963264 0.983403
| [ 347 ]: 0.953026 0.974363
| [ 2940 ]: 0.0357078 0.107297
| [ 2943 ]: 0.412983 0.694689
| [ 26940 ]: 0.829936 0.708185
| [ 26941 ]: 0.869318 0.797475
| [ 26943 ]: 0.506431 0.244212
| [ 26994 ]: 0.826695 0.757312
| [ 26995 ]: 0.811752 0.73976
| Wrongs: 21 ~0.02
| [ 0 ]: 0.895468 0.777498
| [ 1 ]: 1.02586 1.04938
| [ 2 ]: -3.26465 -3.02812
| [ 3 ]: -1.48142 -1.38129
| [ 4 ]: 0.571279 0.356407
| [ 6 ]: -1.00306 -1.15978
| [ 7 ]: -0.205038 -0.138548
| [ 8 ]: 1.10684 0.885423
| [ 9 ]: 0.168612 0.423159
| Wrongs: 9800 ~0.02
Also, I tested this on Windows before creating the pull request and had the same issue with MobileNet running in INT8 mode. I assumed it was fine because of the difference between the result and the ground truth in INT8 mode. I am unable to test FP16 as my GPU doesn't support it.
Hi, I'm compiling on an old TensorRT and I get an error on this line: https://github.com/ceccocats/tkDNN/blob/a638592fc74668471e87ac930b4695ce99dc7d43/src/NetworkRT.cpp#L143
Is the shared pointer necessary? Without it, it works fine on Linux.
No, the shared pointer isn't necessary. I thought I had removed all instances of shared pointers and unique pointers that I had created; I guess I must have missed this one place. Should I create a new pull request to undo this shared pointer?
Fixed in ba8199a03088b7c8e36066a1636a1237ab316cec
Added support for Windows 10. Tested the final code on Windows using MSVC 16.7 and on Linux using GCC 9.3.