mlcommons / inference_results_v0.5

This repository contains the results and code for the MLPerf™ Inference v0.5 benchmark.
https://mlcommons.org/en/inference-datacenter-05/
Apache License 2.0

Failed to create Engine for SSD-Mobilenet on DLA core #34

Closed. sandip761 closed this issue 4 years ago.

sandip761 commented 4 years ago

python3: ../rtExt/dla/native/dlaExecuteRunner.cpp:123: virtual void nvinfer1::rt::dla::DLANativeRunner::allocateResources(const nvinfer1::rt::CommonContext&): Assertion `context.dlaContext->getDLACore() >= 0 && context.dlaContext->getDLACore() < mCore.numEngines()' failed.
Traceback (most recent call last):
  File "code/main.py", line 328, in <module>
    main()
  File "code/main.py", line 318, in main
    launch_handle_generate_engine(benchmark_name, benchmark_conf, need_gpu, need_dla)
  File "code/main.py", line 80, in launch_handle_generate_engine
    raise RuntimeError("Building engines failed!")
RuntimeError: Building engines failed!
Makefile:298: recipe for target 'generate_engines' failed
make: *** [generate_engines] Error 1
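The assertion here fires because the DLA core index handed to TensorRT is out of range for the device: getDLACore() must be at least 0 and smaller than the number of DLA engines the runtime sees. Before regenerating engines it is worth confirming how many DLA cores TensorRT actually reports on the board. A minimal C++ sketch, assuming TensorRT 6.x as shipped with JetPack on Xavier; the Logger class is illustrative, not code from this repo:

#include <NvInfer.h>
#include <cstdio>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::printf("[TRT] %s\n", msg);
    }
};

int main() {
    Logger logger;
    nvinfer1::IBuilder* builder = nvinfer1::createInferBuilder(logger);
    // Requesting a core index >= this count trips the
    // getDLACore() < mCore.numEngines() assertion seen above.
    std::printf("DLA cores visible to TensorRT: %d\n", builder->getNbDLACores());
    builder->destroy();
    return 0;
}

Xavier AGX exposes two DLA cores (indices 0 and 1); if this prints 0, the DLA variants of the benchmark cannot be built on that platform and only core indices below the printed count are valid.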

nvpohanh commented 4 years ago

@sandip761 Could you try the suggested solution here: https://github.com/mlperf/inference_results_v0.5/issues/23#issuecomment-597358583 ? Let me know if that doesn't work. Thanks

sandip761 commented 4 years ago

I am now able to create the engine for the DLA core. After this, while running the performance test, I got the following error:

/usr/src/tensorrt/bin/trtexec --int8 --loadEngine=build/engines/Xavier/ssd-small/MultiStream/ssd-small-MultiStream-dla-b8-int8.plan --batch=4 --iterations=1 --useDLACore=0 --allowGPUFallback --plugins=build/plugins/NMSOptPlugin/libnmsoptplugin.so
[04/25/2020-16:46:26] [I] === Model Options ===
[04/25/2020-16:46:26] [I] Format: *
[04/25/2020-16:46:26] [I] Model:
[04/25/2020-16:46:26] [I] Output:
[04/25/2020-16:46:26] [I] === Build Options ===
[04/25/2020-16:46:26] [I] Max batch: 4
[04/25/2020-16:46:26] [I] Workspace: 16 MB
[04/25/2020-16:46:26] [I] minTiming: 1
[04/25/2020-16:46:26] [I] avgTiming: 8
[04/25/2020-16:46:26] [I] Precision: INT8
[04/25/2020-16:46:26] [I] Calibration: Dynamic
[04/25/2020-16:46:26] [I] Safe mode: Disabled
[04/25/2020-16:46:26] [I] Save engine:
[04/25/2020-16:46:26] [I] Load engine: build/engines/Xavier/ssd-small/MultiStream/ssd-small-MultiStream-dla-b8-int8.plan
[04/25/2020-16:46:26] [I] Inputs format: fp32:CHW
[04/25/2020-16:46:26] [I] Outputs format: fp32:CHW
[04/25/2020-16:46:26] [I] Input build shapes: model
[04/25/2020-16:46:26] [I] === System Options ===
[04/25/2020-16:46:26] [I] Device: 0
[04/25/2020-16:46:26] [I] DLACore: 0(With GPU fallback)
[04/25/2020-16:46:26] [I] Plugins: build/plugins/NMSOptPlugin/libnmsoptplugin.so
[04/25/2020-16:46:26] [I] === Inference Options ===
[04/25/2020-16:46:26] [I] Batch: 4
[04/25/2020-16:46:26] [I] Iterations: 1 (200 ms warm up)
[04/25/2020-16:46:26] [I] Duration: 10s
[04/25/2020-16:46:26] [I] Sleep time: 0ms
[04/25/2020-16:46:26] [I] Streams: 1
[04/25/2020-16:46:26] [I] Spin-wait: Disabled
[04/25/2020-16:46:26] [I] Multithreading: Enabled
[04/25/2020-16:46:26] [I] CUDA Graph: Disabled
[04/25/2020-16:46:26] [I] Skip inference: Disabled
[04/25/2020-16:46:26] [I] Input inference shapes: model
[04/25/2020-16:46:26] [I] === Reporting Options ===
[04/25/2020-16:46:26] [I] Verbose: Disabled
[04/25/2020-16:46:26] [I] Averages: 10 inferences
[04/25/2020-16:46:26] [I] Percentile: 99
[04/25/2020-16:46:26] [I] Dump output: Disabled
[04/25/2020-16:46:26] [I] Profile: Disabled
[04/25/2020-16:46:26] [I] Export timing to JSON file:
[04/25/2020-16:46:26] [I] Export profile to JSON file:
[04/25/2020-16:46:26] [I]
[04/25/2020-16:46:26] [I] Loading supplied plugin library: build/plugins/NMSOptPlugin/libnmsoptplugin.so
NVMEDIA_DLA : 308, ERROR: setInputTensorDesc failed
NVMEDIA_DLA : 446, ERROR: SetInputTensorDesc failed for tensor: 7. status: 0x0.
NVMEDIA_DLA : 1914, ERROR: BindInputTensorArgs failed (Input). status: 0x7.
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaUtils.cpp (212) - DLA Error in submit: 7 (Failure to submit program to DLA engine.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
NVMEDIA_DLA : 1599, ERROR: registerTensorBuffer failed
[04/25/2020-16:46:30] [E] [TRT] ../rtExt/dla/native/dlaExecuteRunner.cpp (71) - DLA Error in execute: 7 (Input tensor register failed.)
[04/25/2020-16:46:30] [E] [TRT] FAILED_EXECUTION: std::exception
[04/25/2020-16:46:30] [I] Average over 10 runs is 0.465181 ms (host walltime is 0.517853 ms, 99% percentile time is 1.42486).
trtexec: ../rtExt/dla/nvmItem.cpp:22: nvinfer1::rt::dla::NvmItem::~NvmItem(): Assertion `!mCudaMemory || !mNvmTensor' failed.
Aborted (core dumped)

nvpohanh commented 4 years ago

@sandip761 You built the DLA engine with BS=8 but ran it with BS=4. DLA does not support dynamic batch size.
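In other words, the plan name ssd-small-MultiStream-dla-b8-int8.plan encodes the batch size it was built for, so the trtexec run above would need --batch=8 (or a plan rebuilt for batch 4). A minimal C++ sketch, assuming TensorRT 6.x as shipped with JetPack, for checking the batch size baked into a DLA plan before running it; the Logger class and file-reading code are illustrative, not part of the harness:

#include <NvInfer.h>
#include <dlfcn.h>
#include <cstdio>
#include <fstream>
#include <iterator>
#include <vector>

// Minimal logger required by the TensorRT API.
class Logger : public nvinfer1::ILogger {
    void log(Severity severity, const char* msg) override {
        if (severity <= Severity::kWARNING) std::printf("[TRT] %s\n", msg);
    }
};

int main() {
    Logger logger;

    // The plan embeds the custom NMS plugin, so the plugin library has to be
    // loaded (and thereby registered) before deserialization; this is roughly
    // what trtexec's --plugins flag does.
    if (!dlopen("build/plugins/NMSOptPlugin/libnmsoptplugin.so", RTLD_LAZY)) {
        std::printf("could not load plugin library: %s\n", dlerror());
        return 1;
    }

    // Read the serialized DLA plan (path taken from the log above).
    std::ifstream f("build/engines/Xavier/ssd-small/MultiStream/"
                    "ssd-small-MultiStream-dla-b8-int8.plan", std::ios::binary);
    std::vector<char> blob((std::istreambuf_iterator<char>(f)),
                           std::istreambuf_iterator<char>());

    nvinfer1::IRuntime* runtime = nvinfer1::createInferRuntime(logger);
    runtime->setDLACore(0);  // must be a valid core index on this board
    nvinfer1::ICudaEngine* engine =
        runtime->deserializeCudaEngine(blob.data(), blob.size(), nullptr);
    if (!engine) {
        std::printf("deserialization failed\n");
        return 1;
    }

    // DLA engines are built for one fixed batch size; executing with any other
    // batch produces the setInputTensorDesc/registerTensorBuffer errors above.
    std::printf("engine max batch size: %d\n", engine->getMaxBatchSize());

    engine->destroy();
    runtime->destroy();
    return 0;
}

Build with -lnvinfer -ldl and run from the repository root so the relative paths resolve; the printed value is the batch size the DLA engine must be executed with.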