NVIDIA-AI-IOT / jetson_benchmarks

Jetson Benchmark
MIT License

Error in Build #1

Closed roborocklsm closed 4 years ago

roborocklsm commented 4 years ago

When I run this benchmark on a Jetson AGX Xavier with JetPack 4.3 and TensorRT 6, it reports errors.

sudo python3 benchmark.py --model_name ssd-mobilenet-v1  --csv_file_path ./benchmark_csv/xavier-benchmarks.csv --model_dir /abs/dir/to/models/ --jetson_devkit xavier --gpu_freq 1377000000 --dla_freq 1395200000 --power_mode 0

It returns:

Please close all other applications and Press Enter to continue...
Setting Jetson xavier in max performance mode
gpu frequency is set from 1198500000 Hz --> to 1377000000 Hz
dla frequency is set from 1395200000 Hz --> to 1395200000 Hz
------------Executing ResNet50_224x224------------

Error in Build, Please check the log in: /home/shl666/repo/jetson_benchmarks/models/
Error in Build, Please check the log in: /home/shl666/repo/jetson_benchmarks/models/
Error in Build, Please check the log in: /home/shl666/repo/jetson_benchmarks/models/
We recommend to run benchmarking in headless mode
--------------------------

Model Name: ResNet50_224x224 
FPS:0.00 

--------------------------

And I checked the log:

&&&& RUNNING TensorRT.trtexec # ./trtexec --output=prob --deploy=/home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224.prototxt --batch=16 --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=/home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224_b16_ws2048_gpu.engine
[04/25/2020-11:03:22] [I] === Model Options ===
[04/25/2020-11:03:22] [I] Format: Caffe
[04/25/2020-11:03:22] [I] Model:
[04/25/2020-11:03:22] [I] Prototxt: /home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224.prototxtOutput: prob
[04/25/2020-11:03:22] [I] === Build Options ===
[04/25/2020-11:03:22] [I] Max batch: 16
[04/25/2020-11:03:22] [I] Workspace: 2048 MB
[04/25/2020-11:03:22] [I] minTiming: 1
[04/25/2020-11:03:22] [I] avgTiming: 8
[04/25/2020-11:03:22] [I] Precision: INT8
[04/25/2020-11:03:22] [I] Calibration: Dynamic
[04/25/2020-11:03:22] [I] Safe mode: Disabled
[04/25/2020-11:03:22] [I] Save engine:
[04/25/2020-11:03:22] [I] Load engine: /home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224_b16_ws2048_gpu.engine
[04/25/2020-11:03:22] [I] Inputs format: fp32:CHW
[04/25/2020-11:03:22] [I] Outputs format: fp32:CHW
[04/25/2020-11:03:22] [I] Input build shapes: model
[04/25/2020-11:03:22] [I] === System Options ===
[04/25/2020-11:03:22] [I] Device: 0
[04/25/2020-11:03:22] [I] DLACore:
[04/25/2020-11:03:22] [I] Plugins:
[04/25/2020-11:03:22] [I] === Inference Options ===
[04/25/2020-11:03:22] [I] Batch: 16
[04/25/2020-11:03:22] [I] Iterations: 10 (200 ms warm up)
[04/25/2020-11:03:22] [I] Duration: 180s
[04/25/2020-11:03:22] [I] Sleep time: 0ms
[04/25/2020-11:03:22] [I] Streams: 1
[04/25/2020-11:03:22] [I] Spin-wait: Disabled
[04/25/2020-11:03:22] [I] Multithreading: Enabled
[04/25/2020-11:03:22] [I] CUDA Graph: Disabled
[04/25/2020-11:03:22] [I] Skip inference: Disabled
[04/25/2020-11:03:22] [I] Input inference shapes: model
[04/25/2020-11:03:22] [I] === Reporting Options ===
[04/25/2020-11:03:22] [I] Verbose: Disabled
[04/25/2020-11:03:22] [I] Averages: 100 inferences
[04/25/2020-11:03:22] [I] Percentile: 99
[04/25/2020-11:03:22] [I] Dump output: Disabled
[04/25/2020-11:03:22] [I] Profile: Disabled
[04/25/2020-11:03:22] [I] Export timing to JSON file:
[04/25/2020-11:03:22] [I] Export profile to JSON file:
[04/25/2020-11:03:22] [I]
[04/25/2020-11:03:27] [I] Average over 100 runs is 11.8268 ms (host walltime is 11.9357 ms, 99% percentile time is 12.6808).
[04/25/2020-11:03:28] [I] Average over 100 runs is 11.818 ms (host walltime is 11.9305 ms, 99% percentile time is 11.9069).
[04/25/2020-11:03:29] [I] Average over 100 runs is 11.8119 ms (host walltime is 11.9257 ms, 99% percentile time is 11.8986).
[04/25/2020-11:03:30] [I] Average over 100 runs is 11.805 ms (host walltime is 11.9108 ms, 99% percentile time is 11.8945).
[04/25/2020-11:03:32] [I] Average over 100 runs is 11.7994 ms (host walltime is 11.9023 ms, 99% percentile time is 11.8988).
[04/25/2020-11:03:33] [I] Average over 100 runs is 12.4433 ms (host walltime is 12.5799 ms, 99% percentile time is 17.0742).
[04/25/2020-11:03:34] [I] Average over 100 runs is 12.2739 ms (host walltime is 12.4166 ms, 99% percentile time is 14.0012).
[04/25/2020-11:03:35] [I] Average over 100 runs is 12.5842 ms (host walltime is 12.7295 ms, 99% percentile time is 13.981).
[04/25/2020-11:03:37] [I] Average over 100 runs is 12.0611 ms (host walltime is 12.1768 ms, 99% percentile time is 14.6729).
[04/25/2020-11:03:38] [I] Average over 100 runs is 15.8314 ms (host walltime is 16.2859 ms, 99% percentile time is 21.2603).
&&&& PASSED TensorRT.trtexec # ./trtexec --output=prob --deploy=/home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224.prototxt --batch=16 --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=/home/shl666/repo/jetson_benchmarks/models/ResNet50_224x224_b16_ws2048_gpu.engine

It seems the benchmark itself ran to completion, but the script failed to parse the results from the log.

For ssd-mobilenet-v1, it seems unable to create the TensorRT engine:

&&&& RUNNING TensorRT.trtexec # ./trtexec --onnx=/home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1-bs16.onnx --explicitBatch --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=/home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1_b16_ws2048_gpu.engine
[04/25/2020-11:35:30] [I] === Model Options ===
[04/25/2020-11:35:30] [I] Format: ONNX
[04/25/2020-11:35:30] [I] Model: /home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1-bs16.onnx
[04/25/2020-11:35:30] [I] Output:
[04/25/2020-11:35:30] [I] === Build Options ===
[04/25/2020-11:35:30] [I] Max batch: explicit
[04/25/2020-11:35:30] [I] Workspace: 2048 MB
[04/25/2020-11:35:30] [I] minTiming: 1
[04/25/2020-11:35:30] [I] avgTiming: 8
[04/25/2020-11:35:30] [I] Precision: INT8
[04/25/2020-11:35:30] [I] Calibration: Dynamic
[04/25/2020-11:35:30] [I] Safe mode: Disabled
[04/25/2020-11:35:30] [I] Save engine:
[04/25/2020-11:35:30] [I] Load engine: /home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1_b16_ws2048_gpu.engine
[04/25/2020-11:35:30] [I] Inputs format: fp32:CHW
[04/25/2020-11:35:30] [I] Outputs format: fp32:CHW
[04/25/2020-11:35:30] [I] Input build shapes: model
[04/25/2020-11:35:30] [I] === System Options ===
[04/25/2020-11:35:30] [I] Device: 0
[04/25/2020-11:35:30] [I] DLACore:
[04/25/2020-11:35:30] [I] Plugins:
[04/25/2020-11:35:30] [I] === Inference Options ===
[04/25/2020-11:35:30] [I] Batch: Explicit
[04/25/2020-11:35:30] [I] Iterations: 10 (200 ms warm up)
[04/25/2020-11:35:30] [I] Duration: 180s
[04/25/2020-11:35:30] [I] Sleep time: 0ms
[04/25/2020-11:35:30] [I] Streams: 1
[04/25/2020-11:35:30] [I] Spin-wait: Disabled
[04/25/2020-11:35:30] [I] Multithreading: Enabled
[04/25/2020-11:35:30] [I] CUDA Graph: Disabled
[04/25/2020-11:35:30] [I] Skip inference: Disabled
[04/25/2020-11:35:30] [I] === Reporting Options ===
[04/25/2020-11:35:30] [I] Verbose: Disabled
[04/25/2020-11:35:30] [I] Averages: 100 inferences
[04/25/2020-11:35:30] [I] Percentile: 99
[04/25/2020-11:35:30] [I] Dump output: Disabled
[04/25/2020-11:35:30] [I] Profile: Disabled
[04/25/2020-11:35:30] [I] Export timing to JSON file:
[04/25/2020-11:35:30] [I] Export profile to JSON file:
[04/25/2020-11:35:30] [I]
[04/25/2020-11:35:30] [E] Error opening engine file: /home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1_b16_ws2048_gpu.engine
[04/25/2020-11:35:30] [E] Engine could not be created
&&&& FAILED TensorRT.trtexec # ./trtexec --onnx=/home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1-bs16.onnx --explicitBatch --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=/home/shl666/repo/jetson_benchmarks/models/ssd-mobilenet-v1_b16_ws2048_gpu.engine

juliansjungwirth commented 4 years ago

Hi, I am encountering the very same issues on a Jetson Xavier NX with JetPack 4.4 and TensorRT 7. What are the expected dependencies? Looking forward to running the benchmarks ;) Thx.

yxxxqqq commented 4 years ago

I ran into this error too. Do you know how to fix it?

ak-nv commented 4 years ago

@roborocklsm Sorry for the delayed reply. The benchmark script currently targets JP4.4 with TRT7. However, you can modify the read_perf_time function in the utils/read_write_data.py script so that it parses the log format produced by your TensorRT version.
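As a rough illustration of what such a change might involve, here is a minimal sketch that extracts latency from the TensorRT 6 style lines shown in the log above ("Average over 100 runs is ... ms"). The function names `mean_latency_ms_trt6` and `fps_from_latency` are hypothetical and not part of the repository. The regular expression is based only on the log lines posted in this thread, and whether the benchmark should use this averaged time or the host walltime is an assumption.

```python
import re

# Pattern for TensorRT 6 trtexec summary lines, e.g.
# "[I] Average over 100 runs is 11.8268 ms (host walltime is ...)"
AVERAGE_LINE = re.compile(r"Average over \d+ runs is ([0-9.]+) ms")


def mean_latency_ms_trt6(log_text):
    """Return the mean of all reported per-window averages, or None if absent."""
    values = [float(m.group(1)) for m in AVERAGE_LINE.finditer(log_text)]
    if not values:
        return None
    return sum(values) / len(values)


def fps_from_latency(mean_ms, batch_size):
    """Convert a mean per-batch latency in milliseconds to frames per second."""
    return batch_size * 1000.0 / mean_ms


# Two lines copied from the ResNet50 log earlier in this thread.
sample = """\
[04/25/2020-11:03:27] [I] Average over 100 runs is 11.8268 ms (host walltime is 11.9357 ms, 99% percentile time is 12.6808).
[04/25/2020-11:03:28] [I] Average over 100 runs is 11.818 ms (host walltime is 11.9305 ms, 99% percentile time is 11.9069).
"""

latency = mean_latency_ms_trt6(sample)
print(f"mean latency: {latency:.4f} ms")
print(f"FPS at batch 16: {fps_from_latency(latency, 16):.1f}")
```

Returning None when no line matches lets the caller distinguish "log format not recognized" from a genuine 0 FPS result.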

Please let me know if you see any issues.

ak-nv commented 4 years ago

Quoting @juliansjungwirth: "Hi, I am encountering the very same issues on a Jetson Xavier NX with JetPack 4.4 and TensorRT 7. What are the expected dependencies? Looking forward to running the benchmarks ;) Thx."

Could you please check whether there is a log file in the models directory? If so, does it show any sign that the engine ran successfully, as in @roborocklsm's output? Please copy and paste your error message.

yxxxqqq commented 4 years ago

&&&& RUNNING TensorRT.trtexec # ./trtexec --output=Mconv7_stage2_L2 --deploy=nx-models/pose_estimation.prototxt --batch=2 --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=nx-models/pose_estimation_b2_ws2048_gpu.engine
[06/04/2020-21:21:28] [I] === Model Options ===
[06/04/2020-21:21:28] [I] Format: Caffe
[06/04/2020-21:21:28] [I] Model:
[06/04/2020-21:21:28] [I] Prototxt: nx-models/pose_estimation.prototxt
[06/04/2020-21:21:28] [I] Output: Mconv7_stage2_L2
[06/04/2020-21:21:28] [I] === Build Options ===
[06/04/2020-21:21:28] [I] Max batch: 2
[06/04/2020-21:21:28] [I] Workspace: 2048 MB
[06/04/2020-21:21:28] [I] minTiming: 1
[06/04/2020-21:21:28] [I] avgTiming: 8
[06/04/2020-21:21:28] [I] Precision: FP32+INT8
[06/04/2020-21:21:28] [I] Calibration: Dynamic
[06/04/2020-21:21:28] [I] Safe mode: Disabled
[06/04/2020-21:21:28] [I] Save engine:
[06/04/2020-21:21:28] [I] Load engine: nx-models/pose_estimation_b2_ws2048_gpu.engine
[06/04/2020-21:21:28] [I] Builder Cache: Enabled
[06/04/2020-21:21:28] [I] NVTX verbosity: 0
[06/04/2020-21:21:28] [I] Inputs format: fp32:CHW
[06/04/2020-21:21:28] [I] Outputs format: fp32:CHW
[06/04/2020-21:21:28] [I] Input build shapes: model
[06/04/2020-21:21:28] [I] Input calibration shapes: model
[06/04/2020-21:21:28] [I] === System Options ===
[06/04/2020-21:21:28] [I] Device: 0
[06/04/2020-21:21:28] [I] DLACore:
[06/04/2020-21:21:28] [I] Plugins:
[06/04/2020-21:21:28] [I] === Inference Options ===
[06/04/2020-21:21:28] [I] Batch: 2
[06/04/2020-21:21:28] [I] Input inference shapes: model
[06/04/2020-21:21:28] [I] Iterations: 10
[06/04/2020-21:21:28] [I] Duration: 180s (+ 200ms warm up)
[06/04/2020-21:21:28] [I] Sleep time: 0ms
[06/04/2020-21:21:28] [I] Streams: 1
[06/04/2020-21:21:28] [I] ExposeDMA: Disabled
[06/04/2020-21:21:28] [I] Spin-wait: Disabled
[06/04/2020-21:21:28] [I] Multithreading: Disabled
[06/04/2020-21:21:28] [I] CUDA Graph: Disabled
[06/04/2020-21:21:28] [I] Skip inference: Disabled
[06/04/2020-21:21:28] [I] Inputs:
[06/04/2020-21:21:28] [I] === Reporting Options ===
[06/04/2020-21:21:28] [I] Verbose: Disabled
[06/04/2020-21:21:28] [I] Averages: 100 inferences
[06/04/2020-21:21:28] [I] Percentile: 99
[06/04/2020-21:21:28] [I] Dump output: Disabled
[06/04/2020-21:21:28] [I] Profile: Disabled
[06/04/2020-21:21:28] [I] Export timing to JSON file:
[06/04/2020-21:21:28] [I] Export output to JSON file:
[06/04/2020-21:21:28] [I] Export profile to JSON file:
[06/04/2020-21:21:28] [I]
[06/04/2020-21:21:28] [E] Error opening engine file: nx-models/pose_estimation_b2_ws2048_gpu.engine
[06/04/2020-21:21:28] [E] Engine creation failed
[06/04/2020-21:21:28] [E] Engine set up failed
&&&& FAILED TensorRT.trtexec # ./trtexec --output=Mconv7_stage2_L2 --deploy=nx-models/pose_estimation.prototxt --batch=2 --int8 --workspace=2048 --avgRuns=100 --duration=180 --loadEngine=nx-models/pose_estimation_b2_ws2048_gpu.engine

ak-nv commented 4 years ago

@yxxxqqq Thank you for the log. Here are two suggestions:

  1. Please make sure that RAM usage is below 500 MB when you run the benchmark, i.e. before running it, check "free -m" and look at the "used" column.
  2. You can also try setting the workspace size to 1024 or 512 by editing the benchmark_csv/nx-benchmarks.csv file.
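
The first check could be scripted roughly as follows. This is only a sketch: the function name `used_mb_from_free` is hypothetical, the sample numbers are made up for illustration and are not from any device in this thread, and it assumes the usual procps `free -m` layout with a header row followed by a "Mem:" row.

```python
def used_mb_from_free(free_output):
    """Extract the 'used' value (in MB) from the 'Mem:' row of `free -m` output."""
    lines = free_output.strip().splitlines()
    header = lines[0].split()  # e.g. ['total', 'used', 'free', ...]
    used_index = header.index("used")
    for line in lines[1:]:
        fields = line.split()
        if fields and fields[0] == "Mem:":
            # The row label 'Mem:' shifts the data columns by one.
            return int(fields[used_index + 1])
    raise ValueError("no 'Mem:' row found")


# Illustrative sample only; real values will differ per device.
sample_free = """\
              total        used        free      shared  buff/cache   available
Mem:           7764         412        6650          18         701        7190
Swap:          3882           0        3882
"""

used = used_mb_from_free(sample_free)
print("OK to benchmark" if used < 500 else f"RAM in use is {used} MB, close other applications")

# On the device itself, the live value could be read with:
#   import subprocess
#   used = used_mb_from_free(subprocess.check_output(["free", "-m"], text=True))
```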

Zulkhuu commented 4 years ago

@yxxxqqq It looks like your argument to --model_dir is a relative path. I was getting an error log similar to yours. When I changed the relative path to an absolute path, the issue went away.
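
A minimal sketch of how the path could be resolved and the engine file checked before launching the benchmark. The helper names `resolve_model_dir` and `engine_path` are hypothetical and not part of the repository, and the engine file-name pattern is inferred only from the paths visible in the logs above.

```python
import os


def resolve_model_dir(model_dir):
    """Return an absolute, normalized version of a possibly relative path."""
    return os.path.abspath(os.path.expanduser(model_dir))


def engine_path(model_dir, model_name, batch_size, workspace_mb, device="gpu"):
    """Build the engine file path using the naming pattern seen in the logs."""
    file_name = f"{model_name}_b{batch_size}_ws{workspace_mb}_{device}.engine"
    return os.path.join(resolve_model_dir(model_dir), file_name)


# Example with the relative directory from the log above.
path = engine_path("nx-models", "pose_estimation", 2, 2048)
print(path, "exists" if os.path.isfile(path) else "is missing")
```

The resolved directory can then be passed to --model_dir in place of the relative one.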

PS: I'm using JP4.4 & AGX Xavier.

ak-nv commented 4 years ago

Thanks @Zulkhuu for looking into it. I am closing this issue due to inactivity for more than 15 days.

8087482204 commented 2 years ago

I am facing the same issue: I get the same errors and no output. What is the recommended solution?

MohanaRC commented 2 years ago

I'm facing the same issue. The model directory path is absolute in my case. Here's a screenshot:

image