Closed maria-18-git closed 1 year ago
Updated and checked:
maria@chai ~/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py (master *=)$ axs byname object_detection_using_onnxrt_py , get onnxruntime_name
WARNING:root:[kilt-mlperf-dev] parameters file /home/maria/work_collection/kilt-mlperf-dev/data_axs.json did not exist, initializing to empty parameters
WARNING:root:shell.run() about to execute (with env=None, in_dir=None, capture_output=True, errorize_output=False, capture_stderr=False, split_to_lines=False):
"/usr/bin/nvidia-smi" -L | grep -c GPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
onnxruntime-gpu
maria@chai ~/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py (master *=)$ axs byname object_detection_using_onnxrt_py , get num_gpus
WARNING:root:[kilt-mlperf-dev] parameters file /home/maria/work_collection/kilt-mlperf-dev/data_axs.json did not exist, initializing to empty parameters
WARNING:root:shell.run() about to execute (with env=None, in_dir=None, capture_output=True, errorize_output=False, capture_stderr=False, split_to_lines=False):
"/usr/bin/nvidia-smi" -L | grep -c GPU
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
2
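The warning lines above show that both values come from running "/usr/bin/nvidia-smi" -L | grep -c GPU. Below is a minimal Python sketch of that logic, not the actual axs implementation. The function names count_gpus and onnxruntime_package_name are hypothetical, and the CPU-only fallback to the plain onnxruntime package is an assumption.

```python
import shutil
import subprocess


def count_gpus(nvidia_smi="/usr/bin/nvidia-smi"):
    """Count the lines of `nvidia-smi -L` that mention a GPU (like `grep -c GPU`).

    Returns 0 if the tool is missing or fails, so CPU-only hosts still work.
    """
    if shutil.which(nvidia_smi) is None:
        return 0
    try:
        result = subprocess.run(
            [nvidia_smi, "-L"], capture_output=True, text=True, check=False
        )
    except OSError:
        return 0
    if result.returncode != 0:
        return 0
    return sum(1 for line in result.stdout.splitlines() if "GPU" in line)


def onnxruntime_package_name(num_gpus):
    """Pick the GPU build when at least one GPU is visible (assumed rule)."""
    return "onnxruntime-gpu" if num_gpus > 0 else "onnxruntime"


if __name__ == "__main__":
    gpus = count_gpus()
    print(gpus, onnxruntime_package_name(gpus))
```

On a machine with two GPUs this would select onnxruntime-gpu, which matches the two outputs above.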
Run accuracy:
maria@chai ~/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py (master *=)$ axs byquery program_output,task=object_detection,framework=onnxrt
...
2023-09-25 11:38:22.837671084 [W:onnxruntime:, graph.cc:1283 Graph] Initializer backbone.model.layer2.0.5.conv2.weight appears in graph inputs and will not be treated as constant value/weight. This may prevent some of the graph optimizations, like const folding. Move it out of graph inputs if there is no need to override it, by either re-generating the model with latest exporter/converter or with the tool onnxruntime/tools/python/remove_initializer_from_input.py.
2023-09-25 11:38:29.080452825 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 10:38:29 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 11:38:29.080492523 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 10:38:29 WARNING] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
2023-09-25 11:38:29.249120983 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 10:38:29 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 11:38:29.249156886 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 10:38:29 WARNING] onnx2trt_utils.cpp:400: One or more weights outside the range of INT32 was clamped
2023-09-25 11:38:29.436545121 [E:onnxruntime:, inference_session.cc:1785 operator()] Exception during initialization: /onnxruntime_src/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:1552 SubGraphCollection_t onnxruntime::TensorrtExecutionProvider::GetSupportedList(SubGraphCollection_t, int, int, const onnxruntime::GraphViewer&, bool*) const [ONNXRuntimeError] : 1 : FAIL : TensorRT input: NonMaxSuppression_683 has no shape specified. Please run shape inference on the onnx model first. Details can be found in https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#shape-inference-for-tensorrt-subgraphs
Traceback (most recent call last):
  File "/home/maria/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py/onnx_detect.py", line 257, in <module>
    main()
  File "/home/maria/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py/onnx_detect.py", line 110, in main
    sess = rt.InferenceSession(model_path, sess_options, providers= [requested_provider] if execution_device else rt.get_available_providers())
  File "/home/maria/work_collection/onnxruntime-gpu_package_for_python3.9/install/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 419, in __init__
    self._create_inference_session(providers, provider_options, disabled_optimizers)
  File "/home/maria/work_collection/onnxruntime-gpu_package_for_python3.9/install/lib/python3.9/site-packages/onnxruntime/capi/onnxruntime_inference_collection.py", line 471, in _create_inference_session
    sess.initialize_session(providers, provider_options, disabled_optimizers)
onnxruntime.capi.onnxruntime_pybind11_state.RuntimeException: [ONNXRuntimeError] : 6 : RUNTIME_EXCEPTION : Exception during initialization: /onnxruntime_src/onnxruntime/core/providers/tensorrt/tensorrt_execution_provider.cc:1552 SubGraphCollection_t onnxruntime::TensorrtExecutionProvider::GetSupportedList(SubGraphCollection_t, int, int, const onnxruntime::GraphViewer&, bool*) const [ONNXRuntimeError] : 1 : FAIL : TensorRT input: NonMaxSuppression_683 has no shape specified. Please run shape inference on the onnx model first. Details can be found in https://onnxruntime.ai/docs/execution-providers/TensorRT-ExecutionProvider.html#shape-inference-for-tensorrt-subgraphs
On chai we have GPUs, so we can use cuda and tensorrt. But at the moment we support only cuda in our programs. So on chai we need to run with execution_device set to cpu, gpu or cuda.
maria@chai ~/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py (master *=)$ axs byquery program_output,task=object_detection,framework=onnxrt,execution_device=cuda , get accuracy
WARNING:root:[kilt-mlperf-dev] parameters file /home/maria/work_collection/kilt-mlperf-dev/data_axs.json did not exist, initializing to empty parameters
WARNING:root:[base_object_detection_experiment] touch _BEFORE_CODE_LOADING=/home/maria/work_collection/pycocotools_package_for_python3.9/install/lib/python3.9/site-packages
loading annotations into memory...
Done (t=0.36s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.18s).
Accumulating evaluation results...
DONE (t=0.11s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.230
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.402
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.232
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.155
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.316
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.247
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.333
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.408
0.23015432623401041
maria@chai ~/axs/core_collection/workflows_collection/object_detection/object_detection_using_onnxrt_py (master *=)$ axs byquery program_output,task=object_detection,framework=onnxrt,execution_device=gpu , get accuracy
WARNING:root:[kilt-mlperf-dev] parameters file /home/maria/work_collection/kilt-mlperf-dev/data_axs.json did not exist, initializing to empty parameters
WARNING:root:[base_object_detection_experiment] touch _BEFORE_CODE_LOADING=/home/maria/work_collection/pycocotools_package_for_python3.9/install/lib/python3.9/site-packages
loading annotations into memory...
Done (t=0.38s)
creating index...
index created!
Loading and preparing results...
DONE (t=0.00s)
creating index...
index created!
Running per image evaluation...
Evaluate annotation type *bbox*
DONE (t=0.18s).
Accumulating evaluation results...
DONE (t=0.13s).
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.230
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.403
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.232
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.155
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.316
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.247
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.333
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.408
0.23014983711985593
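The cuda and gpu runs above report slightly different accuracy values. A small sketch of a tolerance-based comparison makes the size of that difference explicit. The two numbers are copied from the outputs above; the relative tolerance of 1e-4 is an illustrative assumption, not a threshold defined by the project.

```python
import math

# Accuracy values copied from the two runs above.
accuracy_cuda = 0.23015432623401041  # execution_device=cuda
accuracy_gpu = 0.23014983711985593   # execution_device=gpu

abs_diff = abs(accuracy_cuda - accuracy_gpu)

# rel_tol=1e-4 is an illustrative choice, not a project-defined threshold.
runs_agree = math.isclose(accuracy_cuda, accuracy_gpu, rel_tol=1e-4)

print(f"absolute difference: {abs_diff:.2e}")
print(f"runs agree within tolerance: {runs_agree}")
```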
image classification
maria@chai ~/axs/core_collection/workflows_collection/image_classification/image_classification_using_onnxrt_py (master *=)$ axs byname image_classification_using_onnxrt_py , run $ONNXRUNTIME_QUERY_MOD ---capture_output=false --output_file_path=
...
2023-09-25 13:53:58.637326721 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:53:58 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 13:53:58.644886943 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:53:58 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
2023-09-25 13:54:05.240398741 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:54:05 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 13:54:05.242841622 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:54:05 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
Session execution provider: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Device: GPU
input_layer_names=['input_tensor:0']
output_layer_name=softmax_tensor:0
model_input_shape=['unk__271', 3, 224, 224]
model_output_shape=['unk__273', 1001]
batch 1/1: (1..20) [65, 795, 230, 809, 516, 67, 334, 415, 674, 332, 109, 286, 370, 757, 595, 147, 327, 23, 478, 517]
0
Accuracy:
maria@chai ~/axs/core_collection/workflows_collection/image_classification/image_classification_using_onnxrt_py (master *=)$ axs byquery program_output,task=image_classification,framework=onnxrt,num_of_images=32 , get accuracy
...
2023-09-25 13:56:50.357322822 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:56:50 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 13:56:50.360067336 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:56:50 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
2023-09-25 13:56:56.544946851 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:56:56 WARNING] onnx2trt_utils.cpp:374: Your ONNX model has been generated with INT64 weights, while TensorRT does not natively support INT64. Attempting to cast down to INT32.
2023-09-25 13:56:56.547443802 [W:onnxruntime:Default, tensorrt_execution_provider.h:77 log] [2023-09-25 12:56:56 WARNING] Tensor DataType is determined at build time for tensors not marked as input or output.
Session execution provider: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Device: GPU
input_layer_names=['input_tensor:0']
output_layer_name=softmax_tensor:0
model_input_shape=['unk__271', 3, 224, 224]
model_output_shape=['unk__273', 1001]
batch 1/2: (1..25) [65, 795, 230, 809, 516, 67, 334, 415, 674, 332, 109, 286, 370, 757, 595, 147, 327, 23, 478, 517, 334, 176, 948, 727, 23]
...
0.84375
image_classification_using_pytorch_py
maria@chai ~/axs/core_collection/workflows_collection/image_classification (master *=)$ axs byquery program_output,task=image_classification,framework=pytorch,num_of_images=32 , get accuracy
...
Device: GPU
Using cache found in /home/maria/.cache/torch/hub/pytorch_vision_v0.12.0
batch 1/2: (1..25) [65, 795, 230, 809, 520, 65, 334, 852, 674, 332, 109, 286, 370, 757, 595, 147, 327, 23, 478, 517, 334, 172, 948, 727, 23]
batch 2/2: (26..32) [583, 270, 264, 55, 538, 324, 573]
...
0.71875
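As a sanity check on the two image classification results, the reported values can be converted back into a count of correctly classified images. This sketch assumes that accuracy is the fraction of correct top-1 predictions over num_of_images=32. The helper name implied_correct is hypothetical.

```python
def implied_correct(accuracy, num_images):
    """Number of correct predictions implied by a fractional accuracy (assumed top-1)."""
    return round(accuracy * num_images)


# Values copied from the onnxrt and pytorch runs above (num_of_images=32).
onnxrt_correct = implied_correct(0.84375, 32)
pytorch_correct = implied_correct(0.71875, 32)

print(f"onnxrt:  {onnxrt_correct}/32")
print(f"pytorch: {pytorch_correct}/32")
```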
BERT
bert_demo_torch_py
maria@chai ~/axs/core_collection/workflows_collection/bert (master *=)$ export BERT_DEMO_OUTPUT=`axs byname bert_demo_torch_py , run --capture_output+`
...
Context taken from '/home/maria/axs/core_collection/workflows_collection/bert/bert_demo_torch_py/context.txt':
Moscow is the capital of the Soviet Union and the largest city in the country. More than 800 years old, Moscow has long been one of the world's great cultural centers. Home of the famed Bolshoi Theater of Opera and Ballet, the city also boasts 150 museums and exhibits of culture. In the heart of Moscow is the Kremlin, a fortress surrounded by red stone walls - inside the walls are palaces, cathedrals and buildings housing the seat of the Soviet government.
----------------------------------------------------------------
Questions taken from '/home/maria/axs/core_collection/workflows_collection/bert/bert_demo_torch_py/questions.txt':
maria@chai ~/axs/core_collection/workflows_collection/bert (master *=)$ echo $BERT_DEMO_OUTPUT
Question_1: Which country has Moscow as the capital? Answer_1: the soviet union Question_2: How old is the capital of the Soviet Union? Answer_2: more than 800 years Question_3: Where is the Bolshoi Theater? Answer_3: moscow Question_4: How many museums are there in the capital of the Soviet Union? Answer_4: 150 Question_5: What is the Kremlin? Answer_5: a fortress surrounded by red stone walls Question_6: Where is the Kremlin? Answer_6: in the heart of moscow Question_7: What is inside the Kremlin? Answer_7: palaces , cathedrals and buildings housing the seat of the soviet government Question_8: What colour are the stones of the Kremlin? Answer_8: red
bert_using_onnxrt_py
maria@chai ~/axs/core_collection/workflows_collection/bert (master *=)$ export ACCURACY_OUTPUT=`axs byquery program_output,task=bert,framework=onnxrt,batch_count=20 , get accuracy_dict`
maria@chai ~/axs/core_collection/workflows_collection/bert (master *=)$ echo "Accuracy: $ACCURACY_OUTPUT"
Accuracy: {'exact_match': 85.0, 'f1': 85.0}
image_classification_using_onnxrt_loadgen
maria@chai ~/work_collection/axs2mlperf/image_classification_using_onnxrt_loadgen (master *=)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
load_query_samples([8, 1, 17, 12, 11, 10, 5, 15, 19, 9, 18, 6, 7, 2, 13, 14, 16, 4, 0, 3])
B20llllllllllllllllllll
Q20[batch of 1] inference=10.21 ms
...
Accuracy:
maria@chai ~/work_collection/axs2mlperf/image_classification_using_onnxrt_loadgen (master *=)$ export ACCURACY_OUTPUT=`axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy `
maria@chai ~/work_collection/axs2mlperf/image_classification_using_onnxrt_loadgen (master *=)$ echo "Accuracy: $ACCURACY_OUTPUT"
Accuracy: 85.0
GPU:
maria@chai ~/work_collection/axs2mlperf/image_classification_using_onnxrt_loadgen (master *=)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=gpu , get accuracy
...
Session execution provider: ['TensorrtExecutionProvider', 'CUDAExecutionProvider', 'AzureExecutionProvider', 'CPUExecutionProvider']
Device: GPU
load_query_samples([8, 1, 17, 12, 11, 10, 5, 15, 19, 9, 18, 6, 7, 2, 13, 14, 16, 4, 0, 3])
B20llllllllllllllllllll
...
85.0
image_classification_using_torch_loadgen
maria@chai ~/work_collection/axs2mlperf/image_classification_using_torch_loadgen (master *=)$ export ACCURACY_OUTPUT=`axs byquery loadgen_output,task=image_classification,framework=pytorch,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy`
maria@chai ~/work_collection/axs2mlperf/image_classification_using_torch_loadgen (master *=)$ echo "Accuracy: $ACCURACY_OUTPUT"
Accuracy: 75.0
object_detection_using_onnxrt_loadgen
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ export ACCURACY_OUTPUT=`axs byquery loadgen_output,task=object_detection,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=ssd_resnet34,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy`
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ echo "Accuracy: $ACCURACY_OUTPUT"
Accuracy: 22.852
bert_using_onnxrt_loadgen
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=cpu
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
Loading tokenized SQuAD dataset as features from /home/maria/work_collection/preprocessed_squad_v1_1_msl_384_calibration_no/preprocessed_squad_v1.1.pickled ...
2023-09-25 23:56:46.456661: I tensorflow/core/util/port.cc:110] oneDNN custom operations are on. You may see slightly different numerical results due to floating-point round-off errors from different computation orders. To turn them off, set the environment variable `TF_ENABLE_ONEDNN_OPTS=0`.
2023-09-25 23:56:46.490513: I tensorflow/core/platform/cpu_feature_guard.cc:182] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.
Example width: 384
...
Accuracy:
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ export ACCURACY_OUTPUT=`axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=cpu , get accuracy_dict`
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ echo "Accuracy: $ACCURACY_OUTPUT"
Accuracy: {'exact_match': 85.0, 'f1': 85.0}
supported_execution_providers:
"supported_execution_providers": [ "^^", "case", [ ["^^", "get", "execution_device"],
"cpu", [ "CPUExecutionProvider" ],
[ "gpu", "cuda" ], [ "CUDAExecutionProvider" ],
[ "tensorrt","trt"], [ "TensorrtExecutionProvider"] ],
{"default_value": [ "CUDAExecutionProvider", "CPUExecutionProvider" ] }
],
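Read as a lookup table, the "case" expression above maps the value of execution_device to a list of ONNX Runtime execution providers, with a default for any other value. The following Python sketch spells out that reading. It is an interpretation of the JSON fragment, not axs code, and the function form is hypothetical.

```python
def supported_execution_providers(execution_device=None):
    """Map execution_device to a provider list, mirroring the "case" expression above."""
    if execution_device == "cpu":
        return ["CPUExecutionProvider"]
    if execution_device in ("gpu", "cuda"):
        return ["CUDAExecutionProvider"]
    if execution_device in ("tensorrt", "trt"):
        return ["TensorrtExecutionProvider"]
    # Corresponds to "default_value": used when no listed case matches.
    return ["CUDAExecutionProvider", "CPUExecutionProvider"]


for device in ("cpu", "gpu", "cuda", "trt", None):
    print(device, "->", supported_execution_providers(device))
```

A list produced this way could be passed as the providers argument of rt.InferenceSession(model_path, sess_options, providers=...), as in the call shown in the traceback earlier.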
Check this update for object_detection_using_onnxrt_py:
maria@chai ~(master *=)$ axs byquery program_output,task=object_detection,framework=onnxrt,execution_device=cuda , get accuracy
...
Session execution provider: ['CUDAExecutionProvider', 'CPUExecutionProvider']
Device: GPU
inp.name=image , inp.shape=[1, 3, 1200, 1200] , inp.type=tensor(float)
output.name=bboxes , output.shape=[1, 'nbox', 4] , output.type=tensor(float)
output.name=labels , output.shape=[1, 'nbox'] , output.type=tensor(int64)
output.name=scores , output.shape=[1, 'nbox'] , output.type=tensor(float)
Data layout: NCHW
Input layers: ['image']
Output layers: ['bboxes', 'labels', 'scores']
Input layer name: image
Expected input shape: [1, 3, 1200, 1200]
Expected input type: <class 'numpy.float32'>
Output layer names: ['bboxes', 'labels', 'scores']
Background/unlabelled classes to skip: 1
[batch 1 of 20] loading=300.20 ms, inference=4380.72 ms
...
Average Precision (AP) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.230
Average Precision (AP) @[ IoU=0.50 | area= all | maxDets=100 ] = 0.402
Average Precision (AP) @[ IoU=0.75 | area= all | maxDets=100 ] = 0.232
Average Precision (AP) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.155
Average Precision (AP) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.467
Average Precision (AP) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.316
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 1 ] = 0.247
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets= 10 ] = 0.333
Average Recall (AR) @[ IoU=0.50:0.95 | area= all | maxDets=100 ] = 0.347
Average Recall (AR) @[ IoU=0.50:0.95 | area= small | maxDets=100 ] = 0.209
Average Recall (AR) @[ IoU=0.50:0.95 | area=medium | maxDets=100 ] = 0.540
Average Recall (AR) @[ IoU=0.50:0.95 | area= large | maxDets=100 ] = 0.408
0.23015432623401041
Added supported_execution_providers support to the following entries.
In axs:
image_classification_using_onnxrt_py
bert_using_onnxrt_py
In axs2mlperf:
object_detection_using_onnxrt_loadgen
image_classification_using_onnxrt_loadgen
bert_using_onnxrt_loadgen

image_classification_using_onnxrt_py
cpu
maria@chai ~/axs/core_collection/workflows_collection/image_classification/image_classification_using_onnxrt_py (master *>)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy
85.0
maria@chai ~/axs/core_collection/workflows_collection/image_classification/image_classification_using_onnxrt_py (master *>)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=gpu , get accuracy
...
Session execution provider: ['CUDAExecutionProvider', 'CPUExecutionProvider']
Device: GPU
load_query_samples([8, 1, 17, 12, 11, 10, 5, 15, 19, 9, 18, 6, 7, 2, 13, 14, 16, 4, 0, 3])
B20llllllllllllllllllll
Q20[batch of 1] inference=2151.41 ms
...
85.0
bert_using_onnxrt_py
cpu
maria@chai ~/axs/core_collection/workflows_collection/bert/bert_using_onnxrt_py (master *>)$ axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=cpu , get accuracy_dict
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
...
{'exact_match': 85.0, 'f1': 85.0}
gpu
maria@chai ~(master *>)$ axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=gpu , get accuracy_dict
...
Session execution provider: ['CUDAExecutionProvider', 'CPUExecutionProvider']
Device: GPU
...
{'exact_match': 85.0, 'f1': 85.0}
object_detection_using_onnxrt_loadgen
cpu
maria@chai ~/work_collection/axs2mlperf/object_detection_using_onnxrt_loadgen (master *=)$ axs byquery loadgen_output,task=object_detection,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=ssd_resnet34,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
...
22.852
gpu
maria@chai ~ (master *=)$ axs byquery loadgen_output,task=object_detection,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=ssd_resnet34,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=gpu , get accuracy
...
Session execution provider: ['CUDAExecutionProvider', 'CPUExecutionProvider']
Device: GPU
...
22.853
image_classification_using_onnxrt_loadgen
cpu
maria@chai ~ (master *>)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=cpu , get accuracy
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
...
85.0
gpu
maria@chai ~/work_collection/axs2mlperf/image_classification_using_onnxrt_loadgen (master *>)$ axs byquery loadgen_output,task=image_classification,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,model_name=resnet50,loadgen_dataset_size=20,loadgen_buffer_size=100,execution_device=gpu , get accuracy
...
Session execution provider: ['CPUExecutionProvider', 'CUDAExecutionProvider']
Device: GPU
...
85.0
bert_using_onnxrt_loadgen
cpu
maria@chai ~/work_collection/axs2mlperf/bert_using_onnxrt_loadgen (master *>)$ axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=cpu , get accuracy
...
Session execution provider: ['CPUExecutionProvider']
Device: CPU
...
85.0
gpu
maria@chai ~/work_collection/axs2mlperf/bert_using_onnxrt_loadgen (master *>)$ axs byquery loadgen_output,task=bert,framework=onnxrt,loadgen_scenario=Offline,loadgen_mode=AccuracyOnly,execution_device=gpu , get accuracy
...
Session execution provider: ['CUDAExecutionProvider', 'CPUExecutionProvider']
Device: GPU
...
85.0
Status: Done.
Create new base class nvidia_gpu_support.
Move num_gpu to it and remove it from:
object_detection_onnx_loadgen_py
onnx_object_detector_sparse
onnx_object_detector
onnx_image_classifier
pytorch_image_classifier
image_classification_torch_loadgen_py
image_classification_onnx_loadgen_py
bert_squad_onnxruntime_py
bert_demo_torch_py
bert_squad_onnxruntime_loadgen_py
Move onnxruntime_name, torchvision_query, torch_query to this base class and remove them from:
object_detection_onnx_loadgen_py
onnx_object_detector
onnx_image_classifier
pytorch_image_classifier
image_classification_torch_loadgen_py
image_classification_onnx_loadgen_py
bert_squad_onnxruntime_py(torch_query)
bert_squad_onnxruntime_py(onnxruntime_name)
bert_demo_torch_py
bert_squad_onnxruntime_loadgen_py