QDucasse / nn_benchmark

🧠 Benchmark facility to train networks on different datasets for PyTorch/Brevitas
MIT License

Bump onnxruntime from 1.2.0 to 1.5.1 #6

Closed · dependabot-preview[bot] closed this 3 years ago

dependabot-preview[bot] commented 3 years ago

Bumps onnxruntime from 1.2.0 to 1.5.1.

Release notes

Sourced from onnxruntime's releases.

ONNX Runtime v1.5.1

Key Updates

General

  • Reduced Operator Kernel build allows ORT binaries to be built with only the operators required by the model(s) - learn more
  • [Preview] ORT for Mobile Platforms - minimizes build size for mobile and embedded devices - learn more
  • Transformer model inferencing performance optimizations
    • Perf improvement for DistilBERT
    • Benchmark tool supports more pretrained models
  • Improvements in the quantization tool (see the sketch after this list)
    • Support for quantization-aware training models
    • Calibration tool now supports general preprocessing and calibration on input data
    • Simplified quantization APIs
    • Support for models larger than 2 GB
  • New operators for static quantization: QLinearMul, QLinearAdd, QLinearSigmoid and QLinearLeakyRelu
  • Prepack constant matrix B for float GEMM (MatMul, Attention)
  • Limited Python 3.8 support added in addition to 3.5-3.7 for official Python packages. Not yet supported for Windows GPU and Linux ARM builds.
  • Telemetry enabled in Java and NodeJS packages for Windows builds. Note: data is not directly sent to Microsoft or ORT teams by ONNX Runtime; enabling telemetry means trace events are collected by the Windows operating system and may be sent to the cloud based on the user's privacy settings - learn more.
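
As a quick illustration of the simplified quantization APIs mentioned in the list above, here is a minimal, hedged sketch using onnxruntime's dynamic quantization helper. The model paths are placeholders, and the exact keyword arguments may vary slightly between releases:

```python
# Minimal sketch of post-training dynamic quantization with onnxruntime.
# "model_fp32.onnx" and "model_int8.onnx" are placeholder paths.
from onnxruntime.quantization import quantize_dynamic, QuantType

quantize_dynamic(
    model_input="model_fp32.onnx",   # original float ONNX model
    model_output="model_int8.onnx",  # quantized model is written here
    weight_type=QuantType.QInt8,     # store weights as signed 8-bit integers
)
```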

API

  • Python API support for RegisterCustomOpsLibrary
  • IO Binding API for C/C++/C# language bindings. This allows the use of pre-allocated buffers on target devices, and also lets you specify the target device for outputs with unknown shapes (see the sketch after this list).
  • Sharing of allocators between multiple sessions. This allows much better utilization of memory by not creating a separate arena for each session in the same process. See this for details.
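
To make the IO Binding item above more concrete, below is a hedged Python sketch. The model path and the "input"/"output" tensor names are placeholders, and the exact binding methods may differ slightly between onnxruntime releases:

```python
# Hedged sketch of driving inference through the Python IO binding API.
import numpy as np
import onnxruntime as ort

sess = ort.InferenceSession("model.onnx")          # placeholder model path
binding = sess.io_binding()

x = np.zeros((1, 3, 224, 224), dtype=np.float32)   # pre-allocated input buffer
binding.bind_cpu_input("input", x)                 # bind the buffer to the model's input name
binding.bind_output("output")                      # let ORT allocate the output (shape may be unknown)

sess.run_with_iobinding(binding)                   # run inference using the bound buffers
result = binding.copy_outputs_to_cpu()[0]          # copy the output back to a numpy array
```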

Windows ML

  • NuGet package now supports UWP applications targeting Windows Store deployment (CPU only)
  • NuGet package now supports .NET and .NET framework applications
  • Rust developers can now deploy Windows ML – sample and documentation available here
  • New APIs for additional performance control:
    • IntraopNumThreads: Provides an ability to change the number of threads used in the threadpool for Intra Operator Execution for CPU operators through LearningModelSessionOptions.
    • SetNamedDimensionOverrides: Provides the ability to override named input dimensions to concrete values through LearningModelSessionOptions in order to achieve better runtime performance.
  • Support for additional ONNX format image type denotations – Gray8, normalized [0..1] and normalized [-1..1]
  • Reduced Windows ML package size by splitting debug symbols into a separate distribution package.

Execution Providers

  • CUDA updates
    • CUDA 10.2 / cuDNN 8.0 in official package
    • CUDA 11 support added and available to build from source
    • CUDA conv kernel now supports asymmetric padding, fully supporting models such as YOLOv3 for improved GPU perf
  • TensorRT EP updates
    • Support for TensorRT 7.1
    • Added TensorRT engine caching feature, enabled by setting the environment variable ORT_TENSORRT_ENGINE_CACHE_ENABLE=1 (see the sketch after this list)
    • TensorRT builds are now built with the Execution Provider as a separate dll. If enabled in the build, the provider will be available as a shared library. This was previously also enabled for the DNNL EP (ORT 1.3). Other Execution Providers will be added in the future.
  • OpenVINO EP updates
    • Support for OpenVINO 2020.4
    • Added runtime options for VPU hardware to select a specific hardware device and enable fast compilation of models
    • Enabled C# binding support for the OpenVINO EP
  • DirectML EP updates
    • API available for Python (build from source) and C# (Microsoft.ML.OnnxRuntime.DirectML)
    • 7 new operators for ONNX 1.7 (opset 12): Celu, GreaterOrEqual, LessOrEqual, ArgMin/Max with select_last_index, GatherND with batch_dim, RoiAlign
    • New integer data types were added to existing operators: Clip int, Max int, Min int, MaxPool int8, ReduceMin int8, ReduceMax int8, Pow int exponent
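
As a rough illustration of the TensorRT engine caching option noted in the list above, here is a hedged Python sketch. It assumes an onnxruntime build with the TensorRT execution provider; the model path is a placeholder, and if the `providers` constructor argument is not available in a given release, `sess.set_providers([...])` is an alternative:

```python
# Hedged sketch: enable TensorRT engine caching and select execution providers.
import os
import onnxruntime as ort

# Cache built TensorRT engines on disk so they can be reused across runs.
os.environ["ORT_TENSORRT_ENGINE_CACHE_ENABLE"] = "1"

sess = ort.InferenceSession(
    "model.onnx",  # placeholder model path
    providers=["TensorrtExecutionProvider", "CUDAExecutionProvider"],
)
print(sess.get_providers())  # shows which providers are actually active
```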
Commits
  • 5de47af fix quantization of EmbeddingLayerNorm (#5321)
  • c00e13a Cherry pick (batch 2) to rel-1.5.1 (#5290)
  • 389cca7 Handle missing initializers in allocation planner to fix crashes with DML pro...
  • b648fe5 ORT DirectML EP for Iron release, ONNX 1.5 (part 2) (#5263)
  • eb75b49 Fix bug in the back to back quantization of matmul and conv (#5264)
  • 47447da bump version to 1.5.1 (#5258)
  • 87b15f3 Fix reshape fusion crash (#5252)
  • fc259de Fix possible ios build break after update to Xcode 12 (#5246)
  • 9fd76c8 Place Shape's output in CPU memory (#5245)
  • 9158679 Update BUILD.md training dependency info. (#5240)
  • Additional commits viewable in compare view


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting @dependabot rebase.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot use these labels` will set the current labels as the default for future PRs for this repo and language
  • `@dependabot use these reviewers` will set the current reviewers as the default for future PRs for this repo and language
  • `@dependabot use these assignees` will set the current assignees as the default for future PRs for this repo and language
  • `@dependabot use this milestone` will set the current milestone as the default for future PRs for this repo and language
  • `@dependabot badge me` will comment on this PR with code to add a "Dependabot enabled" badge to your readme

Additionally, you can set the following in your Dependabot [dashboard](https://app.dependabot.com):

  • Update frequency (including time of day and day of week)
  • Pull request limits (per update run and/or open at any time)
  • Out-of-range updates (receive only lockfile updates, if desired)
  • Security updates (receive only security updates, if desired)
dependabot-preview[bot] commented 3 years ago

OK, I won't notify you again about this release, but will get in touch when a new version is available. If you'd rather skip all updates until the next major or minor version, let me know by commenting @dependabot ignore this major version or @dependabot ignore this minor version.

If you change your mind, just re-open this PR and I'll resolve any conflicts on it.