bit-bots / YOEO

YouOnlyEncodeOnce - A CNN for Embedded Object Detection and Semantic Segmentation
GNU General Public License v3.0

Bump onnxruntime-gpu from 1.8.1 to 1.9.0 #11

Closed · dependabot[bot] closed 2 years ago

dependabot[bot] commented 2 years ago

Bumps onnxruntime-gpu from 1.8.1 to 1.9.0.

Release notes

Sourced from onnxruntime-gpu's releases.

ONNX Runtime v1.9.0

Announcements

  • GCC version < 7 is no longer supported
  • CMAKE_SYSTEM_PROCESSOR needs to be set when cross-compiling on Linux, because pytorch cpuinfo was introduced as a dependency for ARM big.LITTLE support. Set it to the output of `uname -m` on your target device.

General

  • ONNX 1.10 support
    • opset 15
    • ONNX IR 8 (SparseTensor type and model-local function protos; the Optional type is not yet fully supported in this release)
  • Improved documentation of C/C++ APIs
  • IBM Power support
  • WinML - DLL dependency fix supports learning models on Windows 8.1
  • Support for sub-building onnxruntime-extensions and statically linking into onnxruntime binary for custom builds
    • Add --use_extensions option to run models with custom operators implemented in onnxruntime-extensions

APIs

  • Registration of a custom allocator for sharing between multiple sessions. (See RegisterAllocator and UnregisterAllocator APIs in onnxruntime_c_api.h)
  • SessionOptionsAppendExecutionProvider_TensorRT API is deprecated; use SessionOptionsAppendExecutionProvider_TensorRT_V2
  • New APIs: SessionOptionsAppendExecutionProvider_TensorRT_V2, CreateTensorRTProviderOptions, UpdateTensorRTProviderOptions, GetTensorRTProviderOptionsAsString, ReleaseTensorRTProviderOptions, EnableOrtCustomOps, RegisterAllocator, UnregisterAllocator, IsSparseTensor, CreateSparseTensorAsOrtValue, FillSparseTensorCoo, FillSparseTensorCsr, FillSparseTensorBlockSparse, CreateSparseTensorWithValuesAsOrtValue, UseCooIndices, UseCsrIndices, UseBlockSparseIndices, GetSparseTensorFormat, GetSparseTensorValuesTypeAndShape, GetSparseTensorValues, GetSparseTensorIndicesTypeShape, GetSparseTensorIndices

Performance and quantization

  • Performance improvement on ARM
    • Added S8S8 (signed int8, signed int8) matmul kernel. This avoids extending uint8 to int16, for better performance on ARM64 CPUs without dot-product instructions
    • Expanded GEMM udot kernel to 8x8 accumulator
    • Added sgemm and qgemm optimized kernels for ARM64EC
  • Operator improvements
    • Improved performance for quantized operators: DynamicQuantizeLSTM, QLinearAvgPool
    • Added new quantized operator QGemm for quantizing Gemm directly
    • Fused HardSigmoid and Conv
  • Quantization tool - subgraph support
  • Transformers tool improvements
    • Fused Attention for BART encoder and Megatron GPT-2
    • Integrated mixed precision ONNX conversion and parity test for GPT-2
    • Updated graph fusion for embed layer normalization for BERT
    • Improved symbolic shape inference for operators: Attention, EmbedLayerNormalization, Einsum and Reciprocal

Packages

  • Official ORT GPU packages (except Python) now include both CUDA and TensorRT Execution Providers.
    • Python packages will be updated next release. Please note that EPs should be explicitly registered to ensure the correct provider is used.
  • GPU packages are built with CUDA 11.4 and should be compatible with 11.x on systems with the minimum required driver version. See: CUDA minor version compatibility
  • Pypi
    • ORT + DirectML Python packages now available: onnxruntime-directml
    • GPU package can be used on both CPU-only and GPU machines
  • Nuget
    • C#: Added support for using netstandard2.0 as a target framework
    • Windows symbol (PDB) files are now shipped in a separate Nuget package, reducing the size of the binary Nuget package by 85%
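Since the 1.9 GPU packages bundle both the CUDA and TensorRT Execution Providers, the note above recommends registering EPs explicitly rather than relying on a default. A minimal sketch of what that looks like from Python (the model path is hypothetical, and the session-creation lines are shown commented out because they require `onnxruntime-gpu` and a model file to be present):

```python
# Execution providers in priority order: ONNX Runtime tries each one in
# turn and falls back to the next if a provider's libraries are not
# available on the machine.
preferred_providers = [
    "TensorRTExecutionProvider",  # needs TensorRT libraries installed
    "CUDAExecutionProvider",      # needs CUDA 11.x and a compatible driver
    "CPUExecutionProvider",       # always-available fallback
]

# With onnxruntime-gpu >= 1.9 installed, a session would be created as:
#   import onnxruntime as ort
#   session = ort.InferenceSession("model.onnx", providers=preferred_providers)
#   session.get_providers()  # lists the providers actually in use
```

Listing "CPUExecutionProvider" last keeps inference working on CPU-only machines, which matters here because the same GPU wheel can be used on both CPU-only and GPU hosts.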

Execution Providers

  • CUDA EP

... (truncated)

Commits


Dependabot compatibility score

Dependabot will resolve any conflicts with this PR as long as you don't alter it yourself. You can also trigger a rebase manually by commenting `@dependabot rebase`.


Dependabot commands and options
You can trigger Dependabot actions by commenting on this PR:

  • `@dependabot rebase` will rebase this PR
  • `@dependabot recreate` will recreate this PR, overwriting any edits that have been made to it
  • `@dependabot merge` will merge this PR after your CI passes on it
  • `@dependabot squash and merge` will squash and merge this PR after your CI passes on it
  • `@dependabot cancel merge` will cancel a previously requested merge and block automerging
  • `@dependabot reopen` will reopen this PR if it is closed
  • `@dependabot close` will close this PR and stop Dependabot recreating it. You can achieve the same result by closing it manually
  • `@dependabot ignore this major version` will close this PR and stop Dependabot creating any more for this major version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this minor version` will close this PR and stop Dependabot creating any more for this minor version (unless you reopen the PR or upgrade to it yourself)
  • `@dependabot ignore this dependency` will close this PR and stop Dependabot creating any more for this dependency (unless you reopen the PR or upgrade to it yourself)
Flova commented 2 years ago

@dependabot rebase