RadeonML: AMD Inference Library
1. C/C++ API
2. Code examples
2.1. List of supported models for load_model sample
- vgg16.onnx, vgg19.onnx
- Supported backends: DirectML
- Input format: NCHW (Nx3x224x224)
- vgg16_opset11.onnx, vgg19_opset11.onnx
- Supported backends: DirectML
- Input format: NHWC (Nx224x224x3)
- mobilenet_v1_1.0_224.onnx, mobilenet_v2_1.0_224.onnx
- Supported backends: DirectML
- Input format: NCHW (Nx3x224x224)
- mobilenet_v1_opset11.onnx, mobilenet_v2_opset11.onnx
- Supported backends: DirectML
- Input format: NHWC (Nx224x224x3)
- resnet18v1.onnx, resnet34v1.onnx, resnet50v1.onnx, resnet101v1.onnx, resnet152v1.onnx
- resnet18v2.onnx, resnet34v2.onnx, resnet50v2.onnx, resnet101v2.onnx, resnet152v2.onnx
- Supported backends: DirectML
- Input format: NCHW (Nx3x224x224)
- inception_v1_opset8.onnx, inception_v2_opset8.onnx
- Supported backends: DirectML
- Input format: NCHW (Nx3x224x224)
- inception_v1_opset11.onnx, inception_v2_opset11.onnx
- Supported backends: DirectML
- Input format: NHWC (Nx224x224x3)
- inception_v3_opset11.onnx, inception_v4_opset11.onnx
- Supported backends: DirectML
- Input format: NHWC (Nx299x299x3)
- denoise_c3_ldr.onnx, denoise_c3_hdr.onnx
- Supported backends: DirectML, MIOpen, MPS
- Input format: NHWC (NxHxWx3)
- denoise_c9_ldr.onnx, denoise_c9_hdr.onnx
- Supported backends: DirectML, MIOpen, MPS
- Input format: NHWC (NxHxWx9)
- upscale2x_c3_rt.onnx, upscale2x_c3_esrgan_small.onnx
- Supported backends: DirectML, MIOpen, MPS
- Input format: NHWC
To run inference on a supported model, substitute 'path/model', 'path/input', and 'path/output' in the load_model sample with the correct paths.
Additional information about supported models: https://github.com/onnx/models
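The list above mixes NCHW and NHWC input formats. As a rough illustration, independent of the RadeonML API, the sketch below shows how element offsets differ between the two layouts of the same contiguous buffer; the 1x3x224x224 shape is just the VGG16 example from the list.

```cpp
#include <cstddef>
#include <cstdio>

// Offset of element (n, c, h, w) in a contiguous NCHW buffer.
size_t OffsetNCHW(size_t n, size_t c, size_t h, size_t w,
                  size_t C, size_t H, size_t W)
{
    return ((n * C + c) * H + h) * W + w;
}

// Offset of the same logical element in a contiguous NHWC buffer.
size_t OffsetNHWC(size_t n, size_t c, size_t h, size_t w,
                  size_t C, size_t H, size_t W)
{
    return ((n * H + h) * W + w) * C + c;
}

int main()
{
    // vgg16.onnx expects NCHW Nx3x224x224; with N = 1 the buffer holds
    // 1 * 3 * 224 * 224 = 150528 floats in either layout.
    const size_t C = 3, H = 224, W = 224;
    std::printf("NCHW offset: %zu, NHWC offset: %zu\n",
                OffsetNCHW(0, 1, 10, 20, C, H, W),
                OffsetNHWC(0, 1, 10, 20, C, H, W));
    return 0;
}
```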
3. System requirements
- Windows 10 19H1 or later (for the DirectML backend)
- Windows 7 (uses the MIOpen backend)
- Ubuntu 18.04
- CentOS/RHEL 7.6, 7.7
- macOS Mojave and Catalina
3.1 Features supported
- ONNX support (opset 6-11)
- TensorFlow frozen graph (.pb) files
- FP32 and FP16 for ONNX
For more information, see the documentation at https://radeon-pro.github.io/RadeonProRenderDocs/rml/about.html
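Since both FP32 and FP16 tensors are supported for ONNX, it can be handy to down-convert input data on the application side. Below is a minimal, deliberately simplified float32-to-float16 sketch (truncating rounding, denormals flushed to zero, NaN collapsed to infinity); production code should use a proper conversion routine.

```cpp
#include <cstdint>
#include <cstring>

// Simplified IEEE 754 float32 -> float16 conversion.
// Caveats (kept simple on purpose): truncates the mantissa instead of
// rounding to nearest even, flushes half-precision denormals to zero,
// and collapses NaN to infinity.
uint16_t FloatToHalf(float value)
{
    uint32_t bits;
    std::memcpy(&bits, &value, sizeof(bits));

    const uint32_t sign = (bits >> 16) & 0x8000u;
    const int32_t  exp  = (int32_t)((bits >> 23) & 0xFFu) - 127 + 15;
    const uint32_t mant = bits & 0x007FFFFFu;

    if (exp <= 0)  return (uint16_t)sign;             // underflow -> signed zero
    if (exp >= 31) return (uint16_t)(sign | 0x7C00u); // overflow/Inf/NaN -> Inf
    return (uint16_t)(sign | ((uint32_t)exp << 10) | (mant >> 13));
}
```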
3.2 Features supported by OS
- Windows DirectML supports our denoisers, upscalers, and common models such as ResNet, VGG, etc.
- The MIOpen backend (Windows and Linux) only supports our denoisers and upscalers. When creating an RML context, if DML is not supported we fall back automatically to MIOpen (see the sketch after the table below).
- The MPS backend only supports our denoisers.

Models supported by the different backends:
| Model           | DirectML | MIOpen | MPS |
|-----------------|----------|--------|-----|
| Inception V1    | Yes      | Yes    | No  |
| Inception V2    | Yes      | Yes    | No  |
| Inception V3    | Yes      | Yes    | No  |
| Inception V4    | Yes      | Yes    | No  |
| MobileNet V1    | Yes      | Yes    | No  |
| MobileNet V2    | Yes      | Yes    | No  |
| ResNet V1 50    | Yes      | No     | No  |
| ResNet V2 50    | Yes      | No     | No  |
| VGG 16          | Yes      | No     | No  |
| VGG 19          | Yes      | No     | No  |
| UNet (denoiser) | Yes      | Yes    | Yes |
| ESRGAN          | Yes      | Yes    | Yes |
| RTUnet          | Yes      | Yes    | Yes |
Other models may work, since they use similar operators, but we have not verified them.
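For illustration, here is a minimal sketch of context creation relying on the automatic DML-to-MIOpen fallback mentioned above. The identifiers used (rml/RadeonML.h, rml_context, rmlCreateDefaultContext, rmlReleaseContext, RML_OK) follow the C API naming style but are assumptions here, not verified signatures; check the headers shipped with the SDK for the actual entry points.

```cpp
#include <rml/RadeonML.h> // assumed header name

int main()
{
    rml_context context = nullptr; // assumed handle type

    // Assumed call: the library is expected to pick DirectML where
    // available and fall back to MIOpen otherwise, as described above.
    if (rmlCreateDefaultContext(&context) != RML_OK) // assumed status code
    {
        return 1; // no usable backend on this system
    }

    // ... load a model and run inference with this context ...

    rmlReleaseContext(context); // assumed cleanup call
    return 0;
}
```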
3.3 DirectML and DirectX 12 interop
- A device and a command queue can be passed when creating an RML context. We support both the async compute queue type (D3D12_COMMAND_LIST_TYPE_COMPUTE) and the default direct queue type (D3D12_COMMAND_LIST_TYPE_DIRECT).
A compute queue is preferred, as it runs asynchronously alongside any graphics work.
If no queue is passed, RML creates a compute queue.
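A minimal sketch of creating the device and compute queue on the application side with standard D3D12 calls; how the pair is then handed to the RML context creation call is backend-specific and not shown here, since that part of the API is not spelled out in this README.

```cpp
#include <d3d12.h>
#include <wrl/client.h>

using Microsoft::WRL::ComPtr;

int main()
{
    // Default adapter, minimum feature level sufficient for compute work.
    ComPtr<ID3D12Device> device;
    if (FAILED(D3D12CreateDevice(nullptr, D3D_FEATURE_LEVEL_11_0,
                                 IID_PPV_ARGS(&device))))
    {
        return 1;
    }

    // Prefer an async compute queue so inference overlaps graphics work;
    // D3D12_COMMAND_LIST_TYPE_DIRECT is also supported.
    D3D12_COMMAND_QUEUE_DESC desc = {};
    desc.Type = D3D12_COMMAND_LIST_TYPE_COMPUTE;

    ComPtr<ID3D12CommandQueue> queue;
    if (FAILED(device->CreateCommandQueue(&desc, IID_PPV_ARGS(&queue))))
    {
        return 1;
    }

    // Pass device.Get() and queue.Get() when creating the RML context.
    return 0;
}
```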
4. Building and running the samples
You will need at least CMake 3.10 to build the samples.
The input must contain the contiguous data of a tensor with the specified dimensions.
The input .bin files in the repo don't necessarily represent real data at the moment; they only show how to format the data (see the sketch below).
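As an illustration of that format, the sketch below writes a correctly sized contiguous FP32 buffer to a .bin file. The specifics assumed here (FP32 elements, native byte order, no header) match a raw tensor dump but are not confirmed by this README, so verify against the sample inputs.

```cpp
#include <cstddef>
#include <fstream>
#include <vector>

int main()
{
    // Example shape: 1x3x224x224 (the NCHW VGG input from section 2.1).
    const size_t count = 1 * 3 * 224 * 224;

    // In practice, fill this with preprocessed image data.
    std::vector<float> tensor(count, 0.0f);

    // Raw contiguous dump: assumed FP32, native endianness, no header.
    std::ofstream file("input.bin", std::ios::binary);
    file.write(reinterpret_cast<const char*>(tensor.data()),
               static_cast<std::streamsize>(tensor.size() * sizeof(float)));
    return 0;
}
```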
5. Future
We aim to provide the same level of features for every backend and will publish monthly updates toward that goal.