migraphx-benchmark / AMDMIGraphX

AMD's graph optimization engine.
https://rocmsoftwareplatform.github.io/AMDMIGraphX/doc/html/
MIT License

Common API - Network Definition API #188

Open music-dino opened 1 month ago

music-dino commented 1 month ago

Create an MGX API to mimic the TRT network definition APIs. Once the APIs are created, attempt to combine them with the TRT ONNX parser so that the network definition and layer manipulation APIs can be used on a network created by parsing an ONNX file.

music-dino commented 1 month ago

Input

Wraps an mgx::parameter

kACTIVATION

Activations that require non-default alpha and/or beta params must have them set after the layer is created, via the appropriate method.
Type can be changed after the layer is created.
ActivationTypes that map to a single MGX operator:

ActivationTypes that map to a combination of MGX operators:

Unsure:

Not part of ONNX:

kLRN

Maps to a single MGX operator - "lrn".
WindowSize, Alpha, Beta, and K can be changed after the layer is created.

kSCALE

Doesn't seem to have an ONNX equivalent

kSOFTMAX

Axis must be set after the layer is created.
Maps to a single MGX operator - "softmax".

kCONCATENATION

Axis must be set after the layer is created.
Maps to a single MGX operator - "concat".

kELEMENTWISE

Operation can be changed after the layer is created.
ElementWiseOperations mapping to mgx operators:

ElementWiseOperations without mapping to mgx operators:

kUNARY

Operation can be changed after the layer is created.
UnaryOperations mapping to mgx operators:

kSHUFFLE

This layer shuffles data by applying in sequence: a transpose operation, a reshape operation and a second transpose operation. The dimension types of the output are those of the reshape dimension.

The layer has an optional second input. If present, it must be a 1D Int32 shape tensor, and the reshape dimensions are taken from it.

Maps to a combination of mgx operators (transpose and reshape).
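The transpose, reshape, transpose sequence can be sketched on a flat row-major buffer. This is only an illustration of the semantics described above, not MGX or TRT code; the function names and the flat-list tensor representation are made up for the example.

```python
from itertools import product
from math import prod


def transpose(data, shape, perm):
    """Permute the axes of a row-major flat list; returns (data, shape)."""
    new_shape = [shape[p] for p in perm]
    # Row-major strides of the input tensor.
    strides = [prod(shape[i + 1:]) for i in range(len(shape))]
    out = []
    for idx in product(*(range(d) for d in new_shape)):
        # Output axis k reads from input axis perm[k].
        src = sum(idx[k] * strides[perm[k]] for k in range(len(perm)))
        out.append(data[src])
    return out, new_shape


def reshape(data, shape, new_shape):
    """Reinterpret the same row-major data under a new shape."""
    if prod(shape) != prod(new_shape):
        raise ValueError("reshape must preserve the element count")
    return data, list(new_shape)


def shuffle(data, shape, first_perm, reshape_dims, second_perm):
    """Hypothetical helper: first transpose, then reshape, then second transpose."""
    data, shape = transpose(data, shape, first_perm)
    data, shape = reshape(data, shape, reshape_dims)
    return transpose(data, shape, second_perm)
```

For a 2x3 input holding 0..5, `shuffle(list(range(6)), [2, 3], [1, 0], [2, 3], [0, 1])` returns `([0, 3, 1, 4, 2, 5], [2, 3])`. When the optional second input is present, `reshape_dims` would come from that shape tensor instead of a fixed attribute.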

kONEHOT

Axis can be changed after the layer is created.
Maps to a combination of mgx operators.

kREDUCE

Operation, Axes, and KeepDimensions can be changed after the layer is created.
ReduceOperations mapping to mgx operators:

kTOPK

Operation, K, and reduceAxes can be changed after the layer is created.
TopKOperations kMAX and kMIN presumably correspond to the two possible values of the ONNX largest attribute (to be confirmed).
setInput(1) is used when K is to be provided as an input. MGX does not support this.
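If the kMAX/kMIN mapping to largest holds, the behaviour on a 1D input could be sketched as below. The `LARGEST_FOR` table and the `top_k` helper are hypothetical names for illustration only; K is taken as a plain attribute here, since providing K as an input is the unsupported case.

```python
# Assumed mapping from the TRT operation to the ONNX "largest" attribute.
LARGEST_FOR = {"kMAX": True, "kMIN": False}


def top_k(values, k, operation="kMAX"):
    """Return (values, indices) of the k largest or smallest elements."""
    largest = LARGEST_FOR[operation]
    # sorted() is stable, so ties keep their original relative order.
    order = sorted(range(len(values)), key=lambda i: values[i], reverse=largest)[:k]
    return [values[i] for i in order], order
```

With `values = [3, 1, 4, 1, 5]` and `k = 2`, kMAX yields values `[5, 4]` at indices `[4, 2]`, and kMIN yields values `[1, 1]` at indices `[1, 3]`.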

kGATHER

GatherMode must be set after the layer is created, unless kDEFAULT is desired.
GatherAxis and NbElementWiseDims can be changed after the layer is created.
Depending on GatherMode, the layer corresponds to different ONNX ops:

For mode kND the mgx "gathernd" operator can be used directly.
For mode kELEMENT the "gather" operator can be used in combination with other operators. (NOTE: There seems to be a bug in parse_gather_elements.cpp:95, 0 is always passed for axis.)
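To show why the axis matters for the element-wise mode, here is a rough 2D sketch of ONNX GatherElements semantics. It is not the MGX parser code, and `gather_elements_2d` is a made-up helper.

```python
def gather_elements_2d(data, indices, axis):
    """2D GatherElements: the output has the same shape as indices."""
    out = []
    for i, row in enumerate(indices):
        out_row = []
        for j, idx in enumerate(row):
            # axis 0 replaces the row coordinate, axis 1 the column coordinate.
            out_row.append(data[idx][j] if axis == 0 else data[i][idx])
        out.append(out_row)
    return out
```

For `data = [[1, 2], [3, 4]]` and `indices = [[0, 0], [1, 0]]`, axis 1 gives `[[1, 1], [4, 3]]` while axis 0 gives `[[1, 2], [3, 2]]`, so always passing 0 for the axis would silently produce different results for any other axis.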

See above

kRAGGED_SOFTMAX

No equivalent ONNX or MGX operator

kMATRIX_MULTIPLY

Op0 and Op1 can be changed after the layer is created.
MatrixOperations kNONE and kTRANSPOSE can be supported, kVECTOR needs more investigation.
Can be implemented as GEMM with alpha and beta equal to 1.
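A minimal sketch of the GEMM formulation, with kNONE and kTRANSPOSE modelled as plain booleans. The `gemm` helper, its signature, and the optional `c` operand are assumptions for illustration; kVECTOR is left out because it still needs investigation.

```python
def transpose2d(matrix):
    return [list(row) for row in zip(*matrix)]


def gemm(a, b, trans_a=False, trans_b=False, alpha=1.0, beta=1.0, c=None):
    """Compute alpha * op(a) @ op(b) + beta * c on nested lists."""
    if trans_a:
        a = transpose2d(a)
    if trans_b:
        b = transpose2d(b)
    inner = len(b)
    if any(len(row) != inner for row in a):
        raise ValueError("inner dimensions do not match")
    out = [
        [alpha * sum(a[i][k] * b[k][j] for k in range(inner)) for j in range(len(b[0]))]
        for i in range(len(a))
    ]
    if c is not None:
        out = [[out[i][j] + beta * c[i][j] for j in range(len(out[0]))] for i in range(len(out))]
    return out
```

With alpha and beta left at 1 and no `c` operand, this reduces to a plain matrix product of the (optionally transposed) operands.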

kNON_ZERO

Implemented via the "nonzero" mgx op if the input is not a constant; otherwise a literal is inserted.

kCONSTANT

Wraps an mgx literal.
Dimensions and weights can be changed after the layer is created.

kIDENTITY

Directly maps to mgx id operator

kCAST

toType can be changed after the layer is created.
Maps to the mgx "convert" operator.

kPARAMETRIC_RELU

Maps to mgx "prelu" operator

kCONVOLUTION

Input, nbOutputMaps, kernelSize, kernelWeights, and biasWeights can be changed after the layer is created.
Dilations, nbGroups, PaddingMode, PaddingNd, PostPadding, PrePadding, and StrideNd must all be set after the layer is created if non-default values are to be used.
If setPaddingMode and setPrePadding/setPostPadding are both used, PaddingMode takes precedence.
The supported TRT padding modes are:

PrePadding and PostPadding amount to the same thing as the pads attribute in ONNX. The difference is that ONNX prohibits pads from being used when auto_pad is used, while TRT gives precedence to auto_pad/PaddingMode if both are used.
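The precedence rule can be sketched for a single spatial dimension. The mode names below are the ONNX auto_pad values SAME_UPPER and SAME_LOWER, used as stand-ins since the TRT mode list is not reproduced here; `resolve_padding` and `same_padding` are hypothetical helpers, and the formula is the ONNX auto_pad definition rather than anything taken from TRT or MGX.

```python
from math import ceil


def same_padding(in_size, kernel, stride=1, dilation=1, upper=True):
    """Padding that makes the output size equal ceil(in_size / stride)."""
    out_size = ceil(in_size / stride)
    effective_kernel = (kernel - 1) * dilation + 1
    total = max((out_size - 1) * stride + effective_kernel - in_size, 0)
    small, large = total // 2, total - total // 2
    # SAME_UPPER puts the extra element at the end, SAME_LOWER at the start.
    return (small, large) if upper else (large, small)


def resolve_padding(in_size, kernel, stride=1, dilation=1, mode=None, pre=0, post=0):
    """A padding mode, when given, wins over explicit pre/post padding."""
    if mode in ("SAME_UPPER", "SAME_LOWER"):
        return same_padding(in_size, kernel, stride, dilation, upper=(mode == "SAME_UPPER"))
    return pre, post
```

For an input of size 4, kernel 3 and stride 2, `resolve_padding(4, 3, stride=2, mode="SAME_UPPER", pre=5, post=5)` returns `(0, 1)` and ignores the explicit values, whereas an ONNX model with both auto_pad and pads set would be rejected as invalid.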

kPOOLING

Corresponds to ONNX MaxPool, AveragePool, GlobalMaxPool and GlobalAveragePool.
For global pooling a window size equal to the input size should be used.
PoolingType kMAX_AVERAGE_BLEND does not correspond to an ONNX operator.
LpPooling is not supported.
PoolingType and window size can be changed after the layer is created.
AverageCountExcludesPadding, PrePadding, PostPadding, PaddingMode, and Stride must be set after the layer is created if non-default values are to be used.
For padding information see addConvolutionNd.
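The global pooling remark can be illustrated with a small unpadded 2D max pool, where the global variant is just the windowed one called with the full input size. Both function names are made up for this sketch.

```python
def max_pool2d(image, window_h, window_w, stride_h=1, stride_w=1):
    """Unpadded 2D max pooling over a nested list."""
    rows, cols = len(image), len(image[0])
    out = []
    for top in range(0, rows - window_h + 1, stride_h):
        out_row = []
        for left in range(0, cols - window_w + 1, stride_w):
            out_row.append(max(
                image[r][c]
                for r in range(top, top + window_h)
                for c in range(left, left + window_w)
            ))
        out.append(out_row)
    return out


def global_max_pool2d(image):
    # Global pooling: the window covers the whole spatial extent.
    return max_pool2d(image, len(image), len(image[0]))
```

The same idea would apply to average pooling, with the window again set to the input's spatial dimensions.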

kDECONVOLUTION

Corresponds to the ONNX ConvTranspose operator.
Setting the output dimensions is not supported in TRT.
For other details see addConvolutionNd.

kSCALE

TODO

kRESIZE

Implemented as a combination of mgx operators.
OutputDimensions, Scales, ResizeMode, CoordinateTransformation, SelectorForSinglePixel, NearestRounding, CubicCoeff, and ExcludeOutside must be set after the layer is created if non-default values are to be used.
InterpolationMode corresponds to ONNX mode, all are supported.
ResizeCoordinateTransformation corresponds to ONNX coordinate_transformation_mode, only half_pixel, asymmetric and align_corners are supported.
ResizeRoundMode corresponds to ONNX nearest_mode, all are supported.
Antialiasing is not supported.
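For reference, the three supported coordinate transformations can be written out per axis following the ONNX Resize definitions. `to_input_coord` is a hypothetical helper, and the guard for an output length of 1 in align_corners is a choice made here to avoid a division by zero, not something taken from the spec.

```python
def to_input_coord(x_out, in_len, out_len, mode):
    """Map an output coordinate back to a (fractional) input coordinate."""
    scale = out_len / in_len
    if mode == "half_pixel":
        return (x_out + 0.5) / scale - 0.5
    if mode == "asymmetric":
        return x_out / scale
    if mode == "align_corners":
        if out_len == 1:
            return 0.0  # guard chosen for this sketch
        return x_out * (in_len - 1) / (out_len - 1)
    raise ValueError(f"unsupported coordinate transformation: {mode}")
```

Upsampling a length-4 axis to length 8, output coordinate 0 maps to -0.25 under half_pixel, output coordinate 3 maps to 1.5 under asymmetric, and output coordinate 7 maps to 3.0 under align_corners.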

kLOOP

TODO

kCONDITION

TODO

kSELECT

Equivalent to the "where" mgx/ONNX op

kASSERTION

No equivalent ONNX or MGX operator

kFILL

Covers the RandomNormal and RandomNormalLike ONNX operators for FillOperation::kRANDOM_NORMAL, and RandomUniform and RandomUniformLike for FillOperation::kRANDOM_UNIFORM.
The value of the seed attribute, if present, is ignored.
In MGX a literal with values from the appropriate distribution is created.
Operation and outputType can be changed after the layer is created.
Alpha and beta values must be set after the layer is created if non-default values are to be used.
Alpha and beta have different meanings depending on FillOperation:

kPADDING

Pre and post padding can be changed after the layer is created.
Only does zero padding.

kDEQUANTIZE

TODO

kSCATTER

Covers the ONNX ScatterElements and ScatterND operators for ScatterMode::kELEMENT and ScatterMode::kND respectively.
Reduction is not supported.
Mode can be changed after the layer is created.
Axis must be set after the layer is created if a non-default value is to be used.
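As a reference for the element-wise mode, here is a rough 2D sketch of ONNX ScatterElements without reduction. `scatter_elements_2d` is a made-up helper, not MGX code.

```python
def scatter_elements_2d(data, indices, updates, axis=0):
    """2D ScatterElements without reduction: later writes overwrite earlier ones."""
    out = [list(row) for row in data]  # leave the input untouched
    for i, row in enumerate(indices):
        for j, idx in enumerate(row):
            if axis == 0:
                out[idx][j] = updates[i][j]
            else:
                out[i][idx] = updates[i][j]
    return out
```

Scattering `updates = [[1.0, 1.1, 1.2], [2.0, 2.1, 2.2]]` into a 3x3 zero tensor with `indices = [[1, 0, 2], [0, 2, 1]]` along axis 0 gives `[[2.0, 1.1, 0.0], [1.0, 0.0, 2.2], [0.0, 2.1, 1.2]]`.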

...

music-dino commented 1 month ago

The initial approach will be to wrap both layers and tensors around instructions. Layers that are implemented by more than one mgx operator will hold references to the starting and ending instructions of the block of instructions that they are composed of, while the output tensor(s) will hold references to only the final instruction. When a layer is modified after creation by use of a setter method, the existing mgx IR will be modified by use of replace_instruction.

This approach allows us to use existing infrastructure both for instruction modification, which needs to be reflected in downstream instructions if the output type changes, and for creating composite operators, by reusing existing parsing code.

A difficulty for any approach, and this one especially, is ONNX operators that hold subgraphs. An attempt will be made to validate whether this approach can work for them by implementing the ILoopLayer. A good idea of how this layer works can be found in the trt onnx parser source code for parsing an ONNX Loop: https://github.com/onnx/onnx-tensorrt/blob/7ecb49a435bd881b9ac4011450315192885e5cc3/onnxOpImporters.cpp#L2810
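A toy model of this wrapping scheme is sketched below. None of these classes are the real MGX or TRT types: `Instruction`, `Module`, `Tensor` and `SoftmaxLayer` are hypothetical stand-ins, and the toy `replace_instruction` only mimics the idea of rewriting an instruction in place so that downstream users keep a valid reference.

```python
class Instruction:
    def __init__(self, op, inputs=(), **attrs):
        self.op = op
        self.inputs = list(inputs)
        self.attrs = dict(attrs)


class Module:
    """Toy IR container (hypothetical, not the MGX module class)."""

    def __init__(self):
        self.instructions = []

    def add_instruction(self, op, inputs=(), **attrs):
        ins = Instruction(op, inputs, **attrs)
        self.instructions.append(ins)
        return ins

    def replace_instruction(self, ins, op, inputs=(), **attrs):
        # Rewrite in place: anything already referencing `ins` sees the change.
        ins.op, ins.inputs, ins.attrs = op, list(inputs), dict(attrs)
        return ins


class Tensor:
    """Holds a reference to the final instruction that produces it."""

    def __init__(self, instruction):
        self.instruction = instruction


class SoftmaxLayer:
    """Single-operator layer, so the first and last instruction coincide."""

    def __init__(self, module, input_tensor, axis=0):
        self._module = module
        self._input = input_tensor
        self.first = self.last = module.add_instruction(
            "softmax", [input_tensor.instruction], axis=axis)
        self.output = Tensor(self.last)

    def set_axis(self, axis):
        # The setter rewrites the existing IR instead of appending to it.
        self._module.replace_instruction(
            self.last, "softmax", [self._input.instruction], axis=axis)
```

A layer built from several operators would keep distinct `first` and `last` references and rebuild the block between them in its setters; the output tensor would still point only at `last`.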

music-dino commented 2 weeks ago

The way INetworkDefinition tracks input and output tensors needs to be modified. MGX does not guarantee that output parameters will be inserted after input parameters, nor, seemingly, any relative ordering of the parameters.
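One possible direction, sketched under assumptions: the network definition keeps its own ordered lists of inputs and outputs instead of deriving indices from the order in which the backend enumerates parameters. `ToyModule` and `NetworkDefinition` below are hypothetical stand-ins; the toy module sorts parameter names purely to model a backend whose ordering cannot be relied on.

```python
class ToyModule:
    """Stand-in for a backend that enumerates parameters in its own order."""

    def __init__(self):
        self._params = {}

    def add_parameter(self, name, shape):
        self._params[name] = shape

    def get_parameter_names(self):
        # Sorted on purpose: insertion order is not preserved by this toy.
        return sorted(self._params)


class NetworkDefinition:
    """Tracks inputs and outputs explicitly, in the order the user gave them."""

    def __init__(self):
        self.module = ToyModule()
        self._inputs = []
        self._outputs = []

    def add_input(self, name, shape):
        self.module.add_parameter(name, shape)
        self._inputs.append(name)
        return name

    def mark_output(self, name):
        if name not in self._outputs:
            self._outputs.append(name)

    def get_input(self, index):
        return self._inputs[index]

    def get_output(self, index):
        return self._outputs[index]

    @property
    def nb_inputs(self):
        return len(self._inputs)

    @property
    def nb_outputs(self):
        return len(self._outputs)
```

With inputs added as "z" then "a", `get_input(0)` still returns "z" even though the toy module lists its parameters as `["a", "z"]`.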