Many ROCm libraries, including MIOpen and rocBLAS, support modes in which kernel efficiency is calculated and displayed. This makes it easy to identify low-hanging fruit for performance optimization. For example, kernels with an efficiency below 50% should be investigated. The lower the kernel efficiency, the more likely it is that the kernel needs investigation or improvement.
The feature request in this issue is to support such an introspection mode in MIGraphX as well.
A foundational element of this is issue https://github.com/ROCmSoftwarePlatform/AMDMIGraphX/issues/844, which will allow an objective measurement of GPU time.
A further required element is an estimate of how much data the kernel processes. This can be deduced either "by hand" from the problem size (MIOpen does this, for example) or, for extra credit, from GPU profiling. In either case, a measure of kernel efficiency can be produced and reported.
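To make the request concrete, here is a minimal sketch of how such a report could be computed. It assumes that efficiency is defined as achieved memory bandwidth divided by the device's peak bandwidth, which is one possible definition and not necessarily the one MIOpen or rocBLAS use. The names `KernelSample`, `efficiency`, and `report`, the kernel labels, and the peak bandwidth value are all hypothetical and not part of any existing API.

```python
from dataclasses import dataclass


@dataclass
class KernelSample:
    name: str          # kernel label (hypothetical)
    bytes_moved: int   # bytes read + written, deduced "by hand" from the problem size
    gpu_time_s: float  # measured GPU time in seconds


def efficiency(sample: KernelSample, peak_bytes_per_s: float) -> float:
    """Achieved bandwidth as a fraction of the assumed peak bandwidth."""
    achieved = sample.bytes_moved / sample.gpu_time_s
    return achieved / peak_bytes_per_s


def report(samples, peak_bytes_per_s, threshold=0.5):
    """Return (name, efficiency, flagged) tuples, least efficient first."""
    rows = [(s.name, efficiency(s, peak_bytes_per_s)) for s in samples]
    rows.sort(key=lambda row: row[1])
    return [(name, eff, eff < threshold) for name, eff in rows]


if __name__ == "__main__":
    PEAK = 1.0e12  # assumed peak bandwidth in bytes/s, not a real device figure
    samples = [
        # two float32 inputs and one output of 1M elements each
        KernelSample("add", 3 * 4 * 1_000_000, 2.0e-5),
        # one float32 input and one output of 1M elements each
        KernelSample("transpose", 2 * 4 * 1_000_000, 4.0e-5),
    ]
    for name, eff, flagged in report(samples, PEAK):
        marker = "<-- investigate" if flagged else ""
        print(f"{name:12s} {eff:6.1%} {marker}")
```

Sorting by ascending efficiency puts the most likely optimization candidates at the top, and the 50% threshold mirrors the rule of thumb above. Compute-bound kernels would need a FLOP-based definition instead, which this sketch does not cover.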