Enable runtime gpu_arch auto-select based on devices where kernels are executing for gemm_int4 tests; enable device-specific compilation using USE_XETLA (xe_lpg, xe_hpg, xe_hpc). #302
Type of Change
Change #1: Enable runtime gpu_arch auto-selection for the gemm_int4 tests, based on the device on which the kernels execute.
Change #2: Enable device-specific compilation using USE_XETLA (xe_lpg, xe_hpg, xe_hpc) to fix the current problem of tests not matching the device type.
Description
Add a template template class that wraps <gpu_arch, mma_engine> so that gpu_arch is selected automatically at runtime, based on the device on which the kernels execute.
Add USE_XETLA options for compiling for different devices (xe_lpg, xe_hpg, xe_hpc).
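For reviewers unfamiliar with the pattern, the runtime selection described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the enums are local stand-ins for the library's gpu_arch and mma_engine, the names gemm_int4_test and arch_dispatcher are hypothetical, and the architecture-to-engine pairing shown is an assumption. In the real tests the detected architecture would come from the device the queue runs on.

```cpp
#include <stdexcept>
#include <string>

// Local stand-ins for the library's enums (assumed names, for illustration).
enum class gpu_arch { XeLpg, XeHpg, XeHpc };
enum class mma_engine { fpu, xmx };

constexpr const char* arch_name(gpu_arch a) {
    switch (a) {
        case gpu_arch::XeLpg: return "XeLpg";
        case gpu_arch::XeHpg: return "XeHpg";
        case gpu_arch::XeHpc: return "XeHpc";
    }
    return "unknown";
}

constexpr const char* engine_name(mma_engine e) {
    return e == mma_engine::xmx ? "xmx" : "fpu";
}

// Hypothetical test body, parameterized at compile time on <gpu_arch, mma_engine>.
// Here it only reports which instantiation was chosen.
template <gpu_arch Arch, mma_engine Engine>
struct gemm_int4_test {
    static std::string run() {
        return std::string(arch_name(Arch)) + "/" + engine_name(Engine);
    }
};

// Hypothetical wrapper: takes the test as a template template parameter and
// picks the instantiation that matches the architecture detected at runtime.
template <template <gpu_arch, mma_engine> class Test>
struct arch_dispatcher {
    static std::string run(gpu_arch detected) {
        switch (detected) {
            // Assumed pairing: fpu on XeLpg, xmx on XeHpg and XeHpc.
            case gpu_arch::XeLpg: return Test<gpu_arch::XeLpg, mma_engine::fpu>::run();
            case gpu_arch::XeHpg: return Test<gpu_arch::XeHpg, mma_engine::xmx>::run();
            case gpu_arch::XeHpc: return Test<gpu_arch::XeHpc, mma_engine::xmx>::run();
        }
        throw std::invalid_argument("unknown gpu_arch");
    }
};
```

With this sketch, `arch_dispatcher<gemm_int4_test>::run(gpu_arch::XeHpg)` instantiates the test for XeHpg with the xmx engine and returns "XeHpg/xmx". Every instantiation is compiled, but only the one matching the detected device runs.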
Expected Behavior & Potential Risk
No foreseeable risk related to CMake compilation or code execution.
How has this PR been tested?
Tested on MTL and DG2 devices.
Dependency Change?
No libraries changed.