Closed (moderato closed this issue 5 months ago)
Hi there, I saw there is some code in the sandbox/power10 folder for BF16 GEMM. I suppose that is just for POWER10 machines? Is it possible to build and run code with bli_sbgemm on AMD CPU? Thanks!
Thanks for your question. Yes, that code is specific to POWER10 systems. The author (@nicholaiTukanov) likely did not intend for it to run on AMD CPUs. That said, we always encourage power users (pun not intended) to tinker around and see what you can get working!
Thanks for the reply. Does that mean there is no BF16 support for AMD CPUs for now?
Hi Zhongyi Lin,
You can use the BF16 implementation designed for zen4 and above, which is available in the aocl_gemm addon in amd/blis: https://github.com/amd/blis/tree/master/addon/aocl_gemm
You can clone the AMD version of BLIS, build it with the aocl_gemm addon, and call one of the APIs below, which take similar arguments. You can pass null for the post-ops structure argument if you only need GEMM. The API definitions are available in this file: https://github.com/amd/blis/blob/master/addon/aocl_gemm/aocl_gemm_interface_apis.h
aocl_gemm_bf16bf16f32of32( ) - This API accumulates at float (f32) precision and gives the output in float (f32).
aocl_gemm_bf16bf16f32obf16( ) - This API accumulates at float (f32) precision and gives the output in bf16 format (which is half the size).
Bhaskar
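To make the difference between the two output formats concrete, here is a minimal reference sketch in plain C. It does not call the aocl_gemm addon; the names bf16_t, f32_to_bf16, bf16_to_f32, ref_gemm_bf16_of32 and ref_gemm_bf16_obf16 are hypothetical helpers made up for illustration, and the simplified argument lists (compact row-major, no alpha/beta, no post-ops) are an assumption, not the addon's signatures. For the real argument lists, check aocl_gemm_interface_apis.h.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical type for this sketch: raw bf16 bits (upper 16 bits of an f32). */
typedef uint16_t bf16_t;

/* f32 -> bf16 with round-to-nearest-even. NaN handling is omitted for brevity. */
bf16_t f32_to_bf16(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    bits += 0x7FFFu + ((bits >> 16) & 1u);
    return (bf16_t)(bits >> 16);
}

/* bf16 -> f32 is exact: the bf16 bits become the upper half of the f32. */
float bf16_to_f32(bf16_t h)
{
    uint32_t bits = (uint32_t)h << 16;
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* C = A * B with bf16 inputs, f32 accumulation, f32 output.
   A is m x k, B is k x n, C is m x n, all compact row-major. */
void ref_gemm_bf16_of32(size_t m, size_t n, size_t k,
                        const bf16_t *a, const bf16_t *b, float *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f; /* accumulate at f32 precision */
            for (size_t p = 0; p < k; ++p)
                acc += bf16_to_f32(a[i * k + p]) * bf16_to_f32(b[p * n + j]);
            c[i * n + j] = acc;
        }
    }
}

/* Same computation, but the f32 accumulator is rounded to bf16 on store,
   so the output buffer is half the size. */
void ref_gemm_bf16_obf16(size_t m, size_t n, size_t k,
                         const bf16_t *a, const bf16_t *b, bf16_t *c)
{
    for (size_t i = 0; i < m; ++i) {
        for (size_t j = 0; j < n; ++j) {
            float acc = 0.0f;
            for (size_t p = 0; p < k; ++p)
                acc += bf16_to_f32(a[i * k + p]) * bf16_to_f32(b[p * n + j]);
            c[i * n + j] = f32_to_bf16(acc);
        }
    }
}
```

In both variants the accumulation happens in f32; the only difference is whether the result is rounded down to bf16 when it is stored. A reference like this can also be handy for checking the output of an optimized kernel on small inputs.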
Hi Bhaskar, thank you for this valuable information. Will try and let you know.