intel / intel-extension-for-pytorch

A Python package for extending the official PyTorch that can easily obtain performance on Intel platform
Apache License 2.0
1.54k stars 234 forks

Is AMX supported for LLM inference? #517

Closed Hyungyo1 closed 19 hours ago

Hyungyo1 commented 7 months ago

Describe the issue

Hi, I have a quick question regarding the LLM inference on CPUs using this extension. I've been digging into the LLM inference case, and it seems like the kernels written in C++ do not run on AMX (AVX512 is the only one I see). For example, _IPEXlinearReluCPU calls the torch.ops.torch_ipex.tpp_linear_relu C++ code which doesn't seem to be running on AMX. Is there any LLM layer that runs on AMX, and if so, which C++ code implements it?

Thank you.

jingxu10 commented 7 months ago

If you use the bfloat16 or int8 data type, PyTorch and IPEX will use AMX.
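As a rough illustration of selecting bfloat16 on CPU, here is a minimal sketch using plain PyTorch autocast. The toy `Linear` + `ReLU` model is a hypothetical stand-in for an LLM layer, not code from this repository, and whether AMX instructions are actually issued still depends on the CPU and on the kernel library chosen at runtime.

```python
import torch

# Hypothetical toy module standing in for an LLM linear + ReLU block.
model = torch.nn.Sequential(
    torch.nn.Linear(64, 128),
    torch.nn.ReLU(),
).eval()

x = torch.randn(4, 64)

# CPU autocast runs eligible ops (such as linear) in bfloat16.
# This only selects the data type; it does not by itself prove that
# AMX is used, which depends on hardware and kernel dispatch.
with torch.no_grad(), torch.autocast(device_type="cpu", dtype=torch.bfloat16):
    y = model(x)

print(y.dtype, tuple(y.shape))
```

With IPEX installed, the model would typically also be passed through `ipex.optimize(model, dtype=torch.bfloat16)` before inference; that step is left out here so the sketch depends only on PyTorch.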

Hyungyo1 commented 7 months ago

Thanks for your response. If it's possible, could you please point out which C++ kernel code implements GEMM on AMX?

jgong5 commented 7 months ago

Please note that you need a 4th-generation Xeon or later to take advantage of AMX. The TPP kernel you refer to invokes micro-kernels in libxsmm, which use AMX on CPUs that have AMX hardware support. See BrgemmTPP at https://github.com/intel/intel-extension-for-pytorch/blob/46c870e83c277c0c29d8f0d3b26c17f62ffbfe1e/csrc/cpu/tpp/kernels/TPPGEMMKrnl.h#L137 and its implementation, which calls into libxsmm, at https://github.com/intel/intel-extension-for-pytorch/blob/46c870e83c277c0c29d8f0d3b26c17f62ffbfe1e/csrc/cpu/tpp/xsmm_functors.h#L1835.
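To check whether a given machine has the AMX hardware support mentioned above, one option on Linux is to look at the CPU feature flags in `/proc/cpuinfo`. The sketch below assumes the kernel reports the flags as `amx_tile`, `amx_bf16`, and `amx_int8`; the helper name `amx_flags_present` is hypothetical and not part of IPEX or libxsmm.

```python
from pathlib import Path

# Flag names as reported by the Linux kernel in /proc/cpuinfo (assumed here).
AMX_FLAGS = ("amx_tile", "amx_bf16", "amx_int8")


def amx_flags_present(cpuinfo_text: str) -> dict:
    """Return which AMX feature flags appear in /proc/cpuinfo-style text."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        # Each logical CPU has its own "flags" line; collect them all.
        if sep and key.strip() == "flags":
            flags.update(value.split())
    return {name: name in flags for name in AMX_FLAGS}


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")
    if cpuinfo.exists():
        print(amx_flags_present(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux.")
```

If all three flags are missing, bfloat16 and int8 workloads would be expected to fall back to AVX-512 or other code paths rather than AMX.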

hezhiqian01 commented 4 days ago

@jgong5 IPEX calls the oneDNN kernels, doesn't it?

jgong5 commented 3 days ago

> @jgong5 IPEX calls the oneDNN kernels, doesn't it?

Not always. We have multiple choices of kernels for GEMMs, some implemented with oneDNN and others implemented with TPP/intrinsics kernels.

NeoZhangJianyu commented 1 day ago

@Hyungyo1 Could you give us feedback? If you have no more questions, we will close this issue.

Hyungyo1 commented 19 hours ago

@NeoZhangJianyu Yes, my question is answered. Thank you.