intel / xFasterTransformer

Apache License 2.0

[Kernel] Make SelfAttention prepared for AMX_FP16; More balanced task split in Cross Attention #466

Closed. pujiang2018 closed this 1 day ago.

abenmao commented 4 days ago

LGTM. Found that the fp16 GEMM kernel is not ready yet, so the results have not been verified.

pujiang2018 commented 1 day ago

> LGTM. Found that the fp16 GEMM kernel is not ready yet, so the results have not been verified.

Need to enable it in the next PR with the new xDNN.
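For context, the "more balanced task split" in the PR title usually means dividing work units (e.g. attention heads or rows) across threads so that no thread gets more than one extra unit. A minimal sketch of such a split, with a hypothetical helper name not taken from the xFasterTransformer code:

```cpp
#include <cassert>
#include <utility>

// Hypothetical helper (illustration only, not actual xFT code):
// split `total` work units across `workers` so each worker handles
// either floor(total/workers) or ceil(total/workers) units.
// Returns the half-open range [begin, end) for the given `rank`.
std::pair<int, int> balancedRange(int total, int workers, int rank) {
    int base = total / workers;   // minimum units per worker
    int rem  = total % workers;   // first `rem` workers take one extra unit
    int begin = rank * base + (rank < rem ? rank : rem);
    int end   = begin + base + (rank < rem ? 1 : 0);
    return {begin, end};
}
```

With 10 units over 4 workers this yields ranges of sizes 3, 3, 2, 2 instead of 3, 3, 3, 1 from a naive ceil-based split, which reduces tail latency on the last worker.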