Closed ikawrakow closed 2 days ago
It is slightly slower than fp16, but definitely a massive improvement compared to not having bf16 support at all. ~Didn't put any effort into optimizing the matrix x vector kernel, so it is likely one can improve bf16 TG performance~.