The matrix multiplication primitives are essentially the same as in BLIS; you can find many performance graphs for BLIS here. BLIS is typically as fast as or faster than OpenBLAS and about 10% slower than MKL. Of course, TBLIS's forte is tensor operations, which MKL does not provide natively; implementing them as tensor transpose + matrix multiplication is much slower than using TBLIS (see here).
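To make the "tensor transpose + matrix multiplication" approach concrete, here is a minimal NumPy sketch of it for one example contraction, C[a,b,c] = sum_k A[a,k,c] * B[b,k]. This is not TBLIS or MKL code; the function name `contract_ttgt`, the chosen contraction, and the array sizes are assumptions made purely for illustration. The explicit copies before and after the matrix product are the extra memory traffic that this approach pays for.

```python
import numpy as np

def contract_ttgt(A, B):
    """Hypothetical helper: C[a,b,c] = sum_k A[a,k,c] * B[b,k]
    computed as transpose -> matrix multiply -> transpose."""
    na, nk, nc = A.shape
    nb, nk_b = B.shape
    assert nk == nk_b, "contracted dimension must match"

    # Transpose A to (a, c, k) and flatten to a (na*nc, nk) matrix.
    # ascontiguousarray forces the physical copy a real implementation would make.
    A_mat = np.ascontiguousarray(A.transpose(0, 2, 1)).reshape(na * nc, nk)

    # Transpose B to (k, b) so the contracted index is the row index.
    B_mat = np.ascontiguousarray(B.T)

    # A single matrix multiplication does all the arithmetic.
    C_mat = A_mat @ B_mat  # shape (na*nc, nb)

    # Unflatten to (a, c, b) and transpose back to the desired (a, b, c) layout.
    return np.ascontiguousarray(C_mat.reshape(na, nc, nb).transpose(0, 2, 1))

rng = np.random.default_rng(0)
A = rng.standard_normal((4, 5, 6))
B = rng.standard_normal((7, 5))

C = contract_ttgt(A, B)
C_ref = np.einsum("akc,bk->abc", A, B)  # direct reference contraction
print(np.allclose(C, C_ref))
```

A native tensor contraction library can avoid materializing the transposed copies of A and C, which is where much of the gap relative to this approach is expected to come from.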