XapaJIaMnu opened this issue 4 years ago
How do you test DNNL performance? Could you share the HW/SW configuration and the DNNL verbose log?
@pengzhao-intel we used this: https://github.com/XapaJIaMnu/gemmBench/
Thanks for the information @XapaJIaMnu
For DNNL, you can use benchdnn from the Intel repo: https://github.com/intel/mkl-dnn/tree/master/tests/benchdnn. The 2nd generation Intel Xeon Scalable processors are the preferred platform for testing INT8, e.g. AWS EC2 c5.18xlarge or c5.24xlarge.
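For reference, below is a minimal sketch of the INT8 matmul that benchdnn exercises, written against the DNNL 1.x C++ API (where the matmul primitive is available). The dimensions and data are placeholders for illustration, not the problem sizes from the benchmark; benchdnn drives the same primitive through its own problem descriptors.

```cpp
// Minimal DNNL 1.x INT8 matmul sketch: u8 activations x s8 weights -> s32.
// Dimensions and data are placeholders, only the API flow is the point.
#include <dnnl.hpp>
#include <vector>

int main() {
    using namespace dnnl;
    engine eng(engine::kind::cpu, 0);
    stream strm(eng);

    const memory::dim M = 8, K = 256, N = 256;
    memory::desc a_md({M, K}, memory::data_type::u8, memory::format_tag::ab);
    memory::desc b_md({K, N}, memory::data_type::s8, memory::format_tag::ab);
    memory::desc c_md({M, N}, memory::data_type::s32, memory::format_tag::ab);

    std::vector<uint8_t> a(M * K, 1);
    std::vector<int8_t> b(K * N, 1);
    std::vector<int32_t> c(M * N, 0);

    memory a_mem(a_md, eng, a.data());
    memory b_mem(b_md, eng, b.data());
    memory c_mem(c_md, eng, c.data());

    matmul::desc md(a_md, b_md, c_md);
    matmul::primitive_desc pd(md, eng);
    matmul(pd).execute(strm, {{DNNL_ARG_SRC, a_mem},
                              {DNNL_ARG_WEIGHTS, b_mem},
                              {DNNL_ARG_DST, c_mem}});
    strm.wait();
    return 0;
}
```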
@pengzhao-intel to give you some background about this particular benchmark: the matrix sizes chosen are the ones that constitute the biggest computational cost for our machine translation models. We also aim to run on a variety of outdated consumer-grade hardware still in use. This is why we have benchmarks on architectures spanning from SSSE3 to VNNI, not just recent Xeons.
Is the losslessness of intgemm's Shifted similar to fbgemm's Packed?
All intgemm operations are packed (though the formats are not necessarily the same). Shifted refers to adding a constant to work around Intel's unsigned * signed instruction.
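To make the Shifted idea concrete, here is a plain C++ sketch of the math, not the intgemm API itself (the function and variable names are illustrative): the signed activation A is shifted by 127 so it fits the unsigned operand of the u8*s8 multiply-add instruction, and the constant contribution 127 * column_sum(B) is subtracted afterwards. Since that correction depends only on the output column, in practice it can be folded into the bias, which is what makes the shift lossless.

```cpp
// Sketch of the "shifted" unsigned*signed trick, plain C++ for clarity.
// Real kernels do the multiply with packed VPMADDUBSW/VPDPBUSD instructions;
// here the point is only the lossless correction term.
#include <cstdint>
#include <vector>

// C = A(s8, MxK) * B(s8, KxN) accumulated in int32, computed as
// (A + 127)(u8) * B(s8) minus 127 * column_sum(B).
void shifted_gemm(const int8_t *A, const int8_t *B, int32_t *C,
                  int M, int K, int N) {
    // Column sums of B, computed once per B (folded into the bias in practice).
    std::vector<int32_t> col_sum(N, 0);
    for (int k = 0; k < K; ++k)
        for (int n = 0; n < N; ++n)
            col_sum[n] += B[k * N + n];

    for (int m = 0; m < M; ++m) {
        for (int n = 0; n < N; ++n) {
            int32_t acc = 0;
            for (int k = 0; k < K; ++k) {
                // Shift A into the unsigned range [0, 254]; intgemm quantizes
                // to [-127, 127], so +127 never overflows a u8.
                uint8_t a_shifted = static_cast<uint8_t>(A[m * K + k] + 127);
                acc += static_cast<int32_t>(a_shifted) * B[k * N + n];
            }
            // Remove the constant 127 * sum_k B[k][n] the shift introduced.
            C[m * N + n] = acc - 127 * col_sum[n];
        }
    }
}
```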
[Benchmark results followed for three targets: SSSE3 (tested on the Mac), AVX2 (tested on my laptop), and AVX512VNNI.]
Marian uses fbgemm Packed, which does unsigned x signed and unquantizes to floats afterwards. We should aim for those numbers. For comparison, use https://github.com/XapaJIaMnu/gemmbench
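For the unquantize-to-floats step, the usual recipe is to scale the int32 accumulator by the inverse product of the two quantization multipliers and add the float bias. A minimal sketch below, with assumed names rather than the fbgemm or intgemm API:

```cpp
// Dequantize an int32 accumulator back to float after an int8 GEMM.
// quant_a and quant_b are the multipliers used to quantize A and B
// (float -> int8), so 1 / (quant_a * quant_b) maps the accumulator back.
// Names are illustrative, not a real library API.
#include <cstdint>

void unquantize_add_bias(const int32_t *C_i32, const float *bias,
                         float *C_f32, int M, int N,
                         float quant_a, float quant_b) {
    const float unquant = 1.0f / (quant_a * quant_b);
    for (int m = 0; m < M; ++m)
        for (int n = 0; n < N; ++n)
            C_f32[m * N + n] = unquant * C_i32[m * N + n] + bias[n];
}
```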