Open zhczhong opened 2 weeks ago
According to the perf test from @yifeizh2, the read lock will bring 10-20% performance regression https://github.com/intel/graph-compiler/blob/bc0014b12bb58b554277ab74c248535fff8c0cec/lib/gc/ExecutionEngine/CPURuntime/Microkernel/BrgemmOnednn.cpp#L145
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40">
According to the perf test from @yifeizh2, the read lock will bring 10-20% performance regression https://github.com/intel/graph-compiler/blob/bc0014b12bb58b554277ab74c248535fff8c0cec/lib/gc/ExecutionEngine/CPURuntime/Microkernel/BrgemmOnednn.cpp#L145
<html xmlns:v="urn:schemas-microsoft-com:vml" xmlns:o="urn:schemas-microsoft-com:office:office" xmlns:x="urn:schemas-microsoft-com:office:excel" xmlns="http://www.w3.org/TR/REC-html40">
data type| bs | hidden_list | without lock | with lock | -- | -- | -- | -- | -- | -- bf16 | 128 | 16x512 | 0.021597 | 0.01967 | 109.79% bf16 | 128 | 512x256 | 0.025172 | 0.029239 | 86.09% bf16 | 128 | 256x128 | 0.015969 | 0.026034 | 61.34% bf16 | 128 | 512x1024 | 0.046416 | 0.055459 | 83.69% bf16 | 128 | 1024x1024 | 0.082655 | 0.089712 | 92.13% bf16 | 128 | 1024x512 | 0.066204 | 0.066523 | 99.52% bf16 | 128 | 512x256 | 0.025992 | 0.029516 | 88.06%