Closed Alston-Tang closed 5 months ago
This pull request was exported from Phabricator. Differential Revision: D52861397
This pull request has been merged in facebookincubator/dynolog@bfdae99f81b71cad8727ead9cd670745da8d42ca.
Summary: The SSE/AVX FLOPs event config in the Linux perf tool differs from the one defined in the hbt perf tool: fp_ret_sse_avx_ops.all uses umask 0x1f in Linux perf, while hbt uses umask 0x0f.
According to the AMD manual:
{F1325336882}
Bit 4 of the umask determines whether a bfloat MAC is counted as 2 ops. It should be set so that the behavior is consistent with the Linux perf tool.
This diff therefore makes Zen3 and Zen4 machines use different events to monitor SSE/AVX FLOPs.
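As a rough illustration of the umask difference described in the summary, the sketch below compares the two values and checks bit 4. The constant and function names are hypothetical and are not taken from dynolog, hbt, or Linux perf; only the values 0x1f and 0x0f and the meaning of bit 4 come from the summary.

```python
# Illustrative sketch only: names are hypothetical; values come from the summary.
LINUX_PERF_UMASK = 0x1F  # umask used by Linux perf for fp_ret_sse_avx_ops.all
HBT_UMASK = 0x0F         # umask previously used by hbt
BFLOAT_MAC_BIT = 1 << 4  # bit 4: count a bfloat MAC as 2 ops (per the summary)


def differing_bits(a: int, b: int) -> list[int]:
    """Return the bit positions (0-7) at which two 8-bit umasks differ."""
    diff = a ^ b
    return [bit for bit in range(8) if (diff >> bit) & 1]


def counts_bfloat_mac_as_two_ops(umask: int) -> bool:
    """Return True if bit 4 is set in the given umask."""
    return bool(umask & BFLOAT_MAC_BIT)


if __name__ == "__main__":
    print("differing bits:", differing_bits(LINUX_PERF_UMASK, HBT_UMASK))
    print("Linux perf sets bit 4:", counts_bfloat_mac_as_two_ops(LINUX_PERF_UMASK))
    print("hbt sets bit 4:", counts_bfloat_mac_as_two_ops(HBT_UMASK))
```

The two umasks differ only in bit 4, which is the bit the summary identifies as controlling how a bfloat MAC is counted.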
Differential Revision: D52861397