facebookincubator / dynolog

Dynolog is a telemetry daemon for performance monitoring and tracing. It exports metrics from different components in the system, such as the Linux kernel, CPU, disks, Intel PT, and GPUs. Dynolog also integrates with PyTorch and can trigger traces for distributed training applications.
MIT License
188 stars 34 forks

revise retired SSE/AVX flops events def for AMD Zen4 #216

Closed Alston-Tang closed 5 months ago

Alston-Tang commented 5 months ago

Summary: The SSE/AVX FLOPs event config in the Linux perf tool differs from the one defined in the hbt perf tool: fp_ret_sse_avx_ops.all uses umask 0x1f in Linux perf, while hbt uses umask 0x0f.

According to the AMD manual:

{F1325336882}

Bit 4 determines whether a bfloat MAC should be counted as 2 ops. This bit should be set to provide consistent behavior.

So this diff makes Zen3 and Zen4 machines use different events to monitor SSE/AVX FLOPs.
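To make the umask difference concrete, here is a minimal Python sketch (not the actual hbt or dynolog code) showing that 0x0f and 0x1f differ only in bit 4. The helper names are hypothetical. The event select value 0x03 and the Zen3/Zen4 umask assignment are assumptions inferred from the summary, not taken from the diff itself.

```python
# Umask values quoted in the summary.
HBT_UMASK = 0x0F         # umask hbt used for the SSE/AVX FLOPs event
LINUX_PERF_UMASK = 0x1F  # umask Linux perf uses for fp_ret_sse_avx_ops.all

# Per the summary: bit 4 controls whether a bfloat MAC counts as 2 ops.
BFLOAT_MAC_BIT = 4

# ASSUMPTION: event select code for the retired SSE/AVX FLOPs event.
ASSUMED_EVENT_CODE = 0x03

# ASSUMPTION: per-microarchitecture choice inferred from the summary
# (Zen3 keeps the old umask, Zen4 sets bit 4).
ASSUMED_UMASK_BY_UARCH = {
    "zen3": HBT_UMASK,
    "zen4": HBT_UMASK | (1 << BFLOAT_MAC_BIT),
}


def differing_bits(a, b):
    """Return the bit positions at which two umasks differ."""
    diff = a ^ b
    return [i for i in range(diff.bit_length()) if (diff >> i) & 1]


def raw_config(event_code, umask):
    """Pack an 8-bit event select and an 8-bit umask as (umask << 8) | event."""
    return (umask << 8) | event_code


if __name__ == "__main__":
    print(differing_bits(HBT_UMASK, LINUX_PERF_UMASK))
    for uarch, umask in ASSUMED_UMASK_BY_UARCH.items():
        print(uarch, hex(raw_config(ASSUMED_EVENT_CODE, umask)))
```

Under these assumptions, the only change needed for Zen4 is to OR bit 4 into the existing umask; the event select stays the same.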

Differential Revision: D52861397

facebook-github-bot commented 5 months ago

This pull request was exported from Phabricator. Differential Revision: D52861397

facebook-github-bot commented 5 months ago

This pull request has been merged in facebookincubator/dynolog@bfdae99f81b71cad8727ead9cd670745da8d42ca.