Open undertherain opened 3 years ago
confirmed on Epyc
python3 -m benchmarker --framework=pytorch --problem=bert_custom --problem_size=32,128 --cnt_units=768 --batch_size=32 --cnt_heads=12 --cnt_layers=12 --mode=inference --flops
"gflop_estimated": 7151.120547840001,
"gflop_measured": 0.000158806,
"len_sequence": 128,
IDX      : 933232654
PMU name : amd64_fam17h_zen2 (AMD64 Fam17h Zen2)
Name     : RETIRED_MMX_FP_INSTRUCTIONS
Equiv    : None
Flags    : None
Desc     : Number of MMX, SSE or x87 instructions retired. The UnitMask allows the selection of the individual classes of instructions as given in the table. Each increment represents one complete instruction. Since this event includes non-numeric instructions, it is not suitable for measuring MFLOPS.
Code     : 0xcb
Umask-00 : 0x04 : PMU : [SSE_INSTR] : None : Number of SSE instructions (SSE, SSE2, SSE3, SSE$, SSE4A, SSE41, SSE42, AVX).
Umask-01 : 0x02 : PMU : [MMX_INSTR] : None : Number of MMX instructions.
Umask-02 : 0x01 : PMU : [X87_INSTR] : None : Number of X87 instructions.
Modif-00 : 0x00 : PMU : [k] : monitor at priv level 0 (boolean)
Did not look at the details yet. The same command line records correct flops on Xeon.
Not sure if this is AMD-specific or specific to my system. It does not crash, it just reports a very low number.
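Since the run finishes without an error, a sanity check on the result could make this kind of failure visible. Below is a minimal sketch, not part of benchmarker: it assumes the result is available as a dict with the `gflop_estimated` and `gflop_measured` keys shown in the output above. The function name `flops_look_plausible` and the `min_ratio` threshold are hypothetical and chosen only for illustration.

```python
def flops_look_plausible(result, min_ratio=0.01):
    """Return True if the measured GFLOP count is at least
    min_ratio times the estimated one.

    Hypothetical helper: the 1% default threshold is an arbitrary
    assumption, not a value taken from benchmarker.
    """
    estimated = result["gflop_estimated"]
    measured = result["gflop_measured"]
    if estimated <= 0:
        raise ValueError("gflop_estimated must be positive")
    return measured / estimated >= min_ratio


# Values copied from the Epyc run reported above.
epyc_result = {
    "gflop_estimated": 7151.120547840001,
    "gflop_measured": 0.000158806,
}

if not flops_look_plausible(epyc_result):
    print("measured flops are implausibly low; the counter event may be wrong")
```

With the numbers from this report the check fails, because the measured value is many orders of magnitude below the estimate.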