Dr-Noob / cpufetch

Simple yet fancy CPU architecture fetching tool
GNU General Public License v2.0

wrong TFLOP/s with --accurate-pp in qemu user-mode emulation #293

Open lilyanatia opened 3 days ago

lilyanatia commented 3 days ago

correct output:

$ cpufetch --accurate-pp

                                                              Name:                Intel Xeon E5-4657L v2
                                                              Microarchitecture:   Ivy Bridge
                                                              Technology:          22nm
                                                              Max Frequency:       2.900 GHz
                                                              Sockets:             4
                                                              Cores:               12 cores (24 threads)
                                                              Cores (Total):       48 cores (96 threads)
                                                              AVX:                 AVX
                                                              FMA:                 No
                                                              L1i Size:            32KB (1.5MB Total)
                                                              L1d Size:            32KB (1.5MB Total)
                                                              L2 Size:             256KB (12MB Total)
                                                              L3 Size:             30MB (120MB Total)
                                                              Peak Performance:    1.04 TFLOP/s

running in qemu-x86_64 results in 4x the TFLOP/s:

$ qemu-x86_64 /usr/bin/cpufetch --accurate-pp

                                                              Name:                QEMU TCG CPU version 2.5+
                                                              Hypervisor:          QEMU
                                                              Microarchitecture:   K7
                                                              Technology:          Unknown
                                                              Max Frequency:       2.900 GHz
                                                              Sockets:             9
                                                              Cores:               1 core
                                                              Cores (Total):       96 cores
                                                              AVX:                 AVX,AVX2
                                                              FMA:                 FMA3
                                                              L1i Size:            64KB
                                                              L1d Size:            64KB
                                                              L2 Size:             512KB
                                                              L3 Size:             16MB
                                                              Peak Performance:    4.15 TFLOP/s

I'm not sure if it's possible to fix --accurate-pp under emulation other than actually running a benchmark, but perhaps a warning that "Peak Performance", even with --accurate-pp, might be completely wrong when a hypervisor is detected would be a good idea.

Dr-Noob commented 3 days ago

Hey, thanks for the report.

I disagree with your interpretation. It's not that --accurate-pp is "wrong". If you compare the output with and without QEMU, you'll see that almost everything (except the max frequency) is "wrong". And it's not really wrong; it's simply QEMU tampering with the values (VMs can report whatever values they want).

In this case, it's the microarchitecture and the number of sockets that make cpufetch think the peak performance is higher. cpufetch can do nothing about that, except perhaps warn the user about these effects when running under a VM. I would expect the user to be aware of this, though, which is why no warning is displayed right now.
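
For reference, both reported figures can be reproduced with a simple theoretical-peak formula. The sketch below is not taken from cpufetch's source. It assumes peak = cores × frequency × vector units × single-precision lanes × 2 if FMA is present, with one vector unit per core, 8 lanes for AVX/AVX2, and an effective frequency of 2.7 GHz under vector load. That frequency is an assumption, chosen only because it matches both outputs above; `peak_tflops` and its parameters are hypothetical names.

```python
def peak_tflops(cores, freq_ghz, simd_lanes, fma, vpus=1):
    """Theoretical single-precision peak in TFLOP/s (illustrative formula, not cpufetch's code)."""
    # FLOPs per cycle per core: vector units x lanes, doubled when FMA is available
    flops_per_cycle = vpus * simd_lanes * (2 if fma else 1)
    # cores x GHz x FLOPs/cycle gives GFLOP/s; divide by 1000 for TFLOP/s
    return cores * freq_ghz * flops_per_cycle / 1000.0


# Assumed effective frequency under vector load; not stated anywhere in the issue.
ASSUMED_FREQ_GHZ = 2.7

# Native run: 48 cores, AVX (8 float lanes), no FMA
native = peak_tflops(cores=48, freq_ghz=ASSUMED_FREQ_GHZ, simd_lanes=8, fma=False)

# qemu-x86_64 run: 96 "cores" reported, AVX2 (8 float lanes), FMA3 advertised
emulated = peak_tflops(cores=96, freq_ghz=ASSUMED_FREQ_GHZ, simd_lanes=8, fma=True)

print(f"native:   {native:.2f} TFLOP/s")
print(f"emulated: {emulated:.2f} TFLOP/s")
print(f"ratio:    {emulated / native:.1f}x")
```

Under these assumptions, the 4x gap would be the product of two factors of 2: the reported total core count doubles (96 instead of 48), and the emulated CPU advertises FMA, which the native one lacks.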