Open matthiaskrgr opened 4 years ago
Assembly doesn't look suspicious: https://godbolt.org/z/BTxUDA
Right now I have a Windows machine with a totally different CPU nearby, and the slowdown doesn't reproduce on it:
$ RUSTFLAGS="" cargo bench
Compiling bench v0.1.0 (D:\msys64\home\mateusz\bench)
Finished bench [optimized] target(s) in 2.44s
Running target\release\deps\bench-2ddd0f70613232dd.exe
running 4 tests
test ends_with_char ... bench: 288 ns/iter (+/- 16)
test ends_with_str ... bench: 487 ns/iter (+/- 9)
test starts_with_char ... bench: 285 ns/iter (+/- 3)
test starts_with_str ... bench: 289 ns/iter (+/- 15)
test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out
$ RUSTFLAGS="-Ctarget-cpu=skylake" cargo bench
Compiling bench v0.1.0 (D:\msys64\home\mateusz\bench)
Finished bench [optimized] target(s) in 0.86s
Running target\release\deps\bench-2ddd0f70613232dd.exe
running 4 tests
test ends_with_char ... bench: 288 ns/iter (+/- 9)
test ends_with_str ... bench: 413 ns/iter (+/- 8)
test starts_with_char ... bench: 287 ns/iter (+/- 18)
test starts_with_str ... bench: 287 ns/iter (+/- 2)
test result: ok. 0 passed; 0 failed; 0 ignored; 4 measured; 0 filtered out
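To put the `ends_with_str` numbers side by side, here is a small sketch that turns two ns/iter figures into a ratio. The helper name `slowdown_factor` is made up for illustration; the inputs are the 487 and 413 ns/iter from the two runs above and the 539 and 1,033 ns/iter reported in the issue.

```rust
/// Ratio of a candidate timing to a baseline timing.
/// Values above 1.0 mean the candidate run was slower.
/// (Hypothetical helper, not part of the benchmark crate.)
fn slowdown_factor(baseline_ns: f64, candidate_ns: f64) -> f64 {
    candidate_ns / baseline_ns
}

fn main() {
    // ends_with_str on the Windows machine: default flags vs. -Ctarget-cpu=skylake.
    let windows = slowdown_factor(487.0, 413.0);
    // ends_with_str as reported in the issue: 539 -> 1,033 ns/iter.
    let reported = slowdown_factor(539.0, 1033.0);

    println!("windows run: {:.2}x", windows); // prints "windows run: 0.85x"
    println!("reported:    {:.2}x", reported); // prints "reported:    1.92x"
}
```

So the reported case is roughly a 1.9x slowdown, while on this machine the same flag makes `ends_with_str` somewhat faster.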
cpuinfo:
processor : 15
vendor_id : AuthenticAMD
cpu family : 23
model : 8
model name : AMD Ryzen 7 2700X Eight-Core Processor
stepping : 2
cpu MHz : 3700.000
cache size : 16384 KB
physical id : 0
siblings : 16
core id : 7
cpu cores : 8
apicid : 15
initial apicid : 15
fpu : yes
fpu_exception : yes
cpuid level : 13
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 clflush mmx fxsr sse sse2 ht syscall nx mmxext fxsr_opt pdpe1gb rdtscp lm pni cx16 lahf_lm cmp_legacy svm extapic cr8_legacy abm sse4a misalignsse 3dnowprefetch osvw skinit wdt tce topoext perfctr_core perfctr_nb perfctr_l2
clflush size : 64
cache_alignment : 64
address sizes : 48 bits physical, 48 bits virtual
power management: ts ttp tm hwpstate cpb eff_freq_ro
rustc -vV:
rustc 1.43.0-nightly (6fd8798f4 2020-02-25)
binary: rustc
commit-hash: 6fd8798f4de63328d743eb2a9a9c12e202a4a182
commit-date: 2020-02-25
host: x86_64-pc-windows-gnu
release: 1.43.0-nightly
LLVM version: 9.0
With -Ctarget-cpu=skylake there are AVX instructions generated. Maybe the CPU downclocks a lot since it's a low-TDP CPU? Do you have other machines to test it on?
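When comparing machines, it may help to check what the CPU reports at runtime against what the binary was compiled to assume. Below is a minimal sketch using the standard `is_x86_feature_detected!` macro and `cfg!(target_feature = ...)`. The function name `detected_simd_features` and the choice of features to probe are just for illustration, and this says nothing about downclocking by itself.

```rust
/// Names of a few SIMD features the current CPU reports at runtime.
/// Returns an empty list on non-x86 targets.
/// (Hypothetical helper; the probed feature list is an arbitrary choice.)
#[allow(unused_mut)]
fn detected_simd_features() -> Vec<&'static str> {
    let mut found = Vec::new();
    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
    {
        if std::is_x86_feature_detected!("sse4.2") {
            found.push("sse4.2");
        }
        if std::is_x86_feature_detected!("avx") {
            found.push("avx");
        }
        if std::is_x86_feature_detected!("avx2") {
            found.push("avx2");
        }
    }
    found
}

fn main() {
    // What the CPU running this binary actually supports.
    println!("runtime-detected: {:?}", detected_simd_features());
    // What the compiler was allowed to assume (affected by -Ctarget-cpu).
    println!("compiled with avx2: {}", cfg!(target_feature = "avx2"));
}
```

Running this once with `RUSTFLAGS=""` and once with `RUSTFLAGS="-Ctarget-cpu=skylake"` should show the second line change while the first stays the same on a given machine.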
I was running these benchmarks:
It turned out that I got a major performance drop with -Ctarget-cpu=native (-Ctarget-cpu=skylake in my case).
RUSTFLAGS="-Ctarget-cpu=skylake" cargo bench
RUSTFLAGS="" cargo bench
By generating code "optimized" for my machine, perf dropped from 539 ns/iter to 1,033 ns/iter for ends_with_str :(
Meta
cpu info:
rustc --version --verbose: