SashaXser closed this 5 months ago
I'm not sure how these work exactly. Would Haswell mean it's specifically for Haswell, or just Haswell+? Also is there anything like this for the other variants?
You can pick something from this list:
native
alderlake
amdfam10
athlon
athlon-4
athlon-fx
athlon-mp
athlon-tbird
athlon-xp
athlon64
athlon64-sse3
atom
atom_sse4_2
atom_sse4_2_movbe
barcelona
bdver1
bdver2
bdver3
bdver4
bonnell
broadwell
btver1
btver2
c3
c3-2
cannonlake
cascadelake
cooperlake
core-avx-i
core-avx2
core2
core_2_duo_sse4_1
core_2_duo_ssse3
core_2nd_gen_avx
core_3rd_gen_avx
core_4th_gen_avx
core_4th_gen_avx_tsx
core_5th_gen_avx
core_5th_gen_avx_tsx
core_aes_pclmulqdq
core_i7_sse4_2
corei7
corei7-avx
emeraldrapids
generic
geode
goldmont
goldmont-plus
goldmont_plus
grandridge
graniterapids
graniterapids-d
graniterapids_d
haswell
i386
i486
i586
i686
icelake-client
icelake-server
icelake_client
icelake_server
ivybridge
k6
k6-2
k6-3
k8
k8-sse3
knl
knm
lakemont
meteorlake
mic_avx512
nehalem
nocona
opteron
opteron-sse3
penryn
pentium
pentium-m
pentium-mmx
pentium2
pentium3
pentium3m
pentium4
pentium4m
pentium_4
pentium_4_sse3
pentium_ii
pentium_iii
pentium_iii_no_xmm_regs
pentium_m
pentium_mmx
pentium_pro
pentiumpro
prescott
raptorlake
rocketlake
sandybridge
sapphirerapids
sierraforest
silvermont
skx
skylake
skylake-avx512
skylake_avx512
slm
tigerlake
tremont
westmere
winchip-c6
winchip2
x86-64 - This is the default target CPU for the current build target (currently x86_64-unknown-linux-gnu).
x86-64-v2
x86-64-v3
x86-64-v4
yonah
znver1
znver2
znver3
znver4
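On the question of whether a name like haswell means only Haswell: as far as I understand, a target CPU name selects that CPU's instruction set extensions, so the resulting binary should run on any CPU that implements the same extensions, not only on that exact model. The vendor-neutral x86-64-v2/v3/v4 entries are defined purely by feature sets: x86-64-v2 covers up to SSE4.2, and AVX only enters at x86-64-v3 together with AVX2. Below is a minimal sketch for checking which level a Linux machine reaches. The function names are hypothetical, the flag names are the ones Linux prints in /proc/cpuinfo (pni for SSE3, abm for LZCNT, xsave as a stand-in for OSXSAVE), and the baseline x86-64 features are assumed rather than checked.

```shell
#!/bin/sh
# Sketch: report the highest x86-64 microarchitecture level whose required
# CPU flags all appear in a space-separated flag list.

# has_all FLAGS REQUIRED... : succeed only if every REQUIRED flag is in FLAGS.
has_all() {
    haystack=" $1 "
    shift
    for flag in "$@"; do
        case "$haystack" in
            *" $flag "*) ;;      # flag present, keep checking
            *) return 1 ;;       # flag missing
        esac
    done
    return 0
}

# x86_64_level FLAGS : print x86-64, x86-64-v2, x86-64-v3 or x86-64-v4.
x86_64_level() {
    level="x86-64"
    has_all "$1" cx16 lahf_lm popcnt pni sse4_1 sse4_2 ssse3 || { echo "$level"; return; }
    level="x86-64-v2"
    has_all "$1" avx avx2 bmi1 bmi2 f16c fma abm movbe xsave || { echo "$level"; return; }
    level="x86-64-v3"
    has_all "$1" avx512f avx512bw avx512cd avx512dq avx512vl || { echo "$level"; return; }
    echo "x86-64-v4"
}

# Usage on Linux: classify the first logical CPU, if its flags can be read.
if [ -r /proc/cpuinfo ]; then
    cpu_flags=$(grep -m1 '^flags' /proc/cpuinfo | cut -d: -f2)
    if [ -n "$cpu_flags" ]; then
        x86_64_level "$cpu_flags"
    fi
fi
```

A CPU with AVX but without AVX2 (Sandy Bridge, for example) comes out as x86-64-v2 here, which is why an AVX-only build needs a named CPU such as sandybridge rather than one of the generic levels.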
@Alex313031 Since you know everything about this, I'm letting you determine if it goes through.
I won't be building AVX2 until I get this in a non-beta state, so I'm not going to rush this.
What build are you using? For AVX, x86-64-v2 or sandybridge may be suitable
The #s mean those lines are commented out. The one that isn't commented out is SSE3. Only very early Athlon 64 CPUs, which can barely handle Windows 7 let alone a modern browser, lack SSE3, so it's basically the same as building for SSE2.
Can you try compiling with these RUSTFLAGS? I'm curious to see what happens :/
export RUSTFLAGS="-C target-cpu=athlon64-sse3 -C target-feature=+sse3 -C codegen-units=1 -Z tune-cpu=athlon64-sse3"
Was going to pass, but then I tested the new build and noticed it's a bit fucked and I need to recompile, so I might as well
I get an error saying the -Z option is only accepted on the nightly compiler. So I ran rustup default nightly and I still got the error.
So I had to obtain the 2024-02-04 nightly Rust and rename it to stable, and that seems to be working.
Edit: rustup override seems to also work, and is less janky.
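The flags and the nightly workaround above could be combined roughly as follows. This is only a sketch, not the project's actual mozconfig: rustflags_for is a hypothetical helper, and the extra -C target-feature=+sse3 from the original command is left out on the assumption that the athlon64-sse3 CPU name already implies SSE3.

```shell
# rustflags_for CPU : print RUSTFLAGS that both target and tune for CPU.
# (hypothetical helper, not part of the project's mozconfig)
rustflags_for() {
    printf '%s\n' "-C target-cpu=$1 -C codegen-units=1 -Z tune-cpu=$1"
}

# -Z options are rejected by stable rustc, so a nightly toolchain has to be
# active for the build directory first, for example (run once, not executed here):
#   rustup override set nightly

RUSTFLAGS=$(rustflags_for athlon64-sse3)
export RUSTFLAGS
echo "$RUSTFLAGS"
```

Swapping the argument (for example haswell for an AVX2 build) keeps target-cpu and tune-cpu in step, so the two cannot drift apart between variants.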
So I pushed a new mozconfig with these CPU configs, and I picked what seemed the best for SSE2 and SSE4. Lemme know if I should change any.
Also if you want some sort of credit, I can add your name to the mozconfig file.
I was about to start building all of these, but I figured I might as well try testing Mercury 115 ESR SSE3, SSE4, AVX, and AVX2 on hardware. Very similar build config, so should be comparable.
I found that SSE4 is actually the winner, but I think the gain over SSE3 is negligible. Considering there was no difference in other testing before, I might just go back to SSE2 for the main build.
JetStream was consistently 76 and Speedometer was consistently 5.3. SSE3 got 350 in MotionMark, SSE4 353, AVX 340, and AVX2 335.
If there's a better way to test please let me know
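To put the MotionMark numbers above in proportion, here is a small sketch that expresses each score as a percent difference from the SSE3 result. The helper name percent_vs_baseline is made up for illustration, and taking SSE3 as the baseline is my assumption.

```shell
# MotionMark scores quoted above, as "variant score" pairs.
scores="SSE3 350
SSE4 353
AVX 340
AVX2 335"

# percent_vs_baseline SCORE BASELINE : signed percent difference, 2 decimals.
percent_vs_baseline() {
    awk -v s="$1" -v b="$2" 'BEGIN { printf "%+.2f\n", (s - b) / b * 100 }'
}

# Print one line per variant: name, raw score, difference from SSE3 (350).
echo "$scores" | while read -r variant score; do
    printf '%-5s %s  %s%%\n' "$variant" "$score" "$(percent_vs_baseline "$score" 350)"
done
```

That works out to roughly +0.9% for SSE4, -2.9% for AVX and -4.3% for AVX2. The figures above come without any run-to-run variation, so they cannot show whether gaps of this size are real; repeating each run a few times and comparing the spread would be one way to find out.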
I'll figure something out now, but the Rust optimization won't affect JS much, it's more geared towards WebRender.
As for credit in the mozconfig file: at your discretion.
PGO + BOLT give a nice performance boost, but I haven't found a complete implementation guide.
This uses PGO, or should. Not sure what BOLT is.
BOLT (Binary Optimization and Layout Tool), by Facebook.
I rebased and this patch is kinda fucked now, so closing. Since I basically manually applied it anyways doesn't really matter.
Also opened https://github.com/Eclipse-Community/r3dfox/issues/43 for more suggestions.
Well, I've lost interest in offering you anything anyway.
How come?
?
Added the tune-cpu option as well as target-cpu