Eclipse-Community / r3dfox

r3dfox is a modern Firefox-based web browser for Windows Vista & 7. SourceForge link for downloading with older browsers: https://sourceforge.net/projects/r3dfox/
https://eclipse.cx/projects/r3dfox

more rustflags opts #42

Closed SashaXser closed 5 months ago

SashaXser commented 6 months ago

Added the tune-cpu option as well as target-cpu

K4sum1 commented 6 months ago

I'm not sure how these work exactly. Would Haswell mean it's specifically for Haswell, or just Haswell+? Also is there anything like this for the other variants?

SashaXser commented 6 months ago

> I'm not sure how these work exactly. Would Haswell mean it's specifically for Haswell, or just Haswell+? Also is there anything like this for the other variants?

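Haswell is roughly x86-64-v3, so a haswell build should run on Haswell and on anything newer with the same features (Haswell+).

You can pick something from this list: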

    native                 
    alderlake
    amdfam10
    athlon
    athlon-4
    athlon-fx
    athlon-mp
    athlon-tbird
    athlon-xp
    athlon64
    athlon64-sse3
    atom
    atom_sse4_2
    atom_sse4_2_movbe
    barcelona
    bdver1
    bdver2
    bdver3
    bdver4
    bonnell
    broadwell
    btver1
    btver2
    c3
    c3-2
    cannonlake
    cascadelake
    cooperlake
    core-avx-i
    core-avx2
    core2
    core_2_duo_sse4_1
    core_2_duo_ssse3
    core_2nd_gen_avx
    core_3rd_gen_avx
    core_4th_gen_avx
    core_4th_gen_avx_tsx
    core_5th_gen_avx
    core_5th_gen_avx_tsx
    core_aes_pclmulqdq
    core_i7_sse4_2
    corei7
    corei7-avx
    emeraldrapids
    generic
    geode
    goldmont
    goldmont-plus
    goldmont_plus
    grandridge
    graniterapids
    graniterapids-d
    graniterapids_d
    haswell
    i386
    i486
    i586
    i686
    icelake-client
    icelake-server
    icelake_client
    icelake_server
    ivybridge
    k6
    k6-2
    k6-3
    k8
    k8-sse3
    knl
    knm
    lakemont
    meteorlake
    mic_avx512
    nehalem
    nocona
    opteron
    opteron-sse3
    penryn
    pentium
    pentium-m
    pentium-mmx
    pentium2
    pentium3
    pentium3m
    pentium4
    pentium4m
    pentium_4
    pentium_4_sse3
    pentium_ii
    pentium_iii
    pentium_iii_no_xmm_regs
    pentium_m
    pentium_mmx
    pentium_pro
    pentiumpro
    prescott
    raptorlake
    rocketlake
    sandybridge
    sapphirerapids
    sierraforest
    silvermont
    skx
    skylake
    skylake-avx512
    skylake_avx512
    slm
    tigerlake
    tremont
    westmere
    winchip-c6
    winchip2
    x86-64                  - This is the default target CPU for the current build target (currently x86_64-unknown-linux-gnu).
    x86-64-v2
    x86-64-v3
    x86-64-v4
    yonah
    znver1
    znver2
    znver3
    znver4
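
To see which instruction sets a name from this list turns on, rustc --print cfg can be combined with -C target-cpu. The sketch below assumes a POSIX shell with rustc on PATH; features_from_cfg is a hypothetical helper name, not something from the mozconfig. The compiler may use any feature it reports, so the result should run on that CPU and on newer ones that keep the same features.

```shell
# Hypothetical helper: reduce `rustc --print cfg` output (read from stdin)
# to a sorted list of the enabled target features, one per line.
features_from_cfg() {
    sed -n 's/^target_feature="\(.*\)"$/\1/p' | sort
}

# Usage (needs rustc on PATH):
#   rustc --print target-cpus                                   # list the valid names
#   rustc --print cfg -C target-cpu=haswell   | features_from_cfg
#   rustc --print cfg -C target-cpu=x86-64-v3 | features_from_cfg
```

Comparing the two outputs shows where a named CPU and a generic level such as x86-64-v3 differ.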

K4sum1 commented 6 months ago

@Alex313031 Since you know everything about this, I'm letting you determine if it goes through.

I won't be building AVX2 until I get this in a non-beta state, so I'm not going to rush this.

SashaXser commented 6 months ago

What build are you using? For AVX, sandybridge may be suitable (x86-64-v2 stops at SSE4.2).

K4sum1 commented 6 months ago

> What build are you using? For AVX, sandybridge may be suitable (x86-64-v2 stops at SSE4.2).

The #'s mean it's commented out. The one not commented out is SSE3. Only very early Athlon 64 CPUs, which barely handle Windows 7 let alone a modern browser, lack SSE3, so it's basically the same as building for SSE2.

SashaXser commented 6 months ago

> > What build are you using? For AVX, sandybridge may be suitable (x86-64-v2 stops at SSE4.2).
>
> The #'s mean it's commented out. The one not commented out is SSE3. Only very early Athlon 64 CPUs, which barely handle Windows 7 let alone a modern browser, lack SSE3, so it's basically the same as building for SSE2.

Can you try compiling with these RUSTFLAGS? I'm curious to see what happens :/

    export RUSTFLAGS="-C target-cpu=athlon64-sse3 -C target-feature=+sse3 -C codegen-units=1 -Z tune-cpu=athlon64-sse3"

K4sum1 commented 6 months ago

> Can you try compiling with these RUSTFLAGS? I'm curious to see what happens :/

Was going to pass, but then I tested the new build and noticed it's a bit fucked and I need to recompile, so I might as well

K4sum1 commented 6 months ago

I get an error saying the -Z option is only accepted on the nightly compiler. I ran rustup default nightly and I still get the error.

K4sum1 commented 6 months ago

So I had to obtain the 2024-02-04 nightly Rust, rename it to stable, and it seems to be working.

Edit: rustup override seems to also work, and is less jank.
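
For reference, a minimal sketch of keeping the nightly-only flag away from a stable compiler, assuming a POSIX shell. The helper name rustflags_for is hypothetical and not part of the mozconfig. It appends -Z tune-cpu only when the version string it is given looks like a nightly toolchain.

```shell
# Hypothetical helper: print RUSTFLAGS for a CPU name and a rustc version string.
# -Z options are only accepted by nightly compilers, so tune-cpu is added
# only when the version string contains "nightly".
rustflags_for() {
    cpu="$1"
    version="$2"
    flags="-C target-cpu=${cpu} -C codegen-units=1"
    case "$version" in
        *nightly*) flags="${flags} -Z tune-cpu=${cpu}" ;;
    esac
    printf '%s\n' "$flags"
}

# Usage: pin a nightly for the current directory only, then export the flags.
#   rustup override set nightly-2024-02-04
#   export RUSTFLAGS="$(rustflags_for athlon64-sse3 "$(rustc --version)")"
```

The override only applies to the directory it is set in, so the default toolchain elsewhere stays untouched.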

K4sum1 commented 6 months ago

So I pushed a new mozconfig with these CPU configs, and I picked what seemed the best for SSE2 and SSE4. Lemme know if I should change any.

Also if you want some sort of credit, I can add your name to the mozconfig file.

K4sum1 commented 6 months ago

I was about to start building all of these, but I figured I might as well try testing Mercury 115 ESR SSE3, SSE4, AVX, and AVX2 on hardware. Very similar build config, so should be comparable.

I found that SSE4 is actually the winner, but I think the difference from SSE3 is negligible. Considering there was no difference in other testing before, I might just go back to SSE2 for the main build.

JetStream was consistently 76 and Speedometer was consistently 5.3. SSE3 got 350 in MotionMark, SSE4 353, AVX 340, and AVX2 335.

If there's a better way to test please let me know
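
For a rough comparison, the MotionMark numbers above can be expressed relative to the SSE3 score. This is only a sketch assuming a POSIX shell with awk; relative_scores is a hypothetical helper name, and the inputs are the single-run scores quoted above, so differences of a few percent may well be run-to-run noise.

```shell
# Hypothetical helper: print each NAME=SCORE pair as a percentage
# difference from the baseline score given as the first argument.
relative_scores() {
    baseline="$1"
    shift
    for pair in "$@"; do
        name="${pair%%=*}"    # text before the first '='
        score="${pair#*=}"    # text after the first '='
        awk -v n="$name" -v s="$score" -v b="$baseline" \
            'BEGIN { printf "%s %+.1f%%\n", n, (s - b) / b * 100 }'
    done
}

relative_scores 350 SSE3=350 SSE4=353 AVX=340 AVX2=335
# SSE3 +0.0%
# SSE4 +0.9%
# AVX -2.9%
# AVX2 -4.3%
```

Repeating each run several times and comparing averages would show whether a spread this small is real.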

SashaXser commented 6 months ago

> I was about to start building all of these, but I figured I might as well try testing Mercury 115 ESR SSE3, SSE4, AVX, and AVX2 on hardware. Very similar build config, so should be comparable.
>
> I found that SSE4 is actually the winner, but I think the difference from SSE3 is negligible. Considering there was no difference in other testing before, I might just go back to SSE2 for the main build.
>
> JetStream was consistently 76 and Speedometer was consistently 5.3. SSE3 got 350 in MotionMark, SSE4 353, AVX 340, and AVX2 335.
>
> If there's a better way to test please let me know

I'll figure something out now, but the Rust optimization won't affect JS much; it's more geared towards WebRender.

> So I pushed a new mozconfig with these CPU configs, and I picked what seemed the best for SSE2 and SSE4. Lemme know if I should change any.
>
> Also if you want some sort of credit, I can add your name to the mozconfig file.

At your discretion.

SashaXser commented 5 months ago

PGO + BOLT give a nice performance boost, but I haven't found a complete implementation guide.

K4sum1 commented 5 months ago

This uses PGO, or should. Not sure what BOLT is.

jonm58 commented 5 months ago

> This uses PGO, or should. Not sure what BOLT is.

BOLT (Binary Optimization and Layout Tool), originally by Facebook and now part of the LLVM project.

K4sum1 commented 5 months ago

I rebased and this patch is kinda fucked now, so closing. Since I basically applied it manually anyway, it doesn't really matter.

Also opened https://github.com/Eclipse-Community/r3dfox/issues/43 for more suggestions.

SashaXser commented 5 months ago

> I rebased and this patch is kinda fucked now, so closing. Since I basically applied it manually anyway, it doesn't really matter.
>
> Also opened #43 for more suggestions.

Well, I've lost interest in offering you anything anyway.

K4sum1 commented 5 months ago

> Well, I've lost interest in offering you anything anyway.

How come?

jonm58 commented 5 months ago

> > I rebased and this patch is kinda fucked now, so closing. Since I basically applied it manually anyway, it doesn't really matter. Also opened #43 for more suggestions.
>
> Well, I've lost interest in offering you anything anyway.

?