Alex313031 / Thorium-Win

Chromium fork for Windows named after radioactive element No. 90; Windows builds of https://github.com/Alex313031/Thorium
https://thorium.rocks/
BSD 3-Clause "New" or "Revised" License

Compiler opt #57

Closed icls1337 closed 2 months ago

icls1337 commented 1 year ago

https://en.wikipedia.org/wiki/Intel_C%2B%2B_Compiler Have you tried compiling Thorium with it? There are tons of tests stating that this compiler is on average 30%+ faster than GCC, Clang, MSVC...

https://community.intel.com/legacyfs/online/drupal_files/article/146679/linuxkernelbuildwhitepaper.pdf Red Flag* Software Co., Ltd started to use the Intel C++ Compiler for Linux to compile the Linux kernel in its commercial version of Linux operating system in 2004.

gz83 commented 1 year ago

As far as I know, Chromium doesn't support compiling with this compiler, and our current release is compiled with clang.

@icls1337

RobRich999 commented 1 year ago

ICC Classic is proprietary and outdated at this point. Note the Intel whitepaper is from 2008. ;)

The modern Intel DPC++ compiler is built upon Clang/LLVM. It targets SYCL for hardware accelerator offload, which is not something Chromium needs. Think ML, HPC, and similar workloads.

Intel does upstream its compiler development into LLVM as applicable and appropriate. AMD does similar with AOCC.

Chromium explicitly targets LLVM, and usually recent LLVM builds at that. Many of these LLVM derivatives are a major LLVM release version or more behind.


BTW, Chromium for Linux (usually) can be built with GCC, but do not expect PGO, LTO, etc. to work... assuming one gets a current Chromium ToT dev build to actually compile with GCC 13.x right now anyway. It is not officially supported anymore, so it is a community-supported effort at best.

icls1337 commented 1 year ago

But can try icx?

RobRich999 commented 1 year ago

Intel DPC++ = Intel ICX = LLVM derivative

We already use LLVM, so in theory Intel DPC++ could build Chromium as long as the build revision is synced to at least whatever LLVM build revision Chromium devs are using at the time.

Intel DPC++ is provided for devs needing to keep pace with the absolute latest in SYCL and/or OpenMP development. You can view Intel's own downstream commit log here:

https://github.com/intel/llvm/commits/sycl

Otherwise, the codebase is LLVM. Chromium does not use SYCL or OpenMP, so there would be no particular benefit to building Chromium with Intel DPC++ versus just using LLVM in the first place.

icls1337 commented 1 year ago

https://godbolt.org/z/8zM9684Ef The truth is that icx is usually faster than clang; on my machine this code is about 200% faster.

Here are more benchmarks: https://www.intel.com/content/www/us/en/developer/tools/oneapi/dpc-compiler.html#tab-blade-1-0

gz83 commented 1 year ago

The chromium browser does not support the use of icx for compilation. Currently, clang is the main compiler.

@icls1337

Alex313031 commented 1 year ago

@icls1337 @gz83 @RobRich999 Yeah, trying to use icx to compile even simple targets (not the full browser), for example content_shell and chrome_sandbox, leads to errors.

icls1337 commented 1 year ago

@gz83 @Alex313031 @RobRich999 Looking forward to using it in the future, as they seem to have modified the back end a lot. https://godbolt.org/z/vr7qP6zGd You can choose the icx version and see the different optimizations.

RobRich999 commented 1 year ago

@Alex313031 @gz83 @icls1337 Microbenchmarking through Compiler Explorer is YMMV, but on average Clang 10 is showing faster than the current ICX-latest and Clang-trunk builds for the example code as of my copy-and-paste:

https://godbolt.org/z/j3EhYvePv

Looks like the code example might have changed somewhere along the way, but whatever.

That is a potential LLVM regression. Looks like it happened between Clang 10.0.0 and 10.0.1. File a bug report with LLVM if interested.
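Since single microbenchmark runs are noisy, here is a minimal sketch of repeating a measurement and looking at the spread rather than one number. It is written in Python purely for illustration; `measure` and `workload` are hypothetical names, and the workload is a stand-in, not the code from the Compiler Explorer links.

```python
import statistics
import timeit


def workload():
    # Stand-in workload; replace with the code under test.
    return sum(i * i for i in range(200))


def measure(func, repeats=7, number=1000):
    """Time `func` several times and return (best, median, worst) seconds per call."""
    samples = timeit.repeat(func, repeat=repeats, number=number)
    per_call = [s / number for s in samples]
    return min(per_call), statistics.median(per_call), max(per_call)


best, median, worst = measure(workload)
print(f"best={best:.3e}s median={median:.3e}s worst={worst:.3e}s")
```

If the gap between best and worst is comparable to the difference between two compilers, the comparison is probably within the noise.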


If trying to build Chromium, note ICX defaults to -fp-model=fast. That might not be a good idea with the Chromium codebase. You should go into the situation expecting crashes and potentially having to instead set -fp-model=precise.

Expect likewise with -Ofast, since it is basically just -O3 -ffast-math, which has often caused crashes in my experience over the years.
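One reason fast floating-point modes can change program behavior is that they let the compiler regroup floating-point operations, and floating-point addition is not associative. A minimal sketch in Python (used only for illustration; the function names are hypothetical and the two-lane split merely imitates what a vectorized loop might do):

```python
def sum_in_order(values):
    """Add strictly left to right, as a precise floating-point model requires."""
    total = 0.0
    for v in values:
        total += v
    return total


def sum_two_lanes(values):
    """Add even- and odd-indexed elements separately, then combine.

    This imitates a regrouping that a fast-math mode is allowed to perform.
    """
    lanes = [0.0, 0.0]
    for i, v in enumerate(values):
        lanes[i % 2] += v
    return lanes[0] + lanes[1]


values = [1e16, 1.0, -1e16, 1.0]
print(sum_in_order(values))   # 1.0: the first 1.0 is lost to rounding against 1e16
print(sum_two_lanes(values))  # 2.0: the large terms cancel before the small ones are added
```

Neither result is "wrong" by itself, but code that depends on a specific rounding behavior can misbehave when the grouping changes.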


I have been down so many Chromium optimization paths over the past several years that it is borderline ridiculous.

For example, I can build ya' a Chromium release with multi-threaded parallel loop execution using Polly to force-generate OpenMP calls. Last I knew, the resulting build can even considerably inflate a benchmark or two. Still, I would not be inclined to use it as a daily browser. o.0

These days, trying to track far too many LLVM optimizations and passes is less rewarding and more headache-inducing IMHO, especially when the actual real-world returns are often practically falling in the noise anyway. A percent or two in a benchmark is no longer worth it with modern procs and GPU acceleration. YMMV.

A big part of the underlying issue is how so many LLVM (and other compiler suite) passes and options get implemented but are never fully developed into defaults. Many are great ideas, but like most software projects, developer efforts tend to be a managed resource.


BTW, I know this is a Windows repo, but I might have something (or more?) in the pipeline to potentially help with Chromium on Linux performance. My biggest problem is feeling like dealing with the experimentation process.


For the TLDR crowd.... anyone want to donate me an AMD EPYC 9754 build server? ;p

RobRich999 commented 1 year ago

Quick FYI. I installed the Intel oneAPI DPC++/C++ Compiler 2023.2.0.20230721 on my Kubuntu 23.10-dev system. The compiler suite appears to be based upon LLVM v17, so YMMV on doing a current Chromium dev build with it.

RobRich999 commented 1 year ago

Have low expectations. I have already hit an LTO-related error, though I should have a workaround.

[Image: chromium-icx-avx2]


The workaround resolved the LTO issue for the affected component(s). It is back to building.... for now? Could be a while since it is running on my AMD 5700u 8c/16t notebook instead of a build server.

RobRich999 commented 1 year ago

Back to poking at icx. This time on my primary build server.

Using icx to build Chromium is mostly copying and moving around some files to create a local standalone icx suite. I had to borrow a few files from LLVM. IIRC, I borrowed one binary, one Linux library, and the Windows compiler-rt libraries I source from Chromium's packaged LLVM checkout anyway; not that I have tried a Windows build, yet. More details later.

I can get LTO working for most binaries and libraries, but it is borked for the chrome binary. The issue appears to track back to an internal function in an icx pass that apparently is not present in open LLVM source. Yeah, fun.

I suspect PGO should work assuming I get around to generating a profile. Chromium project PGO profiles are currently formatted for LLVM v18, while icx, being based upon LLVM v17, is using an older incompatible format. I probably could source a PGO profile for an older Chromium revision, but that probably will pull in lots of mismatched functions that would be disregarded.

I am rolling a Chromium Linux build without LTO+PGO at the moment. I am targeting Haswell as a baseline for now, along with -O3 and various additional icx compiler optimization options. It will give me a starting point for experimenting with optimizations, assuming it builds. I will attempt a similar cross-compiled Chromium Windows build, too. Hopefully I will have links to post in a while.

Mostly for curiosity, I might look into a build with auto-dispatched optimized functions, though that imposes an Intel processor requirement for dispatching into functions optimized beyond the baseline.
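As a rough illustration of what auto-dispatch means, here is a minimal Python sketch of choosing between a baseline function and a tuned variant at run time. Everything in it is hypothetical (the function names, the vendor check, the feature set); it only mirrors the idea described above that the optimized path is taken on Intel processors, and it is not how icx implements dispatch internally.

```python
def dot_baseline(a, b):
    """Portable version that any supported CPU can run."""
    return sum(x * y for x, y in zip(a, b))


def dot_optimized(a, b):
    """Stand-in for a variant compiled for a newer instruction set (same result)."""
    return sum(x * y for x, y in zip(a, b))


def select_dot(vendor, features):
    """Pick an implementation once, based on the CPU vendor string and feature flags.

    Assumption for illustration: the optimized path is only chosen on Intel CPUs
    that report AVX2; every other CPU falls back to the baseline.
    """
    if vendor == "GenuineIntel" and "avx2" in features:
        return dot_optimized
    return dot_baseline


dot = select_dot("AuthenticAMD", {"sse3", "avx2"})
print(dot.__name__)  # dot_baseline: capable hardware, but not selected by the vendor check
```

Under this assumption, a non-Intel CPU with the same instruction set would still run the baseline code, which is the requirement mentioned above.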


Okay, maybe not. LLD is complaining about missing symbols in the chrome binary. I suppose that could be related to the LTO issue, assuming the LTO pass cannot find those functions either. I will try "finding" them. ;)

...

Took a look. The header file include is present. Okay. Changed linker to mold as a test. Same result, so it appears to be an issue with icx clang, or at least it is treating the underlying code differently than current LLVM clang. Hmmm.


Got a relatively default SSE3 build done. Had to tell the linker to ignore those couple of undefined symbols. They are related to the fencedframe sandbox ads api experiment in about:flags anyway. Whatever.

Speedometer was seemingly unimpressive. Basemark Web 3.0 was okay, I suppose. Been quite a while since I have run an SSE3 build at default opt levels, plus without LTO+PGO, so I do not have much for comparison here.

Now to pull back in my optimizations and roll another build.

RobRich999 commented 1 year ago

I have not been poking at Chromium much in the past few days. I will try to get back to icx in the upcoming days.

That said, I am not really expecting any significant returns, especially if we cannot successfully enable LTO.