RobRich999 / Chromium_Clang

Chromium browser compiled with the Clang/LLVM compiler.

Args.gn request #17

Closed Alex313031 closed 2 years ago

Alex313031 commented 2 years ago

Hi, I'm asking the same thing here that I asked Macchrome, i.e. marmaduke. Would it be possible for me to get the args.gn you're using for your Linux builds (both the AVX and regular ones)? Hibbiki posted his args.gn but sadly stopped building a while ago, and even recently deleted everything but his last build. At least web.archive.org had old ones, as I use his M49 build for XP machines (I'm a weird blend of bleeding edge and legacy enthusiasm), and M68 for older Win 7 machines. Anyway, I am planning to do a "4 way merge" combining your optimization and AVX flags (couldn't figure out how to do that on Linux) with Macchrome's build flags, along with Hibbiki's HEVC (H.265) patches ported by me for Linux (would love to share if you're interested), and finally adding my own component Widevine (i.e. not baked in but downloaded in chrome://components via CDM), media overlay tweaks, and baking in my own API keys for sync. I use your builds on Win 7 and 10, but not on Linux as I "need" (want, lol) Widevine; yours lack Widevine, and his are ungoogled, always stable only, and without AVX. As I've gotten deeper into compiling my own Chromium and Chromium OS (if you want those builds, which are newer than https://arnoldthebat.co.uk/wordpress/, I'd be happy to share), I have taken more time to dig into what certain build flags can actually do, and I think you and marmaduke have done a swell job at making Chromium available to the newb masses. Thanks for posting your builds on https://chromium.woolyss.com/ and enabling Chromium usage for newbs.

If you agree to sharing your args.gn and he does as well, and after testing and verification, I will be making a repo for these linux builds and will acknowledge you and him with links to your githubs in the readme. I don't plan on posting my builds anywhere (like chromium.woolyss) as I wouldn't be as studious as you and probably would have large leaps between versions.

Lastly, when and if this happens, it seems there is a file size limit on GitHub. Do you have to pay to remove it (i.e. to upload a .tar or .deb of final builds that exceed 100 MB)?

RobRich999 commented 2 years ago

is_clang = true
is_component_build = false
is_debug = false
is_official_build = true
use_thin_lto = true
thin_lto_enable_optimizations = true
symbol_level = 0
blink_symbol_level = 0
target_cpu = "x64"
use_lld = true
enable_widevine = false
enable_remoting = false
proprietary_codecs = true
ffmpeg_branding = "Chrome"
google_api_key = "no"
google_default_client_id = "no"
google_default_client_secret = "no"
use_official_google_api_keys = false
treat_warnings_as_errors = false
clang_use_chrome_plugins = false
enable_nacl = false
enable_precompiled_headers = false
use_vaapi = true
enable_linux_installer = true
enable_distro_version_check = false
chrome_pgo_phase = 2
pgo_data_path = "//home//robrich//depot_tools//chromium//src//chrome//build//pgo_profiles//*.prof"

I manually download and set the PGO profile. Change the above *.prof path and filename accordingly.

python tools/update_pgo_profiles.py --target=linux update --gs-url-base=chromium-optimization-profiles/pgo_profiles

BTW, to cover the bases for others taking a look here, natively building Widevine instead of downloading it would likely be a YMMV situation. There are signed (Chrome) and unsigned (Chromium) versions. A native Widevine build will be an unsigned component that might work fine for some sites, work in a limited fashion on some sites, and outright not work on some sites.

About Linux build SIMD optimizations, edit the compiler BUILD.gn args:

depot_tools\chromium\src\build\config\compiler\BUILD.gn

Search for -msse3 and simply replace it with -mavx. If targeting AVX2 and later, you could change -march=$x64_arch to -march=haswell or similar, then comment out the -msse3 line. Procs with AVX2 also support FMA, which you could force LLVM to auto generate via -ffp-contract=fast as a cflag at the same spot.
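For anyone who would rather script that edit than do it by hand, here is a minimal sketch of the substitution. It assumes the flag appears as a quoted string inside a cflags list; the sample text is a made-up stand-in rather than the real contents of BUILD.gn, and `swap_simd_flag` is a hypothetical helper name.

```python
import re


def swap_simd_flag(gn_text: str, old: str = "-msse3", new: str = "-mavx") -> str:
    """Replace a quoted compiler flag such as "-msse3" with another flag."""
    # Match the flag only as a complete quoted string, so a longer flag
    # that merely starts with the same characters is left alone.
    pattern = re.compile('"' + re.escape(old) + '"')
    return pattern.sub('"' + new + '"', gn_text)


# Made-up stand-in for the relevant lines; the real file will differ.
sample = '''
cflags += [
  "-m64",
  "-msse3",
]
'''

patched = swap_simd_flag(sample)
print(patched)
```

Reviewing the result with a diff before building is a good idea, since the flag may appear in more than one place.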

If you want a more performant build at the expense of increased build times, then also change lto_opt_level = 0 to lto_opt_level = 2 so optimized LTO codegen happens for all build code paths. Highly recommended IMO.

On a related note, I suggest changing import_instr_limit = 5 to import_instr_limit = 30. Chrome opts for smaller binary size and faster building. Meanwhile, ChromeOS uses 30 on x86/64, and there is a ChromeOS developer doc somewhere showing 30 to be a good balance. The actual default for LLVM is 100, though the performance between 30 and 100 is largely in the noise for Chromium.
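The two value changes above could be applied with a small script along these lines. This is only a sketch: `set_gn_assignment` is a hypothetical helper, the sample text is a stand-in for the real file, and the function rewrites every matching assignment, which may be more than you want if the file sets a value in several branches.

```python
import re


def set_gn_assignment(gn_text: str, name: str, value: int) -> str:
    """Rewrite every `name = <integer>` assignment to use `value`."""
    # Capture the name and the "=" (with any spacing) so the original
    # formatting is preserved and only the number changes.
    pattern = re.compile(r"\b(" + re.escape(name) + r"\s*=\s*)\d+")
    return pattern.sub(lambda m: m.group(1) + str(value), gn_text)


# Stand-in text using the defaults mentioned above.
sample = "lto_opt_level = 0\nimport_instr_limit = 5\n"

patched = set_gn_assignment(sample, "lto_opt_level", 2)
patched = set_gn_assignment(patched, "import_instr_limit", 30)
print(patched)
```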

I comment out all the stuff related to LTO build caching, too, as I do not use incremental-style building. The LTO cache just wastes drive writes and burns disc space for me.

If using PGO for Linux builds, change cflags = [ "-Oz" ] + common_optimize_on_cflags to cflags = [ "-O2" ] + common_optimize_on_cflags. Setting -02 will let the profile properly guide the optimizations. ;)

I do further optimizations, for example using Polly (https://polly.llvm.org/) during LTO codegen. Look for common_optimize_on_ldflags = [] around line 2025 and add:

common_optimize_on_ldflags += [
  "-Wl,-mllvm,-polly",
  "-Wl,-mllvm,-polly-detect-profitability-min-per-loop-insts=40",
  "-Wl,-mllvm,-polly-invariant-load-hoisting",
  "-Wl,-mllvm,-polly-vectorizer=stripmine",
]
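Each of those entries follows the same pattern: `-Wl,` hands the rest to the linker, and `-mllvm,` tells the linker to forward the option to LLVM during LTO codegen. A minimal sketch that builds the list from the bare Polly options, with `wrap_for_lld` being a hypothetical helper name:

```python
def wrap_for_lld(llvm_options):
    """Prefix each LLVM option so the compiler driver forwards it to the linker."""
    return ["-Wl,-mllvm," + opt for opt in llvm_options]


# The four Polly options from the list above, without their prefixes.
polly_options = [
    "-polly",
    "-polly-detect-profitability-min-per-loop-insts=40",
    "-polly-invariant-load-hoisting",
    "-polly-vectorizer=stripmine",
]

ldflags = wrap_for_lld(polly_options)

# Print the flags as a GN list fragment.
print("common_optimize_on_ldflags += [")
for flag in ldflags:
    print('  "' + flag + '",')
print("]")
```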

However, note that using Polly would require building your own LLVM checkout either externally to Chromium or by modifying Chromium's own Clang build script. I do this a little differently, but here is a basic approach for modifying Chromium's Clang build script.

depot_tools\chromium\src\tools\clang\scripts\build.py

Add polly to the build targets, plus enable PIC and plugins. Starts around line 567.

projects = 'clang;compiler-rt;lld;chrometools;clang-tools-extra;polly'

'-DLLVM_ENABLE_PIC=ON', '-DCLANG_PLUGIN_SUPPORT=ON',

Then build LLVM. The Chromium script can auto download a GCC checkout to build LLVM, though I just go ahead with using my local default GCC checkout by adding "--gcc-toolchain=/usr". There are other LLVM build options like LTO and PGO, but the LLVM performance difference is not worth the extra build time IMO.

python /home/robrich/depot_tools/chromium/src/tools/clang/scripts/build.py --without-android --without-fuchsia --llvm-force-head-revision --disable-asserts --gcc-toolchain=/usr

I am not sure about file limits on GitHub. I suppose it could vary even by file type. ?? I know GitHub allows my 100MB+ zip archives for Windows builds, and I just have a basic free account.

If I missed anything, or you want to know more, just ask. :)

Alex313031 commented 2 years ago

Thanks m8! You went above and beyond with the explanations, very useful, and I'm glad you shared. Macchrome wasn't as nice, which seems strange considering I asked humbly and Chromium is an open source project anyway, but idk. I will be using most of this, and you made stuff a lot easier instead of me having to track down how to do some of these things. I will probably exclude Polly, as I already have an LLVM build that I like, which took a while to get right and which I don't wanna f**ck with. So even with the issues with some site playback, why do you disable Widevine? That's the one and only thing I wish your Windows builds had. Otherwise v good job; it runs even faster than Chrome in some of my tests, along with the open source goodness and the much prettier blue/bluer/bluest logo that makes me love Chromium in general.

Alex313031 commented 2 years ago

Also thanks for explaining the Widevine thing for other people. You can use Chrome's or Chromium's signed/unsigned Widevine, and either bake it in or componentize it, in which case it can be updated in chrome://components and is downloaded into the user profile along with FLoC, Safe Browsing, and other components. What's also nice is that by copying this WidevineCDM dir into the install folder of an official Linux distro's build, like Debian's (Ubuntu and Arch have Widevine baked in), you can enable Widevine on these distros. I usually don't mess with it but am building my own and componentizing it (which disables the auto update, but still lets you see and use the generated files). Oh, and a last question: why do you disable NaCl? I know it's technically deprecated and adds some minutes to build times, but many apps (not extensions) from the Chrome Web Store still use PNaCl to work. If this wasn't the case, I would exclude it too, since it has security vulnerabilities and Chrome apps are getting phased out anyway (wish they weren't). Just wondering.

RobRich999 commented 2 years ago

IIRC, I have a more detailed explanation somewhere in the woolyss.com comments, but the basic premise is y'all get the same builds I use. I have no immediate use for NaCl, Widevine, H.265, etc. Also, trying to build Widevine off the main trunk in particular used to (maybe still does?) often result in a non-working Widevine component.

RobRich999 commented 2 years ago

BTW, if not using Polly, just use the Clang/LLVM binary checkout already provided by the Chromium project. It is LTO and PGO optimized to boot, as Google certainly has more build resources than me. ;)

Alex313031 commented 2 years ago

I do use it (Chromium Clang), I have just tweaked some things. And no, in my experience, Widevine has worked just fine in either componentized or baked-in form. I only relatively recently started building with Widevine, I wanna say around M90. It may have been a short or long running bug that you've experienced, as you've been building a lot more often and longer than me, but it never seems faulty, and it means I don't have to use another browser for Netflix, Hulu, or Vudu. And Chromium lets you make pseudo PWAs now, so it's even nicer to have a standalone window with your content in it.

Alex313031 commented 2 years ago

Also, what's your opinion on FLoC? And about the builds, will setting AVX actually build AVX code that's optimized and can only run on AVX processors, or does it just mark it as AVX only? The reason I ask is that when using your AVX builds on Windows, I don't get the CPU heat that's common with AVX instructions, but maybe it's just not as intense as a lot of AVX-only applications are.

RobRich999 commented 2 years ago

I know Widevine used to be a mess to build years ago, though yeah, I have been building much longer than even my GitHub repository suggests. All the way back to the days of MSVC builds for Windows.

FLoC is going nowhere fast IMO. I could be wrong, as I am not a web developer, but I suspect third-party cookies and similar existing tracking techniques will continue for at least a few more years.

Specifying particular SIMD support allows Clang/LLVM to build up to that instruction level. If you set AVX, then code could be compiled with integer, FPU, SSE whatever, or up to AVX depending upon numerous optimization passes. The nice part about setting AVX is native VEX encoding, which can help with lowering the penalties between mixed SSE and AVX register usage.

https://en.wikipedia.org/wiki/VEX_prefix https://john-h-k.github.io/VexTransitionPenalties.html

Much of Chromium by default is built for up to SSE3 SIMD support, but various underlying components support automatic SIMD multi-versioning using dynamic dispatch for different SIMD support levels. While that is nice, looking back to my previous comments, there could be performance penalties involved when running mixed SSE and AVX code. By setting Clang/LLVM to natively generate up to AVX code, we can now use VEX encoding across the board for SIMD, thus potentially lowering those penalties.

https://en.wikipedia.org/wiki/SIMD#SIMD_multi-versioning https://en.wikipedia.org/wiki/Dynamic_dispatch

Now getting back to the "much of Chromium" aspect, the actual core Chromium source base is largely scalar code. We are leaning heavily on compiler optimization passes, with autovectorization in particular, to hopefully optimize scalar code for SIMD processing.

https://en.wikipedia.org/wiki/Automatic_vectorization

Remember when I recommended changing LTO codegen to -O2 across the board? I do that to get additional optimization passes, again including autovectorization, working for all Chromium code. However, the reality is one could build the entire browser with LTO codegen set to -O0 without critically huge performance penalties in the core browser components due to the project's substantial use of scalar code. YMMV.

Branching out further in the codebase, you will find more native SIMD-targeted code in Chromium with components like multimedia, cryptography, compression, and similar. Those represent some of the more computationally intensive operations when browsing, though unless you are doing something intensive like watching 4K+ video with software decoding, you might not notice much of a real-world usability difference.

So.... long story short. Yes, setting -mavx generates AVX instructions plus encodes SSE instructions with VEX. The remaining question is how much of the code being generated is actually comprised of SIMD instructions. That depends upon the compiler optimization levels, but the reality is much of the Chromium codebase going into and coming out of Clang/LLVM is scalar instead of SIMD, which is why you might not notice a substantial usability difference. For me, it is about eking out the extra few percent of performance instead of worrying about a little binary bloat and a little extra build time. ;)
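Since a build made with -mavx can contain VEX-encoded instructions anywhere, it is worth confirming that a target machine reports AVX before running such a build on it. A minimal sketch for Linux, assuming the usual `flags` line format of /proc/cpuinfo; `simd_support` is a hypothetical helper and the sample text is made up:

```python
def simd_support(cpuinfo_text: str) -> dict:
    """Report which of avx, avx2 and fma appear in a cpuinfo-style flags line."""
    flags = set()
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        if key.strip() == "flags":
            flags.update(value.split())
            break  # the first core's flags are enough for this check
    return {name: name in flags for name in ("avx", "avx2", "fma")}


# Shortened, made-up sample; on Linux you could pass
# open("/proc/cpuinfo").read() instead.
sample = "model name : Example CPU\nflags : fpu sse sse2 ssse3 sse4_1 avx\n"

result = simd_support(sample)
print(result)
```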

Alex313031 commented 2 years ago

Thanks for providing wikipedia articles. I'm white and nerdy like weird al yankovic, cause' I edit and spend too much time on wikipedia. Also, I believe I mightz be stupidz, as I don't know what LTO codegen -O does. Also, I may be just missing something but how are you enabling PGO? I ran the python script but all I see is a .profdata in pgo_profiles, do I just rename this to .prof?

RobRich999 commented 2 years ago

Yeah, *.profdata sounds more like it. I pulled that copy-and-paste off notes from a different system instead of my usual build boxes.

You set these to get PGO working:

chrome_pgo_phase = 2
pgo_data_path = "//home//robrich//depot_tools//chromium//src//chrome//build//pgo_profiles//*.profdata"

You will have to change the pgo_data_path to wherever you placed the profdata file.
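To avoid typing the profile file name by hand, something like the following could locate the downloaded file and print the two args. It is a sketch under assumptions: `pgo_args` is a hypothetical helper, the demo file name is made up, and it emits a plain absolute path, so adjust the path style to whatever your setup expects.

```python
from pathlib import Path
import tempfile


def pgo_args(profile_dir: Path) -> str:
    """Return args.gn lines pointing at the newest .profdata file in profile_dir."""
    profiles = sorted(profile_dir.glob("*.profdata"), key=lambda p: p.stat().st_mtime)
    if not profiles:
        raise FileNotFoundError(f"no .profdata file found in {profile_dir}")
    newest = profiles[-1]
    return f'chrome_pgo_phase = 2\npgo_data_path = "{newest.as_posix()}"\n'


# Demonstration with a throwaway directory and a made-up file name.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "example-profile.profdata"
    demo.write_bytes(b"")
    args = pgo_args(Path(tmp))
    print(args)
```

In real use you would point `pgo_args` at the pgo_profiles directory of your own checkout.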

LTO codegen is the level of optimization used by the LLD linker's link-time optimization process. Clang runs a subset of compiler passes when using LTO, then LLD does much of the deferred actual compiling during link time since there is now more information present about function imports, inlining, interprocedural passes, etc.

https://llvm.org/docs/LinkTimeOptimization.html

randomhydrosol commented 2 years ago

Hello, could I get your flags for windows?

RobRich999 commented 2 years ago

If doing a Windows build natively on a Windows system:

is_component_build = false
is_debug = false
is_official_build = true
use_thin_lto = true
thin_lto_enable_optimizations = true
symbol_level = 0
blink_symbol_level = 0
target_cpu = "x64"
enable_widevine = false
enable_remoting = false
proprietary_codecs = true
ffmpeg_branding = "Chrome"
google_api_key = "no"
google_default_client_id = "no"
google_default_client_secret = "no"
use_official_google_api_keys = false
treat_warnings_as_errors = false
clang_use_chrome_plugins = false
enable_resource_allowlist_generation = false
enable_nacl = false
enable_precompiled_headers = false
chrome_pgo_phase = 2
pgo_data_path = "D:\depot_tools\chromium\src\chrome\build\pgo_profiles\*.profdata"
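The Linux and Windows lists share most of their entries, so if you maintain both, one option is to keep the shared args in one place and add the per-platform ones on top. A minimal sketch, showing only a handful of the args from the two lists; `render_args` is a hypothetical helper, and it assumes string values need no escaping:

```python
def render_args(args: dict) -> str:
    """Render a dict of build args as args.gn text, one assignment per line."""
    lines = []
    for name, value in args.items():
        if isinstance(value, bool):  # check bool first: bool is a subclass of int
            literal = "true" if value else "false"
        elif isinstance(value, int):
            literal = str(value)
        else:
            literal = '"' + str(value) + '"'
        lines.append(f"{name} = {literal}")
    return "\n".join(lines) + "\n"


# A few args that appear in both lists in this thread.
shared = {
    "is_official_build": True,
    "use_thin_lto": True,
    "symbol_level": 0,
    "target_cpu": "x64",
    "enable_widevine": False,
}
# Args that appear only in the Linux list or only in the Windows list.
linux_only = {"use_lld": True, "use_vaapi": True}
windows_only = {"enable_resource_allowlist_generation": False}

linux_text = render_args({**shared, **linux_only})
windows_text = render_args({**shared, **windows_only})
print(linux_text)
print(windows_text)
```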

Most of the optimization concepts from the Linux discussion above carry over to Windows, though of note off the top of my head, you would need to move Polly up a few lines in depot_tools\chromium\src\build\config\compiler\BUILD.gn, instead set the SIMD level in depot_tools\chromium\src\build\config\win\BUILD.gn, and change the PGO profile update target to win64.

python D:/depot_tools/chromium/src/tools/update_pgo_profiles.py --target=win64 update --gs-url-base=chromium-optimization-profiles/pgo_profiles
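The Linux and Windows profile update commands differ only in the script path and the --target value, so they can be generated from one helper. This sketch only assembles the command; `pgo_update_command` is a hypothetical name, and "linux" and "win64" are the only targets mentioned in this thread.

```python
import shlex


def pgo_update_command(target: str, script: str = "tools/update_pgo_profiles.py") -> list:
    """Build the argv for the profile update script for one target platform."""
    return [
        "python",
        script,
        f"--target={target}",
        "update",
        "--gs-url-base=chromium-optimization-profiles/pgo_profiles",
    ]


for target in ("linux", "win64"):
    # shlex.join renders the argv as a single shell-style command line.
    print(shlex.join(pgo_update_command(target)))
```

To actually run it, the list could be passed to `subprocess.run` from the src directory of your checkout.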

Alex313031 commented 2 years ago

@RobRich999

  1. Do I just set that whole cflag line using set or export?
  2. Also, the lto_opt_level = 2, is that just an arg for gn?
  3. import_instr_limit > Is this supposed to be modified in the compiler build.gn?
  4. "Lastly, cflags = [ "-Oz" ] + common_optimize_on_cflags to cflags = [ "-O2" ] + common_optimize_on_cflags. Setting -02" is it supposed to be 02 or O2?

RobRich999 commented 2 years ago

No sets or exports for optimizations. All those are modifications in these files:

depot_tools\chromium\src\build\config\compiler\BUILD.gn
depot_tools\chromium\src\build\config\win\BUILD.gn

Skim these two files, and they should make more sense about the recommended changes.

O for "Optimization" level, so O2. ;)

rainxh11 commented 2 years ago

Setting the Google API keys in the environment variables no longer seems to work with your builds. Why is that? And how can I use API keys with your builds now?

Paukan777 commented 2 years ago

Many thanks for this manual. Just built avx2 optimized .deb

RobRich999 commented 2 years ago

> Setting the Google API keys in the environment variables no longer seems to work with your builds. Why is that? And how can I use API keys with your builds now?

Unknown. I do not use API keys. I know Google is rate limiting "developer" API keys now, but I would think the keys should still be recognized if valid. Consider asking in the woolyss.com comments section.

RobRich999 commented 2 years ago

> Many thanks for this manual. Just built avx2 optimized .deb

Excellent! :)

RobRich999 commented 2 years ago

Update on Widevine support. I have enabled the unsigned Widevine component in my just-pushed v95.0.4619.0 builds, and I will continue to do so for future builds assuming it builds and works okay. IOW, I am not going to spend much time fixing any Widevine issues, as I have little to no use for it. It either builds and works, or it does not.

The Shaka Player demo for Widevine worked, and I was able to stream Hulu content.

RobRich999 commented 2 years ago

BTW, if Widevine does not work at first, you might need to manually update the component.

chrome://components/