PF4Public / gentoo-overlay

Personal Gentoo overlay

www-client/ungoogled-chromium: 109.0.5414.74: crashes on load #197

Closed baconsalad closed 1 year ago

baconsalad commented 1 year ago

On load the browser window comes up as crashed, continuously tries to reload in the background and never succeeds.

[23957:23957:0109/175547.856369:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.873953:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.892552:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.909860:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.926557:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[23957:23957:0109/175547.943069:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.

current build flags

www-client/ungoogled-chromium::pf4public -cfi clang -cups custom-cflags -enable-driver -hangouts hevc official optimize-thinlto optimize-webui proprietary-codecs -screencast -suid -system-ffmpeg -system-harfbuzz -system-icu -system-jsoncpp -system-libevent -system-libvpx -system-openh264 -system-openjpeg tcmalloc thinlto vaapi vdpau -widevine

baconsalad commented 1 year ago

How many people are using Ryzens? I have a 3950X. I also have this in my make.conf: CFLAGS="-O2 -pipe -march=znver2 --param l1-cache-line-size=64 --param l1-cache-size=32"

uazo commented 1 year ago

Do you build with cfi?

Sorry if I'm intruding. I don't actually know the toolchain you use to build, but in the GN config:

build/config/sanitizers/sanitizers.gni:

  is_cfi = is_official_build && is_clang &&
           ((target_os == "linux" && target_cpu == "x64") ||
            (is_chromeos && is_chromeos_device))

So cfi might be active. You can check it with:

gn args out/xxx --list

Out of curiosity, can you check?
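
One way to narrow that listing down to the CFI-related arguments, as a rough sketch. It assumes the `name = value` lines that `gn args` prints with `--short`; `show_cfi_args` is a hypothetical helper name, and `out/xxx` is the same placeholder build directory as above.

```shell
#!/bin/sh
# Hypothetical helper (not part of gn or the ebuild): read a
# `gn args <dir> --list --short` listing on stdin and keep only the
# arguments whose name contains "cfi".
show_cfi_args() {
    grep -E '^[a-z_]*cfi[a-z_]* = '
}

# Example (needs a configured build directory; out/xxx is a placeholder):
#   gn args out/xxx --list --short | show_cfi_args
```

If `is_cfi = false` shows up in the result, the default from sanitizers.gni was overridden for that build directory.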

PF4Public commented 1 year ago

@uazo Thanks for stopping by!

cfi has a separate setting, which should override the defaults: https://github.com/PF4Public/gentoo-overlay/blob/4a4f1cd19ca5d581fc88ec43f2d26edc9627c4ca/www-client/ungoogled-chromium/ungoogled-chromium-109.0.5414.119_p1.ebuild#L967-L972

So the user has a direct way of enabling or disabling cfi, which is explicitly disabled by default.

arbitrary-dev commented 1 year ago

> How many people using ryzens?

Me too: 5900HX

CFLAGS="-march=znver3 -O2 -pipe"

EsmailELBoBDev2 commented 1 year ago

Hi, I have a segfault problem:

[25143:25143:0129/103913.716665:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.803737:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.844201:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.852138:ERROR:service_worker_task_queue.cc(232)] DidStartWorkerFail ocaahdebbfolfmndjeplogmgcagdmblk: 3
[25143:25143:0129/103913.901119:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.938803:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.972859:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103913.988632:ERROR:service_worker_task_queue.cc(232)] DidStartWorkerFail gebbhagfogifgggkldgodflihgfeippi: 3
[25143:25143:0129/103914.010903:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.044230:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.079744:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.084886:ERROR:service_worker_task_queue.cc(232)] DidStartWorkerFail oldceeleldhonbafppcapldpdifcinji: 3
[25143:25143:0129/103914.116492:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.151274:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.183998:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.217270:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.250843:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.286194:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.318750:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.352624:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.386052:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.419570:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.452646:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.487740:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
[25143:25143:0129/103914.520068:ERROR:network_service_instance_impl.cc(539)] Network service crashed, restarting service.
Segmentation fault

Feel free to ask for more logs/info. I'm kind of new to gentoo so I'm unsure what to send

arbitrary-dev commented 1 year ago

> I'm unsure what to send

I'd guess you'd better compile with the debug USE flag and share the stack trace.
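
For someone new to Gentoo, enabling a USE flag for a single package could look roughly like the sketch below. It assumes `/etc/portage/package.use` is a directory (on some systems it is a single file); `enable_use_flag` and the file name `local-debug` are hypothetical names chosen for illustration.

```shell
#!/bin/sh
# Hypothetical helper: append a per-package USE override.
#   $1 = directory holding package.use files
#   $2 = package atom
#   $3 = USE flag to enable
enable_use_flag() {
    use_dir=$1
    atom=$2
    flag=$3
    mkdir -p "$use_dir"
    # One "atom flag" pair per line is the package.use syntax.
    printf '%s %s\n' "$atom" "$flag" >> "$use_dir/local-debug"
}

# Example (as root), followed by a rebuild of the package:
#   enable_use_flag /etc/portage/package.use www-client/ungoogled-chromium debug
#   emerge --ask --oneshot www-client/ungoogled-chromium
```

Expect the rebuild to take a long time. Whether the resulting binary gives a usable stack trace depends on the ebuild and is not confirmed here.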

arbitrary-dev commented 1 year ago

> Here's the very weird part: I also get the same error when rebuilding 108.0.5359.124_p1 with my current system!

I rebuilt mine and it works just fine. Though I use ccache, so I don't know...

   Info about currently installed ebuild:

   * www-client/ungoogled-chromium-108.0.5359.124_p1
   Install date: Sun Jan 29 04:10:47 2023
   USE="cups hangouts optimize-thinlto optimize-webui pgo pulseaudio qt5 system-openjpeg thinlto vaapi wayland -+X -cfi -+clang -convert-dict -cpu_flags_arm_neon -custom-cflags -debug -enable-driver -gtk4 -headless -hevc -js-type-check -kerberos -+official -pic -+proprietary-codecs -screencast -selinux -suid -+system-av1 -+system-ffmpeg -+system-harfbuzz -+system-icu -+system-jsoncpp -+system-libevent -+system-libusb -system-libvpx -+system-openh264 -+system-png -+system-re2 -+system-snappy -vdpau -widevine"
   CFLAGS="-march=znver3 -pipe -Wno-unknown-warning-option -Wno-builtin-macro-redefined"
   CXXFLAGS="-march=znver3 -pipe -Wno-unknown-warning-option -Wno-builtin-macro-redefined"
   LDFLAGS="-Wl,--as-needed -Wl,--thinlto-jobs=17"

thubble commented 1 year ago

> > Here's the very weird part: I also get the same error when rebuilding 108.0.5359.124_p1 with my current system!
>
> I rebuild mine and it works just fine. Though I utilize ccache, so I don't know...
>
>    Info about currently installed ebuild:
>
>    * www-client/ungoogled-chromium-108.0.5359.124_p1
>    Install date: Sun Jan 29 04:10:47 2023
>    USE="cups hangouts optimize-thinlto optimize-webui pgo pulseaudio qt5 system-openjpeg thinlto vaapi wayland -+X -cfi -+clang -convert-dict -cpu_flags_arm_neon -custom-cflags -debug -enable-driver -gtk4 -headless -hevc -js-type-check -kerberos -+official -pic -+proprietary-codecs -screencast -selinux -suid -+system-av1 -+system-ffmpeg -+system-harfbuzz -+system-icu -+system-jsoncpp -+system-libevent -+system-libusb -system-libvpx -+system-openh264 -+system-png -+system-re2 -+system-snappy -vdpau -widevine"
>    CFLAGS="-march=znver3 -pipe -Wno-unknown-warning-option -Wno-builtin-macro-redefined"
>    CXXFLAGS="-march=znver3 -pipe -Wno-unknown-warning-option -Wno-builtin-macro-redefined"
>    LDFLAGS="-Wl,--as-needed -Wl,--thinlto-jobs=17"

There's something weird in the USE flags - it's showing -+clang. I've never seen both - and + symbols on a flag before. Since thinlto and optimize-thinlto are both enabled, I'm assuming clang is enabled?

I believe ccache should automatically bypass its cache if anything has changed (different compiler version, changed header files) so I'm not sure if that's the issue.

arbitrary-dev commented 1 year ago

> I'm assuming clang is enabled?

Yes, enabled.

$ equery l gcc clang
 * Searching for gcc ...
[IP-] [  ] sys-devel/gcc-11.3.1_p20221209:11
[IP-] [  ] sys-devel/gcc-12.2.1_p20230121-r1:12

 * Searching for clang ...
[IP-] [  ] sys-devel/clang-15.0.7-r1:15/15g1

$ equery u -i ungoogled-chromium
[ Legend : U - final flag setting for installation]
[        : I - package is installed with flag     ]
[ Colors : set, unset                             ]
 * Found these USE flags for www-client/ungoogled-chromium-108.0.5359.124_p1:
 U I
 - - X                  : Add support for X11
 - - cfi                : Build with CFI (Control Flow Integrity) enabled. It
                          requires "-stdlib=libc++", see #40 for more details.
 + + clang              : Use Clang compiler instead of GCC
 - - convert-dict       : Patch and build the convert_dict utility. The script
                          will be installed into
                          /usr/lib64/chromium-browser/update-dicts.sh. More
                          info here: https://github.com/Eloston/ungoogled-
                          chromium/issues/188#issuecomment-444752907
 + + cups               : Add support for CUPS (Common Unix Printing System)
 - - custom-cflags      : Build with user-specified CFLAGS (unsupported)
 - - debug              : Enable DCHECK feature with severity configurable at
                          runtime. Mostly intended for debugging and
                          development, NOT RECOMMENDED for general use.
 - - enable-driver      : Build chromedriver
 - - gtk4               : Build with GTK4 headers.
 + + hangouts           : Enable support for Google Hangouts features such as
                          screen sharing
 - - headless           : Build Ozone only with headless backend, NOT
                          RECOMMENDED for general uses
 - - hevc               : Enable HEVC decoding support. Should work with
                          system-ffmpeg, but might require additional patching
                          for the built-in one.
 - - js-type-check      : Enable JavaScript type-checking for Chrome's web
                          technology-based UI. Requires Java.
 - - kerberos           : Add kerberos support
 + + official           : Enable Official build instead of Developer build.
 + + optimize-thinlto   : Whether to enable ThinLTO optimizations. Turning
                          ThinLTO optimizations on can substantially increase
                          link time and binary size, but they generally also
                          make binaries a fair bit faster.
 + + optimize-webui     : Optimize parts of Chromium's UI written with web
                          technologies (HTML/CSS/JS) for runtime performance
                          purposes. This does more work at compile time for
                          speed benefits at runtime.
 + + pgo                : Enable PGO. Requires clang and bundled binary profile
                          data in sources tree.
 - - pic                : Disable optimized assembly code that is not PIC
                          friendly
 + + proprietary-codecs : Enable proprietary codecs like H.264, MP3
 + + pulseaudio         : Add support for PulseAudio sound server
 + + qt5                : Add support for the Qt 5 application and UI framework
 - - screencast         : Enable support for remote desktop and screen cast
                          using media-video/pipewire
 - - suid               : Build the SUID sandbox, which is only needed on
                          CONFIG_USER_NS=n kernels
 + + system-av1         : Use the system media-libs/libaom and media-libs/dav1d
                          instead of the bundled ones
 + + system-ffmpeg      : Use the system media-video/ffmpeg instead of the
                          bundled one
 + + system-harfbuzz    : Use the system media-libs/harfbuzz instead of the
                          bundled one
 + + system-icu         : Use the system dev-libs/icu instead of the bundled
                          one
 + + system-jsoncpp     : Use the system dev-libs/jsoncpp instead of the
                          bundled one
 + + system-libevent    : Use the system dev-libs/libevent instead of the
                          bundled one
 + + system-libusb      : Use the system dev-libs/libusb instead of the bundled
                          one
 - - system-libvpx      : Use the system media-libs/libvpx instead of the
                          bundled one
 + + system-openh264    : Use the system media-libs/openh264 instead of the
                          bundled one. If disabled, it will restrict
                          USE=bindist.
 + + system-openjpeg    : Use the system-wide media-libs/openjpeg instead of
                          the bundled one. OpenJPEG use are exclusively for
                          Chromium's PDF viewer.
 + + system-png         : Use system libpng instead of the bundled one
 + + system-re2         : Use the system-wide dev-libs/re2 instead of the
                          bundled one
 + + system-snappy      : Use the system-wide app-arch/snappy instead of the
                          bundled one
 + + thinlto            : Build with ThinLTO support. LTO (Link Time
                          Optimization) achieves better runtime performance
                          through whole-program analysis and cross-module
                          optimization (highly recommended).
 + + vaapi              : Enable Video Acceleration API for hardware decoding
 - - vdpau              : Enable the Video Decode and Presentation API for Unix
                          acceleration interface
 + + wayland            : Enable dev-libs/wayland backend
 - - widevine           : Unsupported closed-source DRM capability (required by
                          Netflix VOD)

perfect7gentleman commented 1 year ago

@arbitrary-dev, please post

$ cat /etc/clang/gentoo-runtimes.cfg

Ahrotahn commented 1 year ago

A system library change might be the culprit if rebuilds of older versions and electron have the same problem.

A few things to try to rule out some other issues:

arbitrary-dev commented 1 year ago

$ cat /etc/clang/gentoo-runtimes.cfg
# This file is initially generated by sys-devel/clang-runtime.
# It is used to control the default runtimes using by clang.

--rtlib=libgcc
--unwindlib=libgcc
--stdlib=libstdc++
-fuse-ld=bfd

EsmailELBoBDev2 commented 1 year ago

> > I'm unsure what to send
>
> I'd guess, you better compile with debug use-flag and share the stacktrace.

Hey, using the debug flag causes the build to fail, and when I used -clang it worked.

arbitrary-dev commented 1 year ago

> Hey, using debug flag causes the build to fail

Share the stacktrace you get in terminal.

preed commented 1 year ago

I, too, was experiencing this problem; I ~solved~ worked around it thusly: https://bugs.gentoo.org/892537#c3

thubble commented 1 year ago

> I, too, was experiencing this problem; I ~solved~ worked around it thusly: https://bugs.gentoo.org/892537#c3

Looks like this was my issue as well. I commented out -fstack-clash-protection in /etc/clang/gentoo-hardened.cfg and 109.0.5414.119 is now working fine, with no extra workarounds and PartitionAlloc re-enabled.

Excellent find, thanks!
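
The workaround described above could be scripted roughly as follows. This is only a sketch of the manual edit: `disable_stack_clash_flag` is a hypothetical helper name, and it assumes the flag sits on its own line in the config file.

```shell
#!/bin/sh
# Hypothetical helper: comment out every uncommented line that starts with
# -fstack-clash-protection in one clang config file.
# A copy of the original is kept next to it with a .bak suffix.
disable_stack_clash_flag() {
    cfg=$1
    # On matching lines, prepend "# " so clang treats them as comments.
    sed -i.bak -e '/^[[:space:]]*-fstack-clash-protection/ s/^/# /' "$cfg"
}

# Example (as root), followed by a rebuild of the browser:
#   disable_stack_clash_flag /etc/clang/gentoo-hardened.cfg
```

Note that this turns off a hardening option for everything built with clang afterwards, not just for this package.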

fordfrog commented 1 year ago

i have -fstack-clash-protection enabled in /etc/clang/gentoo-hardened.cfg and i don't have the issue. my chromium is configured to use a proxy. here are my use flags:

[ebuild R ] www-client/ungoogled-chromium-110.0.5481.77_p1::pf4public USE="X clang convert-dict cups hevc js-type-check official optimize-thinlto optimize-webui pgo proprietary-codecs pulseaudio qt5 system-av1 system-ffmpeg system-harfbuzz system-icu system-jsoncpp system-libevent system-libusb system-openh264 system-openjpeg system-png system-re2 system-snappy thinlto vaapi vdpau widevine -cfi -custom-cflags -debug -enable-driver -gtk4 -hangouts -headless -kerberos -pic -screencast (-selinux) -suid -system-libvpx -wayland" L10N="cs -af -am -ar -bg -bn -ca -da -de -el -en-GB -es -es-419 -et -fa -fi -fil -fr -gu -he -hi -hr -hu -id -it -ja -kn -ko -lt -lv -ml -mr -ms -nb -nl -pl -pt-BR -pt-PT -ro -ru -sk -sl -sr -sv -sw -ta -te -th -tr -uk -ur -vi -zh-CN -zh-TW" 0 KiB

PF4Public commented 1 year ago

> www-client/ungoogled-chromium-110.0.5481.77

Does it work? Mine crashes just after start :)

fordfrog commented 1 year ago

it works for me fine, but i'm not sure why. maybe the code is not triggered because i use a proxy, or i evaded the buggy code by using more system libs instead of the bundled ones.

PF4Public commented 1 year ago

> it works for me fine

I'm asking because 110 has yet another issue with libstdc++. But if the ebuild works for you, that should mean one of my local patches is to blame on my side.

fordfrog commented 1 year ago

> > it works for me fine
>
> I'm asking because 110 has yet another issue with libstdc++. But if ebuild works for you, this should mean that one of my local patches is to blame for me.

yeah, it compiles and works fine for me, at least as fine as the version before. no custom patches applied here.

baconsalad commented 1 year ago

Removing -fstack-clash-protection fixed it for me with 110.0.5481.77_p1

Sneaky gentoo.

perfect7gentleman commented 1 year ago

My way:
1 - removed the @gentoo-hardened.cfg line in gentoo-common.cfg
2 - rebuilt llvm toolchain
3 - rebuilt nodejs
4 - ...
5 - profit.
Version 110.0.5481.77 (Official Build, ungoogled-chromium) (64-bit)

thesamesam commented 1 year ago

> My way. 1 - removed @gentoo-hardend.cfg line in gentoo-common.cfg 2 - rebuilt llvm toolchain 3 - rebuilt nodejs 4 - ... 5 - profit. Version 110.0.5481.77 (Official Build, ungoogled-chromium) (64-bit)

This is essentially the same as saying that using GCC "fixes" the problem. The problematic line has already been identified above.

PF4Public commented 1 year ago

> I commented out -fstack-clash-protection

I wonder why my systems never got that flag in hardened.cfg?

thubble commented 1 year ago

> > I commented out -fstack-clash-protection
>
> I wonder why my systems never got that flag in hardened.cfg?

Do you recall if you ever did etc-update merging with that file? (It's part of sys-devel/clang-common and is config-protected).

PF4Public commented 1 year ago

> Do you recall if you ever did etc-update merging with that file?

Nope. Maybe I just hadn't updated clang for long enough, so this never happened to me :)

joecool1029 commented 1 year ago

Are we tracking multiple issues here? The removal of -fstack-clash-protection seems to fix a different chromium crash that happens later than initial start. Maybe since electron is building with chromium 108 it doesn't help.

I still see the immediate crash/segfault on electron even after commenting that flag and rebuilding clang/llvm and then electron. It is not possible to build electron with gcc (at least not with gcc12).

thesamesam commented 1 year ago

I think that's possible and it'd explain the inconsistent results, backtraces, and observations in the thread. It's probably worth chalking this bug up to -fstack-clash-protection and forking that other issue into a new one.

perfect7gentleman commented 1 year ago

@joecool1029, try to rebuild nodejs too.

joecool1029 commented 1 year ago

Already did as suggested, no change. Pretty sure we're looking at chromium fixing part of the problem in later versions combined with also needing this flag removed.

thubble commented 1 year ago

I re-built qtwebengine (latest 5.15 git, Chromium 87-based with security backports) and re-enabled my clang/thinlto and -fomit-frame-pointer hacks, and it's working perfectly. So I'm convinced that the Gentoo clang-hardening changes were the source of all of my issues.

It was really frustrating that this happened at exactly the time I upgraded to a completely new architecture, and was also experimenting with overclocking/undervolting - so thanks everyone for the help getting this sorted out!

I don't build Electron from source, and I always build nodejs with gcc, so unfortunately I'm not sure what the problems are there.

PF4Public commented 1 year ago

@joecool1029 Do you use chromium on your system? Does it compile and work? Do you remember whether you built chromium 108, and if so, did it work? If electron crashes for you, so should chromium.

joecool1029 commented 1 year ago

No, I have edge as a backup but I use firefox as my browser. I think next attempt I'll drop all the system useflags in case dependencies were built with bad clang flags. Electron forces some of the GN flags so I can't test as many configs with it as plain chromium (like changing allocator).

Techwolf commented 1 year ago

I am getting the same thing here. I'm currently doing a full debug build, which will take a few hours. Will post results here later. Due to using packages, I have a local binary fallback of the previous version to use when I need to. I have mostly switched over to librewolf now.

Techwolf commented 1 year ago

An update.

Building a debug build with clang took 6+ hours. Sadly, I cannot load the debug build in gdb because that requires over 32G of RAM.

Building a debug build with gcc also required over 32G of RAM. I have only 32G of RAM.

Building a normal build with USE="-clang" took 3+ hours and fixed the crashing problem for me.

thubble commented 1 year ago

> I am getting the same thing here. Currently doing a full debug build now, that will take a few hours. Will post results here later. Due to using packages, I have a local binary fallback of the previous version to use when I need too. I have mostly switch over to librewolf now.

@Techwolf Can you post your /etc/clang/*.cfg (most importantly, /etc/clang/gentoo-hardened.cfg)? Having -fstack-clash-protection anywhere in there seems to be what caused this issue.
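
A quick way to see whether the flag is active in any of those files, as a sketch. `find_stack_clash_flag` is a hypothetical helper name, and the default directory is the `/etc/clang` path mentioned above.

```shell
#!/bin/sh
# Hypothetical helper: print file name, line number and text of every
# uncommented -fstack-clash-protection line in the *.cfg files of a
# directory (defaults to /etc/clang).
find_stack_clash_flag() {
    dir=${1:-/etc/clang}
    # -H prints the file name, -n the line number; lines that start with
    # "#" do not match because the pattern is anchored at the flag itself.
    grep -Hn -e '^[[:space:]]*-fstack-clash-protection' "$dir"/*.cfg
}

# Example:
#   find_stack_clash_flag
```

No output means no uncommented occurrence was found in that directory.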

PF4Public commented 1 year ago

I'll close this issue since the culprit was identified.

@joecool1029 feel free to open a separate issue regarding electron if you're willing to investigate it further.

joecool1029 commented 1 year ago

Sounds good, I'll start to look into the electron issue some more this week.