Open bsclifton opened 5 months ago
When I pulled up one of the crashes, here is what I saw: https://brave.sp.backtrace.io/p/brave/debug?filters=JTVCJTVCJTIyX3J4aWQlMjIlMkMlMjJlcXVhbCUyMiUyQyUyMmU1MGExNDAwLWMwN2ItMGQwYy0wMDAwLTAwMDAwMDAwMDAwMCUyMiU1RCU1RA%3D%3D&fingerprint=299d76a1eefda5bc7a234fd2563e6c6891eb7e572b698dbf6f86bf8ca3850dcc&debug=(%227849be4%22,0,0)
[ 00 ] ne_filter_stats_toggle
[ 01 ] ne_filter_protocol_remove_input_handler
[ 02 ] nw_protocol_boringssl_remove_input_handler
[ 03 ] CFURLProtectionSpaceGetServerTrust
[ 04 ] nw_endpoint_flow_failed
[ 05 ] _dispatch_call_block_and_release
[ 06 ] _dispatch_client_callout
[ 07 ] _dispatch_workloop_invoke
[ 08 ] _dispatch_workloop_worker_thread
[ 09 ] _pthread_wqthread
[ 10 ] start_wqthread
[ 11 ] 0x70000652eb70
[ 12 ] start_wqthread
[ 13 ] base::allocator::dispatcher::internal::DispatcherImpl<base::PoissonAllocationSampler>::AllocZeroInitializedFn(allocator_shim::AllocatorDispatch const*, unsigned long, unsigned long, void*) ( dispatcher_internal.h:153 )
[ 14 ] base::allocator::dispatcher::internal::DispatcherImpl<base::PoissonAllocationSampler>::AllocFn(allocator_shim::AllocatorDispatch const*, unsigned long, void*) ( dispatcher_internal.h:131 )
[ 15 ] base::allocator::dispatcher::internal::DispatcherImpl<base::PoissonAllocationSampler>::AllocFn(allocator_shim::AllocatorDispatch const*, unsigned long, void*) ( dispatcher_internal.h:131 )
[ 16 ] malloc_zone_calloc
[ 17 ] calloc
[ 18 ] _dispatch_kq_poll
[ 19 ] ShimMalloc ( shim_alloc_functions.h:107 )
[ 20 ] allocator_shim::(anonymous namespace)::MallocZoneMalloc(_malloc_zone_t*, unsigned long) ( allocator_shim_override_apple_default_zone.h:145 )
...more in backtrace...
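For triage, it can help to label each frame with the library it most likely comes from, so OS frames stand out from browser frames. The sketch below is only an illustration: `GuessLibrary` is a hypothetical helper, and the prefix table is an assumption based on common Apple symbol naming, not an authoritative mapping.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical triage helper: guesses which library a symbol belongs to
// from its name prefix. The table below is an assumption for illustration
// and is neither exhaustive nor authoritative.
std::string GuessLibrary(const std::string& symbol) {
  static const std::vector<std::pair<std::string, std::string>> kPrefixes = {
      {"ne_", "NetworkExtension"},
      {"nw_", "Network"},
      {"CFURL", "CFNetwork"},
      {"_dispatch_", "libdispatch"},
      {"base::", "Chromium base"},
  };
  for (const auto& entry : kPrefixes) {
    // compare() on the leading characters acts as a "starts with" check.
    if (symbol.compare(0, entry.first.size(), entry.first) == 0) {
      return entry.second;
    }
  }
  return "unknown";
}
```

Applied to the frames above, the first three frames would be labelled as OS networking libraries, while frames such as `start_wqthread` fall through to "unknown".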
cc: @iefremov
It looks like a problem with the installation... CFURLProtectionSpaceGetServerTrust
It looks like macOS is killing the browser.
Have they tried reinstalling? How many complaints do we have?
@bsclifton
I suspect the crash is connected with https://github.com/brave/brave-browser/issues/29406.
1. The top frames belong to NetworkExtension and CFNetwork, which are OS libs.
2. CFURLProtectionSpaceGetServerTrust is a low-level representation of https://developer.apple.com/documentation/foundation/urlprotectionspace/1409926-servertrust. This API checks the validity of an SSL connection.
3. NetworkExtension.h is used in ikev2_connection_api_impl_mac.mm. It also specifies a probeUrl that is passed to the API and will be reached by the OS lib.
In conclusion, it looks like a macOS 10.15.7 bug that is triggered by ~the VPN-on-demand feature~ some browser change (probably the cr126 update).
It's not a browser fault, but we have to ship the workaround, because the crash rate is unaffordable.
The easy way is to disable the feature for macOS <= 10.15.7.
UPD: VPN-on-demand was shipped in v1.65.x, but we don't have a single crash from it. So it's definitely not the reason.
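A version gate of the kind suggested above could be sketched as follows; the same check would apply to whichever feature ends up being gated. This is a minimal illustration, not Brave's actual code: `ParseVersion` and `ShouldDisableOnThisOS` are hypothetical names, and a real implementation would read the OS version from whatever API the codebase already provides rather than from a string.

```cpp
#include <array>
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical helper: parses "major.minor.patch" into three integers.
// Missing components default to 0, so "11.0" parses as {11, 0, 0}.
// Assumes well-formed numeric input; std::stoi throws otherwise.
std::array<int, 3> ParseVersion(const std::string& text) {
  std::array<int, 3> parts{{0, 0, 0}};
  std::istringstream stream(text);
  std::string token;
  for (std::size_t i = 0;
       i < parts.size() && std::getline(stream, token, '.'); ++i) {
    parts[i] = std::stoi(token);
  }
  return parts;
}

// Hypothetical gate: returns true when the OS version is 10.15.7 or older.
// std::array compares lexicographically, which matches version ordering.
bool ShouldDisableOnThisOS(const std::string& os_version) {
  const std::array<int, 3> kLastAffected{{10, 15, 7}};
  return ParseVersion(os_version) <= kLastAffected;
}
```

With this sketch, "10.15.7" and "10.14" are gated, while "11.0.1" is not.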
We have had quite a lot of crashes in the last month: https://share.backtrace.io/api/share/I2CMpTx3MkYQx1VLL4Q1HvU2
Also, only Intel Mac devices are affected.
The first crash happened on https://github.com/brave/brave-browser/releases/v1.67.70, a day after cr125 was merged. There is an old report, https://issues.chromium.org/issues/40834734, that mentions Brave, 3rd-party firewalls (LuLu), and the browser update check (from the about page). The update check is also a candidate for the request that triggered the issue.
The macOS bug is probably https://nvd.nist.gov/vuln/detail/CVE-2020-9996, fixed in macOS Big Sur 11.0.1.
I've checked a few raw crash dumps. Most of them refer to the updater (Sparkle). @mherrmann have we changed anything in the Mac updater in v1.65.x?
@atuchin-m we have not touched Sparkle in a long time.
@atuchin-m @iefremov ~I believe Chromium 125 and 126 had changes where~ Chromium 122 had deleted the keystone implementation. We had to keep this - so we had pulled those patches in. It's possible there's a problem with how that code that we kept gets called.
@cdesouza-chromium and @emerick (and maybe @mkarolin) may know more
Here's an example commit (from a more recent Chromium upgrade): https://github.com/brave/brave-core/pull/23233/commits/632107206cba2471818604c3b576bfbe019aa713 (from https://github.com/brave/brave-core/pull/23233)
Thanks @bsclifton I suppose it's https://github.com/brave/brave-browser/issues/35893
The Keystone implementation was removed from upstream when we upgraded to cr122, so we pulled it into our code base at that time pretty much as-is. We did some subsequent follow-up work in https://github.com/brave/brave-browser/issues/35893 to remove check_includes = false when building those files. Neither of these were intended to change any functionality (they just moved files around, really), though anything is possible.
In cr125, upstream migrated the infobar delegate from Objective-C to C++, which is what https://github.com/brave/brave-core/commit/632107206cba2471818604c3b576bfbe019aa713 is about. We kept it as Objective-C to avoid any risky changes to the upgrade flow and since we're only using the Keystone-related code to hook into Sparkle.
I looked through the commits, but nothing is really leaping out at me as far as causing a crash.
Thanks, @emerick! My bad, it was cr122 😄 Updated above comment
from @atuchin-m:
In conclusion, it looks like a macOS 10.15.7 bug that is triggered by some browser change (probably the cr126 update). It's not a browser fault, but we have to ship the workaround, because the crash rate is unaffordable.
I think you narrowed it down well. Basically, we see the same report from users on Catalina: https://community.brave.com/t/brave-keeps-crashing-and-is-getting-worse/557686
I don't think Chrome/Chromium currently has any restriction in place for Catalina. But Chromium 129 (~September 17th) will officially drop support for Catalina.
Notice that the original report mentions that visiting a specific website (amazon.de) causes a crash. @atuchin-m could the site have specific SSL/TLS properties that make the boringssl client crash while parsing them? I'll try to ask for more information.
Description
User reports:
Another user reports (not sure if related):
Thread on community: https://community.brave.com/t/crashes-with-latest-update/549560/7
Crash reports
21 May 2024
03431700-ce01-fb0b-0000-000000000000
27 May 2024
68091200-293f-040c-0000-000000000000
a27b1000-293f-040c-0000-000000000000
3 June 2024
e50a1400-c07b-0d0c-0000-000000000000
906c1200-c07b-0d0c-0000-000000000000
c1fc1100-c07b-0d0c-0000-000000000000
10 June 2024
46fe1000-f3b3-160c-0000-000000000000
747e1500-f3b3-160c-0000-000000000000
13 June 2024
051b1900-f3b3-160c-0000-000000000000
19c80200-37ee-1f0c-0000-000000000000
813f0400-37ee-1f0c-0000-000000000000
17 June 2024
7fa51000-37ee-1f0c-0000-000000000000
c32f0c00-37ee-1f0c-0000-000000000000
c69a0b00-37ee-1f0c-0000-000000000000