dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

OSX PR build artifacts are missing symbols #99456

Open jkotas opened 7 months ago

jkotas commented 7 months ago

Repro

  1. Download CoreCLRProduct__osx_x64 artifacts from PR build (e.g. https://dev.azure.com/dnceng-public/_apis/resources/Containers/40000142/CoreCLRProduct__osx_x64_release?itemPath=CoreCLRProduct__osx_x64_release%2FCoreCLRProduct__osx_x64_release.tar.gz)

Result

No .dwarf or .dbg files present in the archive.

This makes it impossible to debug OSX crash dumps created by PR test runs.
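For anyone who wants to check an artifact quickly, here is a minimal sketch that lists the symbol files in a downloaded archive. The function name `find_symbol_files` is made up for illustration, and the suffix list only covers the two extensions named above.

```python
import tarfile

# Suffixes this issue expects to see for native debug symbols.
SYMBOL_SUFFIXES = (".dwarf", ".dbg")


def find_symbol_files(archive_path):
    """Return the archive member names that look like native symbol files."""
    # "r:*" lets tarfile detect the compression (gzip for .tar.gz) itself.
    with tarfile.open(archive_path, "r:*") as archive:
        return sorted(
            name for name in archive.getnames()
            if name.endswith(SYMBOL_SUFFIXES)
        )
```

Calling `find_symbol_files("CoreCLRProduct__osx_x64_release.tar.gz")` on an affected artifact should return an empty list.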

dotnet-policy-service[bot] commented 7 months ago

Tagging subscribers to this area: @hoyosjs See info in area-owners.md if you want to be subscribed.

jkotas commented 7 months ago

@BruceForstall Are you aware of this issue?

BruceForstall commented 7 months ago

> @BruceForstall Are you aware of this issue?

I wasn't.

It looks like this behavior was introduced by https://github.com/dotnet/runtime/pull/81387 (cc @kunalspathak @agocke ).

For PR builds only, the build job sets a flag to skip the "strip_symbols" step:

https://github.com/dotnet/runtime/blob/40ac297a1101ac293a1bae01f2e2e1556f7ec017/eng/pipelines/coreclr/templates/build-job.yml#L101-L105

This ends up setting CLR_CMAKE_KEEP_NATIVE_SYMBOLS=true, which causes symbol stripping to be skipped (eng/native/functions.cmake).

It appears this is OK for Linux, where the debug symbols apparently stay in the built executable. It is not OK for Mac, where the built executable only contains a link to the object files that contain the debug symbols. (The --keepsymbols option was specifically introduced to support Linux: https://github.com/dotnet/runtime/pull/39203.)

So, on Mac, we need to always "strip" symbols. "strip" is a bit of a misnomer in this case: it's more like "collect" symbols, to put them into a single .dwarf file (or, preferably, a .dSYM bundle).

A simple fix would be to not pass --keepsymbols to the build-runtime.sh script for Mac.
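As a rough sketch of that fix, the decision could look like the following. This is not the actual pipeline YAML; the helper `build_runtime_args` and the OS names are hypothetical, and only the `--keepsymbols` flag comes from the real script.

```python
def build_runtime_args(target_os, is_pr_build):
    """Return extra build-runtime.sh arguments for a hypothetical build step."""
    args = []
    # On macOS the "strip" step is what collects symbols into a .dwarf file,
    # so it must always run. Other targets may still skip it on PR builds.
    if is_pr_build and target_os != "osx":
        args.append("--keepsymbols")
    return args
```

With this shape, a PR build for "osx" gets no extra flag, while a PR build for "linux" keeps the current behavior.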

I presume the stack trace stuff has never worked on Mac?

Related: https://github.com/dotnet/runtime/issues/92911

hoyosjs commented 7 months ago

We haven't noticed because macOS queues have largely been unable to collect dumps until recently, and even after that, the queues in Helix don't have the symbolizer. https://github.com/dotnet/arcade/issues/11631

janvorli commented 7 months ago

> the queues in Helix don't have the symbolizer

Instead of a symbolizer, macOS has the atos tool. An old note from my personal OneNote has an example:

On macOS:
atos -o artifacts/bin/coreclr/OSX.arm64.Debug/libcoreclr.dylib.dwarf 0x7ac654
EEStartupHelper() (in libcoreclr.dylib.dwarf) (ceemain.cpp:1001)
(use the dwarf file to get the source line)

Or 
atos -o artifacts/bin/coreclr/OSX.arm64.Debug/libcoreclr.dylib.dwarf 0x7ac654 -fullPath
EEStartupHelper() (in libcoreclr.dylib.dwarf) (/Users/janvorli/git/runtime/src/coreclr/vm/ceemain.cpp:1001)

hoyosjs commented 7 months ago

They claim it's installed. I'd have to see if it works. Currently the infrastructure parses the output of that tool. Given this issue, I expect it not to work, since we don't have any logic to locate symbols. atos may be aware of dSYM bundles, which would help, but it would likely still take work. cc @JulieLeeMSFT
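To illustrate what parsing that output might involve, here is a minimal sketch that handles lines shaped like the atos examples quoted earlier in this thread. It is not the existing infrastructure code; the name `parse_atos_line` and the regular expression are assumptions based only on those two sample lines.

```python
import re

# Matches lines such as:
#   EEStartupHelper() (in libcoreclr.dylib.dwarf) (ceemain.cpp:1001)
ATOS_LINE = re.compile(
    r"^(?P<symbol>.+) \(in (?P<module>[^)]+)\) \((?P<file>.+):(?P<line>\d+)\)$"
)


def parse_atos_line(text):
    """Parse one symbolized atos line; return None if it does not match."""
    match = ATOS_LINE.match(text.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["line"] = int(fields["line"])  # line number as an integer
    return fields
```

Any line that does not have this shape, for example a bare address, comes back as None, so a caller can tell when no source information was found.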