dotnet / sdk

Core functionality needed to create .NET Core projects, that is shared between Visual Studio and CLI
https://dot.net/core
MIT License
2.66k stars 1.06k forks source link

Discussion: Change default DebugType to embedded for .NET Core 3 #2679

Open clairernovotny opened 5 years ago

clairernovotny commented 5 years ago

Inspired from this issue and twitter thread https://github.com/dotnet/docs/pull/9110 https://twitter.com/rrelyea/status/1064587440935956482

I want to propose that the default DebugType in the SDK change from portable to embedded. I would also propose that Sourcelink be included/used by default.

At the very least, there should be one property ConfigureOpenSourceDefaults that turns these on as best practices.

My rationale is driven by the pit of success. I believe that out-of-the-box, developers should produce fully debuggable code. This is similar to how JS typically generates/includes sourcemaps by default, so there is precedent.

Few notes:

Some things like Appx/MSIX packaging strip-out pdb files by default. They rely on extra properties (AppxPackageIncludePrivateSymbols) to include them, which I'm sure 99% of people wouldn't know to add. Without them, the stack traces aren't as useful.

Symbol servers may be blocked by corporate firewalls, or otherwise affected by low-bandwidth connections (think airplanes, trains, mobile hotspots, and anywhere else with slower/spotty connections).

I believe that the extra size of the files with embedded pdb's is worth the trade-off for most cases as it presents the best debugging experience without any extra configuration. It just works.

Thanks

NickCraver commented 5 years ago

I can confirm the slow symbol server load is super painful, I have a related dev community issue open on it. If someone is on latest previews which means fewer cache hits, it's even worse. I'm not saying optimize for this, only that the case exists and we hit it often.

nguerrera commented 5 years ago

@tmat

nguerrera commented 5 years ago

IMHO, we should not change the default. It's basically breaking to add 20-30% size. This is not just on disk, but can also impact (virtual) memory usage.

The "very least" option is something we can consider. But I'm not sure about how it would pull in source link. How would we know which provider to use? And even if we did, how would we determine the package version?

clairernovotny commented 5 years ago

For sourcelink, you can add all of them and it'll use the right one. In fact it's the recommended approach for dealing with multiple hosts. Sourcelink could be bundled in the SDK like NuGet, and the others.

I think the idea of an opt-in ConfigureOpenSourceDefaults property gives the SDK a chance to provide actionable best practices that people don't have to think too hard about. That doesn't change the default, but makes it really easy for developers to follow the guidance for the scenario.

NickCraver commented 5 years ago

I want to add another angle here: I maintain an exceptions logging library, we deal with exceptions, and we get issues reported from exceptions. Stack traces are golden. They're incredibly useful. I strongly believe default experience for any app should have stack traces include the file paths and line numbers. For this to happen the deployment platform must be able to read the symbols, and they have to be there. With SourceLink, it even means the exceptions have the git hash in the URL path, pointing directly to the exact code throwing. That's awesome. But only if we get it. Without users having symbols in the first place it's all wasted effort.

That rich experience should be the default. If size is a concern, I propose we focus on stripping what users don't want in a publish phase rather than having options to the detriment of many up front. If someone is concerned about mobile or container sizes, what are the chance we could allow a slimming phase to happen as Xamarin can today?

I don't see how we can make deployment choices for everyone up at the package level. That's a deployment concern and a deployment preference. Why is someone's small footprint deployment priority governing by debugging and error logging experience? The only way to work around this as a package author (as far as I'm aware) is to force the PDB files in the package or to switch to embedded symbols. But both only fulfill me using my own libraries and take the control out of everyone who's trying to do that small deployment and is okay with sacrificing that functionality. For this reason, I really suggest focus be on splitting deployment concerns into another phase than "how packages are hosted".

There's a related issue up there where embedded seems like the best long-term approach, but last we tested .NET Full Framework wouldn't read the symbols and render the data in stack traces. That's something that needs fixing if not already done in the .NET 4.x line.

nguerrera commented 5 years ago

For sourcelink, you can add all of them and it'll use the right one.

How much overhead does that add to build for the ones I'm not using.

In fact it's the recommended approach for dealing with multiple hosts.

Where is this recommended? I can see discussion of supporting mixed submodules, but nothing that says just ref them all.

MeikTranel commented 5 years ago

I would vote for external PDB-Files by default with sourcelink as an optional feature. NuPKGs should contain PDB-Files that can be filtered with the known asset metadata in packagereferences.

<ItemGroup>
    <PackageReference Include="Contoso.Utility.UsefulStuff" Version="3.6.0">
        <ExcludeAssets>Symbols</ExcludeAssets>
    </PackageReference>
</ItemGroup>

Maybe it'd be possible to make it Debug/Release Config dependent whether or not symbols are included.

MeikTranel commented 5 years ago

@nguerrera

For sourcelink, you can add all of them and it'll use the right one.

How much overhead does that add to build for the ones I'm not using.

Not much really. Sourcelink tries to retrieve git remote origin and puts that url through a parser and sees what fits. Getting the current head is also not much effort.

nguerrera commented 5 years ago

In addition to the concern around size increase being essentially breaking, anything that discloses new source information in binary is absolutely breaking.

nguerrera commented 5 years ago

I mean breaking if defaults change.

clairernovotny commented 5 years ago

Opt-in via a single property then, solves the defaults issue.

nguerrera commented 5 years ago

I would vote for external PDB-Files by default with sourcelink as an optional feature

This is status quo for those, yes?

MeikTranel commented 5 years ago

Opt-in via a property then, solves the defaults issue.

Please don't call it "ConfigureOpenSourceDefaults", though. Employers of the world will lose their collective s****, if i start "enabling open source" just to link PDB with private repositories.

clairernovotny commented 5 years ago

This is status quo for those, yes?

Yes, and that's what I'm trying to change, at least with a single property to opt-in to a better stance for OSS libraries.

clairernovotny commented 5 years ago

Please don't call it "ConfigureOpenSourceDefaults", though. Employers of the world will lose their collective s****, if i start "enabling open source" just to link PDB with private repositories.

I'm sure we can come up with a name. I think the key point here is that there are different security needs for closed source vs open source projects. The SDK can have different default guidance and behavior for each based on a single configuration property.

MeikTranel commented 5 years ago

Symbol server significantly slows down VS debug startup as it probes for files

So are we talking just the fact that the PDB-File isn't inside the assembly file instead in a seperate file in the same directory is enough for it to significantly slow down debug startup? Or the process of retrieving PDB files from symbol server feeds?

I feel like this is vital to the discussion and the original post wasn't clear enough about this for me to understand.

JamesNK commented 5 years ago

My hot takes:

Source Link on by default: No. We are looking at this from the perspective of developers who build open source code. Meanwhile the majority of .NET apps are close source.

PDBs embedded in DLL: I like embedding the PDB in the NuGet package alongside the DLL more. It gives the developer the option to not deploy PDBs when publishing then bin directory.

I think the defaults today are fine. There are improvements I'd like to see:

MeikTranel commented 5 years ago

@JamesNK what would be the benefits of having sourcelink in the SDK, if it isn't on by default?

Why put stuff inside Microsoft.Net.Sdk, if this forces you to put Sourcelink under .NET Sdk release cycle, when you gain nothing from it usability-wise?

EDIT: Even worse: You put a formerly nuget restored msbuild sdk, that can referenced at will into an Sdk, which has to be installed manually... Scenario bound to happen: "I want to use sourcelink with fancy new hosted git provider, but it's bundled into the Microsoft.Net.Sdk now, so i now have to wait until Appveyor has .NET Sdk 3.3.700 images..."

JamesNK commented 5 years ago

I mean that configuration to enable Source Link easily with a property would be in the SDK.

Disclaimer: I'm not an SDK expert.

clairernovotny commented 5 years ago

So are we talking just the fact that the PDB-File isn't inside the assembly file instead in a seperate file in the same directory is enough for it to significantly slow down debug startup? Or the process of retrieving PDB files from symbol server feeds?

The process of retrieving PDB files from symbol server is very slow today. I'm not aware of any noticeable debugger related perf issues between embedded pdbs and pdbs next to the file.

clairernovotny commented 5 years ago

@JamesNK Source Link on by default: No. We are looking at this from the perspective of developers who build open source code. Meanwhile the majority of .NET apps are close source.

I guess I'm aiming to introduce a notion of "actionable best practices" that are easy to enable. Different audiences will have different profiles, but I could imagine that there's one set of settings that are oriented around OSS and another that's for internal.

MeikTranel commented 5 years ago

Well there's two main ways of distributing Sdk packages currently:

For sourcelink to work you need the sourcelink msbuild task binaries, which do the job of resolving current head, remote urls and the matching raw source file endpoints. Each vcs provider has its own implementation of resolving origin urls and sourceroots into raw download urls. If you were to bundle this into Microsoft.Net.Sdk (which is an Sdk that will not get restored by the NugetSdkResolver) this means that any change or update to the package has to be distributed by an Sdk that has months between major releases. There's technically nothing wrong with doing so, it's just less flexible.

(Correct me if i'm wrong about stuff, Sdk overloads 😄 )

clairernovotny commented 5 years ago

@MeikTranel depending on the mechanism used, it could be possible to override the SDK's version of the tools with an updated version. Also, the SDK does seem to ship updates on a regular cadence, so it'd get picked up that way too.

MeikTranel commented 5 years ago

@onovotny yeah global.json is your friend here, but pinning versions for Microsoft.Net.Sdk doesn't magically install it. for anyone that doesn't use hosted build agents this means that any requirement to the sdk has to revolve around asking Ops teams to update build images everytime a useful tweak to the sdk comes along. NuGet sdk resolution is 100x more flexible than this.

vancem commented 5 years ago

It would be ironic if just when we can provide 2rd party nuget source server support we decide to turn it off by default. However changing the default so that you use snupkg (rather than symbols.nupkg (you seem to need to set the variable today).

    <SymbolPackageFormat>snupkg</SymbolPackageFormat>

We should at least document the experience as it is today, as well as how we want the default to be, so that they can be compared properly before making a decision.

Finally I note that the issue with delays in visual Studio (or any other debugger), being slow looking up symbols from symbol servers is really a bug in the debugger. Visual studio definally caches the result and is fast (does not hit the network, but can be slow if there are DLL that it COULDN'T find. It really should cache this negative result return quickly in that case as well. Anyway, we should not let bugs in other components drive the design/defaults of roughly unrelated things. We shoudl instead log an issue to get things fixed properly.

livarcocc commented 5 years ago

cc @rrelyea

vchirikov commented 5 years ago

Embedded pdb doesn't show full callstack info in net framework < 4.7.2, it's a only a problem. For netcore I think we can use embedded as default. It's really nice proposal.

aidapsibr commented 5 years ago

IMO new templates on v3 should include the msbuild property to enable all the ins-and-outs of this feature. v2 projects could add the property to opt-in to the behavior without an upgrade.

Suchiman commented 5 years ago

Since every nuget package can use its own setup of debugging configurations (so we'll have embedded pdbs from package a, portable ones from b and none from c) and the primary concern i see here is memory utilization and deployment size, maybe we should add a strip task that may run as part of the Release configuration to remove debug symbols. That way we're free to deploy the symbols however we like without affecting deployment size.

tmat commented 5 years ago

Good discussion. Quite a few points and questions raised here.

First, I’d like to point out that many of these topics are discussed in this document. Some information there is a bit out of date, I’ll update it soon to reflect recent developments.

Let me try to group and answer the questions:

  1. Changing the defaults for OSS projects in the .NET Core SDK

    I want to propose that the default DebugType in the SDK change from portable to embedded.

We can’t change the default. As @nguerrera mentioned this would be a breaking change. I don't think we should change the default for v3 projects either. That said …

I think the idea of an opt-in ConfigureOpenSourceDefaults property gives the SDK a chance to provide actionable best practices that people don't have to think too hard about. That doesn't change the default, but makes it really easy for developers to follow the guidance for the scenario.

How about creating a package, something like “ProjectBuildPolicies.OpenSource.nupkg” that does what you’re asking for – it sets various properties to values that we deem to be good for OSS projects. No need for setting any property – just referencing the package from a project would opt into OSS settings.

  1. Differences between PDBs embedded to DLLs, Portable PDBs included in NuGet packages, Portable PDBs on symbol server, reasoning and guidelines

The difference is in size of DLLs and packages. For some customers and in some scenarios the package size matters, for others it does not. Sometimes the DLL size matters, sometimes it does not.

Why might package size matter? Right now package restore for dotnet/roslyn repository downloads 357 packages of total size 520MB and these take 2.6 GB in NuGet cache. Increasing that means that every restore on a clean machine is slower and needs to move more data around. Why make the restore slower and take more space on disk by downloading symbols for all packages when they will not likely be used? Why should a developer wait longer for restore when they might not even need to debug thru 3rd party code?

Similarly to DLL size. For some trivial utilities the extra size of the .exe/dll doesn’t matter. On the other hand, consider Visual Studio. It may load hundreds of assemblies to devenv.exe process. Since this is a 32-bit process the address space is tight. We can’t afford increasing the size of loaded DLLs unnecessarily. Therefore Embedded PDBs are not an option for anything that ships in VS.

We designed the NuGet symbol acquisition to be pay-for-play. Many customers don’t ever want to step to 3rd party library. There is even a VS debugger option “Just My Code” specifically designed for these customers. The symbols should be available only when they are needed. There are some gaps and inefficiencies in the symbol acquisition tooling (see [5] below). These issues can and should be fixed in such a way that does not regress other scenarios.

As for the guidelines - I don’t think there is a silver bullet approach that works for all cases. It really depends on what the DLLs and PDBs are used for. I’d suggest:

to use Embedded PDBs (DebugType = embedded)

to use Portable PDBs packaged in .snupkg

stop adding Portable PDBs to .nuget packages

  1. Enable Source Link by default, or easier to enable

I would also propose that Sourcelink be included/used by default.

For sourcelink, you can add all of them and it'll use the right one. In fact it's the recommended approach for dealing with multiple hosts. Sourcelink could be bundled in the SDK like NuGet, and the others.

Unfortunately, this is not possible. In general the user needs to specify what source control provider is hosting the repository. It does so by selecting the right Source Link package (Microsoft.SourceLink.GitHub, Microsoft.SourceLink.GitLab, etc.) For some providers, like GitHub.com or Azure DevOps, it is possible to infer the provider from the remote origin URL since these have well-known domains. However, for on-prem hosted providers (e.g. GitHub Enterprise, GitLab, TFS, etc.) there is no way Source Link can determine the provider from the domain. In these cases a Source Link package referenced by the project checks whether it is the only Source Link package referenced by the project. If so it assumes that the domain belongs to the source control provider it supports. If multiple Source Link packages are referenced then the project needs to specify domains that should be handled by each of them. Besides that, some providers always require a bit of configuration – e.g. projects in TFS need to specify server virtual directory as it is impossible to infer from the URL.

There are improvements I'd like to see: Source Link being easier. Off by default but easy to enable.

What specific improvements do you want to see? Source Link packages are explicitly designed to be as simple to use as possible.

For the most common scenario enabling Source Link is done by referencing the package corresponding to the source control provider that hosts the repository, with no other configuration needed: https://github.com/dotnet/sourcelink#using-sourcelink.

  1. Including/excluding symbols when deploying apps

That's a deployment concern and a deployment preference. Why is someone's small footprint deployment priority governing by debugging and error logging experience?

maybe we should add a strip task that may run as part of the Release configuration to remove debug symbols

Some things like Appx/MSIX packaging strip-out pdb files by default. They rely on extra properties (AppxPackageIncludePrivateSymbols) to include them,

Sounds like some tools already strip them. I agree that the developer should have the option to include/exclude all symbols when deploying their app. Maybe what is missing is a unified approach (setting) that works across all tools that produce deployment artifacts (Appx, MSIX, .NET Native, IL Linker, etc.)

Some .NET tools, like .NET Native do not support symbol server; the symbols need to be alongside the dll

Seems like something that can and should be fixed. I’d recommend filing a separate issue.

  1. Symbol acquisition

Symbol server significantly slows down VS debug startup as it probes for files

Definitely agree with @vancem that this a bug that needs to be fixed and there is no reason to change the design of PDB delivery via NuGet.

Symbol servers may be blocked by corporate firewalls, or otherwise affected by low-bandwidth connections (think airplanes, trains, mobile hotspots, and anywhere else with slower/spotty connections).

Is the firewall blocking symbol servers an issue for symbols.nuget.org? If a firewall is blocking nuget.org then you won’t be able to get packages in the first place.

I agree that symbol acquisition can be improved to accommodate the offline scenario. Today you can run nuget restore to fetch all packages from the server and then work offline. I think we need a similar tool to acquire all symbols you’ll need to debug your project and its dependencies. An msbuild task or a global tool that determines the dependencies and downloads their symbols for specified platform/runtime. Similarly it can take a memory dump file and fetch symbols for all binaries in the dump. Similarly to nuget restore, which uses nuget cache, the tool would populate the local symbol cache debuggers uses to avoid downloading symbols multiple times. This would thus not require any changes in debuggers. They would just consult the cache as they do already. Although, perhaps the debuggers (or even VS and nuget?) might need an easy to use offline mode switch to avoid attempting to download symbols that are not in the cache.

  1. Support for Portable PDBs in .NET Framework runtime stack traces

.NET Full Framework wouldn't read the symbols and render the data in stack traces. That's something that needs fixing if not already done in the .NET 4.x line.

Portable PDBs (and embedded PDBs) work on 4.7.2. They are not enabled in stack traces by default due to back-compat concerns. A configuration switch is needed.

roji commented 5 years ago

Note previous similar discussion in https://github.com/aspnet/Universe/issues/131.

poke commented 5 years ago

As a followup to aspnet/EntityFrameworkCore#15316:

It sounds like what we then would need would be a way to publish an application while including symbols for all packaged assemblies. Like some dotnet publish --include-symbols that would – if necessary – load the symbols from the symbol server so that they can be distributed with the application.

/cc @SimonCropp

tmat commented 5 years ago

/cc @nguerrera @livarcocc @rrelyea @anangaur

livarcocc commented 5 years ago

@tmat it is not clear to me at this point what exactly this bug is tracking... Can you help clarify that?

tmat commented 5 years ago

I think we should create a separate issue for https://github.com/dotnet/sdk/issues/2679#issuecomment-485137716 and close this one.

SimonCropp commented 4 years ago

given netcore3 has sailed. should this be renamed? or closed?

SimonCropp commented 4 years ago

@nguerrera can u point tot the doco that explains how embedded symbols impact impact virtual memory usage? my understanding was the runtime was smart enough to ignore that part of the assembly until it was needed?

https://github.com/dotnet/sdk/issues/2679#issuecomment-440030138

nietras commented 2 years ago

I have a bit of a small somewhat outside of scope question here regarding embedded pdbs. I like the benefits of embedding the pdb, but in CI (e.g. Azure Pipeline PublishSymbols@2 task) we are also using source indexing which requires (as far as I can tell) the pdb files to be available as separate files, which they are not when embedding, how do you solve this then?

KalleOlaviNiemitalo commented 2 years ago

Isn't source indexing (with pdbstr.exe from Debugging Tools for Windows) of managed code pretty much obsoleted by Source Link?

nietras commented 2 years ago

@kalleolaviniemitalo that may very well be. Source link requires coupling your project to a specific package for a specific cloud provider though, doesn't it? And we also publish indexing to a network drive.

RehanSaeed commented 2 years ago

DebugType=embedded vs .snupkg

The NuGet docs promote the use of .snupkg symbol packages because quote:

Symbol packages (*.snupkg) provide developers a good on-demand debugging experience without bloating the main package size and impacting restore performance for those who don't intend to debug the NuGet package.

However, they also suggest VS needs updating with the correct symbol server URL. Why can't VS be preconfigured to do this for us? I'd hate to add 30% to the dotnet restore time for every developer if .snupkg files can be made to work equally well. I'd like to get a clearer answer on why .snupkg's are problematic.

Better Defaults

DotNet.ReproducibleBuilds is a fairly new (not very well publicised yet) package that sets several defaults for NuGet packages, including source link and DebugType=embedded. I think it does a good job of making this easier, although I'm not sure embedded is the right default to go with yet.

Another approach might be to introduce a new SDK that does all of the above and a new project template dotnet new nuget. It might be nice if this SDK also return errors if the correct package metadata was not setup correctly, making it easier for devs to create NuGet packages without having to resort to looking up NuGet documentation.

<Project Sdk="Microsoft.NET.Sdk.NuGet">
</Project>
alefranz commented 2 years ago

Given symbols package are not really a thing outside of nuget.org, as most (all?) the private feeds don't support them, I think this should be reconsidered again.

While it has minor disadvantages to use embedded symbols for public packages on nuget.org (I believe there use to be a .NET design document expanding on this downsides but can't find it anymore), I believe changing the default would not have a major impact as proper documentation can help to clarify the recommendation which I am sure will be followed by the popular packages which are the ones that could have impact in the ecosystem.

When using private feeds it is currently not easy to understand how to properly setup packages to have debug symbols as well as to get SourceLink to fully work. There has been major development by the VS team for SourceLink recently (navigate to definition yay!) but I think what's missing to make it really mainstream is the ease of setup - it should "just work" out of the box. Focusing on the current issue, we should at least make debugging dependencies work out of the box. .NET used to have a reputation to have a brilliant debugging experience, but now when splitting out systems in multiple packages, it is no longer so unless you go through the "pain" of figuring out how to set it up.

Since .NET 3.0 it is no longer possible to include pdbs inside the main nuget package (see #1458 from 2018) and while there has been some traction recently on the NuGet side, it is unclear if it will ever been addressed. So the only viable solution when using private feeds is embedded which works but, while it is easier to discover than the previous method, it still require somebody to actively look for it, which you may not if you are not aware that debugging dependencies is actually a thing in .NET (🤯, I know!).

tl;dr; Method NuGet.org Private Feed
Symbols package 🟢
Embedded 🟢 with some downsides 🟢

we should have a default that is "good enough" for all scenarios

nx-s commented 5 months ago

I know, the discussion is already pretty old but as it is still open and as I was wondering about it .. :)

I don't see a benefit for changing the DebugType from "portable" to "embedded" in any case.

Symbols and sources are only needed with diagnostics tools, like Debuggers or for stack traces on crashes . Means, they are required only for a specific purpose in case of a concrete "Issues". For most of the time that an application is working, they are not needed and for this reason, symbols should not be embedded at all.

So, all in all, I only see drawbacks by embedding symbols.

slang25 commented 5 months ago

Let's say you have a web server running in production, you do want symbols here. It used to be a common myth that you didn't want symbols for production, but this was mostly confusion.

While you typically don't want to output the full stacktrace to users, this is a different thing from if you want to ship symbols. Symbols will give you better stacktraces, which you'll be collecting privately.

Let's say you don't want symbols in for example a command line tool you are distributing, this is a publishing concern. You'd want to say "strip symbols please", and remember, symbols can be embedded by other library authors, so how else would you opt-out?

Embedding symbols reduce friction at development time, and for enterprises make things easier as they don't need to think about "does my privately hosted commercial nuget server support snupkgs", everything will "just work".

Separating these into 2 packages just makes the acquisition lazy, which I get can save bandwidth over time, but is a really odd tradeoff IMO.

Overall, there's no technical merit in using snupkgs over embedded symbols.

SimonCropp commented 5 months ago

Let's say you don't want symbols in for example a command line tool you are distributing, this is a publishing concern. You'd want to say "strip symbols please", and remember, symbols can be embedded by other library authors, so how else would you opt-out?

side note: u can opt in to trimming symbols if you really dont want them https://learn.microsoft.com/en-us/dotnet/core/deploying/trimming/trimming-options?pivots=dotnet-8-0#remove-symbols

nx-s commented 5 months ago

Let's say you have a web server running in production, you do want symbols here. It used to be a common myth that you didn't want symbols for production, but this was mostly confusion.

While you typically don't want to output the full stacktrace to users, this is a different thing from if you want to ship symbols. Symbols will give you better stacktraces, which you'll be collecting privately.

Let's say you don't want symbols in for example a command line tool you are distributing, this is a publishing concern. You'd want to say "strip symbols please", and remember, symbols can be embedded by other library authors, so how else would you opt-out?

The discussion is not about "if you want or not want symbols". I think, it is totally clear why symbols are needed. The discussion is about how to make them available in case you are in need for them.

Overall, there's no technical merit in using snupkgs over embedded symbols

IMO, it is the contrary (also revisit my arguments from above).

If you want to have a private NuGet package server, you have to set it up anyway and enabling symbol package support is very little effort. If you do not need a private package server, you can also make your symbol packages publicly available, e.g. on nuget.org.

For pushing snupkgs to the server from some client, e.g. a build agent, there is not a difference at all. If there is a symbol package next to the NuGet package, it will be automatically pushed to the server, too.

slang25 commented 5 months ago

The discussion is not about "if you want or not want symbols". I think, it is totally clear why symbols are needed.

I was addressing your opening remark which says that symbols are useful for diagnostic tools. I think this might be a critical part of the discussion, because it's easy to draw different conclusions depending on what you think we should be optimizing for.

My stance is that symbols are the norm, you want them at development, you want them at test, you want them in production. It's the exception to remove them.

With this as a starting point, it's not logical to have them separate. Yes you can, but it is adding unnecessary moving parts that only benefit folks who have a scenario where they want to strip symbols. This is odd to me.

nx-s commented 5 months ago

@slang25 okay, I do understand now what you wanted to point out.

  1. While I am with you that we want symbols in production, too, I also see reasons to not embed them by default. Imagine a productive scenario which require a high number of replicas (e.g. in k8s -- which is not a minor edge case, Imo) and with a high number of external dependencies/libs. With embedded symbols you add to each such DLL an extra of 20-30% to its size which needs to be loaded into main memory. That quickly can sum up overall instances.

Yes, on a crash/unhandled exception (and only then) u either have to be prepared to automatically generate minidumps for being able to make immediate sense of a stack trace or you lose the convenience to get all valuable information from the stack at the time of the crash. Still, you can diagnose all libs from production afterwards by using a symbol server, i.e. during analyzing the minidump or by reproducing the error with the very same libs + symbols.

There is also no difference with embedded vs. symbol server if you really want and can access your application on production for debugging/profiling at runtime.

  1. Also during development, when building your application, embedded symbols causes build agents to run longer (as potentially many, now bigger DLLs need to be downloaded), and that may happen with each commit on each built (depending on the setup).

As it makes no difference from usage point of view, if symbols are embedded or automatically taken once from a symbol server when required, Imo it is preferable to not embed symbols by default as you do not lose much convenience but instead can achieve some gain.

However, I also see scenarios where you want the stack trace information immediately from production and you don't care about some little extra resource usage of a single application. In this case, embedding symbols can be more convenient.

What should be the default behavior seems to me, now, highly subjective :)

bruno-garcia commented 5 months ago

I'm very bias here as I work for Sentry. But I was in favor of embedded by default for the longest time, and now I'm not. Here's why:

In development, IDEs already do a great job at fetching debug files when needed. But you also want to get production errors that are as actionable as possible. This means at least line numbers and function names. Seems like everyone agrees with that.

We all agreed there are reasons not to include the debug data directly in DLL, like disk space, downloading extra bytes, serverless cold start (download size again) etc mentioned above.

But it turns out all you need is to make sure your error/crash reports include the metadata needed in order to fetch symbols after the fact. And reconstruct the stack trace with line numbers.

exception.ToString() is not your friend here. You need to add extra data.

At Sentry you can either upload your PDBs, or Sentry will fetch them from a symbol server (your own, or some built-in ones. Including nuget.org). So your app doesn't need to have the PDBs in production and you still get good looking stack traces.

image

image

If source_link is around Sentry can even fetch parts of the code and include in the stack trace to give more context about what's happening. And also link to the line of code in GitHub. That works with C# exceptions on Windows, macOS, Linux, Android, iOS and Mac Catalyst (last two only with NativeAOT, not MonoAOT). And on Native AOT it also includes hard crashes done through P/Invoke into native code which is what folks expect in native code anyway, no one bundles debug files for native apps because they are huge.

image

In summary now I think we should just default to spit out DLL + PDB and make sure we publish nupkg + snupkg. And hook up a tool that gives gives you the context you need to figure stuff out.

Much easier to parse than a raw logs: image