dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License

Portable Data Contract-based DAC in .NET 9+ #99298

Open lambdageek opened 6 months ago

lambdageek commented 6 months ago

Summary

The DAC is a .NET runtime component responsible for helping debuggers and other diagnostic tools make sense of the memory of a .NET runtime process. In the current design, each instance of the .NET runtime is tightly coupled to a DAC: a runtime running on a 32-bit little-endian architecture can only be accessed by a 32-bit little-endian DAC, a 64-bit runtime by a 64-bit DAC, etc. Although the current runtime data structures are carefully designed so that a DAC hosted on Windows can make sense of a Linux or macOS .NET runtime, this is fragile and hard to maintain. This is because the current DAC implementation is actually a special build of the .NET runtime that uses C++ smart pointers to hide the fact that it is not accessing the memory of the host process, but rather the memory of a remote debuggee .NET process.

Additionally, the current DAC is not version-resilient. When a new version of the .NET runtime is released, it ships with a DAC that can access that version's memory and nothing else. This means that debuggers need to be able to locate, load and run a DAC corresponding to each runtime version. While this might be possible for official releases, it may create complications for source-built .NET runtimes, for unsupported/community-supported platforms, etc.

Portable Data Contract-based DAC

An alternate approach is for the .NET runtime to include a data stream that encodes information about itself: the size and endianness of machine words, the size and offsets of important fields in interesting runtime data structures, the locations of globals that are relevant for diagnostic tools.
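As a rough illustration of what such a self-description makes possible, the sketch below models a descriptor and uses it to decode pointers from a target's raw memory. The `DataDescriptor` shape, the `Thread` type, its `exception` field and the `g_thread` global are all hypothetical names chosen for the example; they do not reflect the actual data stream format.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataDescriptor:
    """Hypothetical self-description published by a runtime (not the real format)."""
    pointer_size: int        # bytes per machine word: 4 or 8
    little_endian: bool      # byte order of the target
    field_offsets: dict      # (type name, field name) -> offset in bytes
    global_addresses: dict   # global name -> address in the target


def read_pointer(memory: bytes, address: int, desc: DataDescriptor) -> int:
    """Decode one pointer using the target's word size and byte order."""
    raw = memory[address:address + desc.pointer_size]
    if len(raw) != desc.pointer_size:
        raise ValueError("read past the end of target memory")
    return int.from_bytes(raw, "little" if desc.little_endian else "big")


def read_pointer_field(memory, obj_address, type_name, field_name, desc):
    """Read a pointer-sized field by looking up its offset in the descriptor."""
    offset = desc.field_offsets[(type_name, field_name)]
    return read_pointer(memory, obj_address + offset, desc)


# A made-up 32-bit big-endian target: a global at address 0 points to a
# Thread object at address 8, whose 'exception' field lives at offset 4.
memory = bytes.fromhex("00000008" "00000000" "00000000" "00001234")
desc = DataDescriptor(
    pointer_size=4,
    little_endian=False,
    field_offsets={("Thread", "exception"): 4},
    global_addresses={"g_thread": 0},
)

thread_address = read_pointer(memory, desc.global_addresses["g_thread"], desc)
exception_address = read_pointer_field(memory, thread_address, "Thread", "exception", desc)
```

The reader never hard-codes a word size, byte order or field offset, so the same code could run on a 64-bit little-endian host while inspecting this 32-bit big-endian target.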

With access to a data stream, the DAC can be a separate implementation that abstracts over the details of a particular version of the runtime and implements diagnostic tooling on top of an abstract data contract. (This is abbreviated cDAC.)
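To make the idea of abstracting over runtime versions concrete, here is a minimal sketch in which one contract implementation answers the same question against two targets whose layouts differ. The `ExceptionContract` class, the `Exception`/`hresult` names, the offsets and the sample value are all assumptions made for illustration, not the real cDAC contracts.

```python
from typing import Callable, Dict, Tuple


def make_u32_reader(memory: bytes) -> Callable[[int], int]:
    """Return a function that reads a little-endian 32-bit value from 'memory'."""
    def read_u32(address: int) -> int:
        raw = memory[address:address + 4]
        if len(raw) != 4:
            raise ValueError("read past the end of target memory")
        return int.from_bytes(raw, "little")
    return read_u32


class ExceptionContract:
    """Hypothetical contract: 'what is the HResult of the exception at this address?'"""

    def __init__(self, field_offsets: Dict[Tuple[str, str], int],
                 read_u32: Callable[[int], int]):
        # The offset comes from the target's own description, not from the tool.
        self._hresult_offset = field_offsets[("Exception", "hresult")]
        self._read_u32 = read_u32

    def get_hresult(self, address: int) -> int:
        return self._read_u32(address + self._hresult_offset)


SAMPLE_HRESULT = 0x80131500  # arbitrary sample value for the example

# Two made-up runtime versions that place the same field at different offsets.
memory_a = SAMPLE_HRESULT.to_bytes(4, "little") + bytes(8)
offsets_a = {("Exception", "hresult"): 0}

memory_b = bytes(8) + SAMPLE_HRESULT.to_bytes(4, "little")
offsets_b = {("Exception", "hresult"): 8}

hresult_a = ExceptionContract(offsets_a, make_u32_reader(memory_a)).get_hresult(0)
hresult_b = ExceptionContract(offsets_b, make_u32_reader(memory_b)).get_hresult(0)
```

Both calls return the same value even though the field moved, which is the property that would let a single tool work across runtime versions.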

.NET 10 Plan

In .NET 10 we will begin to move the DAC toward a data-contract-based approach by enabling the existing DAC to delegate some operations to the cDAC. Our initial focus will be the SOS tool used by the .NET LLDB plugin and by WinDbg. Our initial milestone will be the implementation of the !PrintException SOS command.
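The delegation described here can be pictured as a dispatcher that prefers the cDAC for operations it already implements and falls back to the existing DAC otherwise. The sketch below shows only that shape; the class names, method names and return strings are hypothetical and do not correspond to the real DAC or SOS interfaces.

```python
class LegacyDac:
    """Stand-in for the existing DAC, which implements every operation."""

    def print_exception(self, address: int) -> str:
        return f"legacy: exception at {address:#x}"

    def dump_heap(self) -> str:
        return "legacy: heap summary"


class Cdac:
    """Stand-in for the cDAC, which initially implements only a few operations."""

    def print_exception(self, address: int) -> str:
        return f"cdac: exception at {address:#x}"


class DelegatingDac:
    """Routes each operation to the cDAC when available, else to the legacy DAC."""

    def __init__(self, legacy: LegacyDac, cdac: Cdac):
        self._legacy = legacy
        self._cdac = cdac

    def _dispatch(self, name: str, *args):
        # Prefer the cDAC implementation; fall back if it does not exist yet.
        impl = getattr(self._cdac, name, None)
        if impl is None:
            impl = getattr(self._legacy, name)
        return impl(*args)

    def print_exception(self, address: int) -> str:
        return self._dispatch("print_exception", address)

    def dump_heap(self) -> str:
        return self._dispatch("dump_heap")


dac = DelegatingDac(LegacyDac(), Cdac())
exception_text = dac.print_exception(0x1000)  # served by the cDAC
heap_text = dac.dump_heap()                   # falls back to the legacy DAC
```

With this arrangement, operations can move to the cDAC one at a time without callers noticing which implementation served a given request.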

A goal for the cDAC implementation is to maintain backward and cross-platform compatibility with each release: each future release of the cDAC will be able to access every .NET runtime version >= .NET 9. We do not intend to support .NET <= 8 or .NET Framework using the cDAC.

The focus for .NET 9 is to host the DAC on 64-bit desktop platforms - Linux (glibc), macOS and Windows - but to be able to access a debuggee or crash dump from 32-bit and 64-bit processes on all supported (and ideally community-supported) platforms (including musl-based Linux, arm32, win-x86, etc.).

First milestone

The first milestone is a working !PrintException command based on the cDAC in WinDbg/SOS.

Post-.NET 9 work

The overall goal is to implement a subset of the ISOSDacInterfaceNN IDL interfaces in sospriv.idl via the cDAC.

Other closed issues

.NET 9 backports

We maintain a branch feature/9.0-cdac-backports that has selected data descriptor and contract changes necessary for the cdacreader to interrogate a net9.0 runtime. The backports to this branch are tracked on https://github.com/dotnet/runtime/issues/99302.

Future work

The initial cDAC plan will not reduce the complexity of the current design. However, once the existing DAC delegates all commands to the cDAC, we may simplify the packaging and distribution of the cDAC, allowing diagnostic tools to obtain a single library that works with all versions of CoreCLR >= .NET 9, as well as other .NET runtimes including NativeAOT and Mono.

Special attention needs to be paid to the crashdump tool. This ships with CoreCLR and NativeAOT and runs on the same host as the .NET runtime. It uses a subset of the DAC in order to save a copy

ghost commented 6 months ago

Tagging subscribers to this area: @tommcdon See info in area-owners.md if you want to be subscribed.

Issue Details
# Summary

The [DAC](https://github.com/dotnet/runtime/blob/main/docs/design/coreclr/botr/dac-notes.md) is a .NET runtime component responsible for helping debuggers and other diagnostic tools make sense of the memory of a .NET runtime process. In the current design, each instance of the .NET runtime is tightly coupled to a DAC: a runtime running on a 32-bit LE architecture can only be accessed by a 32-bit LE DAC, a 64-bit runtime by a 64-bit DAC, etc. Although the current runtime data structures are carefully designed so that a DAC hosted on Windows can make sense of a Linux or macOS .NET runtime, this is fragile and hard to maintain. This is because the current DAC implementation is actually a special build of the .NET runtime that uses C++ smart pointers to hide the fact that it is not accessing the memory of the host process, but rather the memory of a remote debuggee .NET process.

Additionally, the current DAC is not version-resilient. When a new version of the .NET runtime is released, it ships with a DAC that can access that version's memory and nothing else. This means that debuggers need to be able to locate, load and run a DAC corresponding to each runtime version. While this might be possible for official releases, it may create complications for source-built .NET runtimes, for unsupported/community-supported platforms, etc.

# Portable Data Contract-based DAC

An alternate approach is for the .NET runtime to include a *data stream* that encodes information about itself: the size and endianness of machine words, the size and offsets of important fields in interesting runtime data structures, the locations of globals that are relevant for diagnostic tools.

With access to a data stream, the DAC can be a separate implementation that abstracts over the details of a particular version of the runtime and implements diagnostic tooling on top of an abstract *data contract*. (This is abbreviated *cDAC*.)

# .NET 9 Plan

In .NET 9 we will begin to move the DAC toward a data-contract-based approach by enabling the existing DAC to delegate some operations to the cDAC. Our initial focus will be the [SOS](https://github.com/dotnet/diagnostics/blob/main/documentation/sos_printexception_walkthrough.md) tool used by the .NET *LLDB* plugin and by WinDbg. Our initial milestone will be the implementation of the `!PrintException` SOS command.

A goal for the cDAC implementation is to maintain backward and cross-platform compatibility with each release: each future release of the cDAC will be able to access every .NET runtime version >= .NET 9. We do not intend to support .NET <= 8 or .NET Framework using the cDAC.

The focus for .NET 9 is to host the DAC on 64-bit desktop platforms - Linux (glibc), macOS and Windows - but to be able to access a debuggee or crash dump from 32-bit and 64-bit processes on all supported (and ideally community-supported) platforms (including musl-based Linux, arm32, win-x86, etc.).

- [ ] Publish data stream spec
- [ ] Implement basic data stream reader and writer in C
- [ ] Enable the DAC to delegate operations to the cDAC
- [ ] Implement `!PrintException`
- [ ] Implement additional commands (TBD)

# Future plans

The initial cDAC plan will not reduce the complexity of the current design. However, once the existing DAC delegates all commands to the cDAC, we may simplify the packaging and distribution of the cDAC, allowing diagnostic tools to obtain a single library that works with all versions of CoreCLR >= .NET 9, as well as other .NET runtimes including NativeAOT and Mono.

Special attention needs to be paid to the `crashdump` tool. This ships with CoreCLR and NativeAOT and runs on the same host as the .NET runtime. It uses a subset of the DAC in order to save a copy

- [ ] Full cDAC parity
- [ ] Publish cDAC as a separate nuget
- [ ] cDAC support for NativeAOT
- [ ] cDAC support for Mono
- [ ] Implement a simplified cDAC for `crashdump`

Author: lambdageek
Assignees: lambdageek, elinor-fung
Labels: `area-Diagnostics-coreclr`
Milestone: 9.0.0
lambdageek commented 6 months ago

/cc @AaronRobinsonMSFT @steveisok @davidwrighton @mikem8361 @noahfalk