Closed mattjohnsonpint closed 1 year ago
The code that resolves the assembly debug information is here:
I presume that we're not able to read the assembly directly from a file when that file is a single-file executable. We need to figure out how to get a PEReader in that environment.
If necessary, we can make another AssemblyReader implementation, similar to AndroidAssemblyReader, but it also might be possible just to do this directly in TryReadAssembly.
Note from: https://learn.microsoft.com/dotnet/core/deploying/single-file/overview?tabs=cli#api-incompatibility
Module.FullyQualifiedName - Returns a string with the value of <Unknown> or throws an exception.
Thus, we'll need to figure out first how to get something more usable directly from the Module passed in to GetDebugImage.
I'm getting slightly different behaviour than what's described in the issue details. When I run:
dotnet publish -c Release -p:PublishSingleFile=true
...the build does not succeed and there is nothing in ./bin/Release/ that could be run.
I get approximately 10 separate errors from the compiler, each of which is an instance of one of the following 3 errors:
error IL3002: Using member 'System.Reflection.Module.Name' which has 'RequiresAssemblyFilesAttribute' can break functionality when embedded in a single-file app. Returns <Unknown> for modules with no file path.
error IL3002: Using member 'System.Reflection.Module.FullyQualifiedName' which has 'RequiresAssemblyFilesAttribute' can break functionality when embedded in a single-file app. Returns <Unknown> for modules with no file path.
error IL3000: 'System.Reflection.Assembly.Location' always returns an empty string for assemblies embedded in a single-file app. If the path to the app directory is needed, consider calling 'System.AppContext.BaseDirectory'.
Some of those errors come from Ben.Demystifier/TypeNameHelper.cs, Ben.Demystifier/ResolvedParameter.cs and Ben.Demystifier/Internal/PortablePdbReader.cs.
Others come from Sentry/PlatformAbstractions/RuntimeInfo.cs.
Finally, some come from Sentry/Internal/DebugStackTrace.cs.
So it seems we'd need to leverage alternatives to those methods in multiple places in our codebase to fix this.
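As a rough sketch of what such alternatives might look like: AppContext.BaseDirectory is the replacement that the IL3000 message itself suggests, while using the assembly's simple name in place of Module.Name is only an assumption about what the calling code needs. The class and method names below are hypothetical and are not part of the Sentry codebase.

```csharp
using System;
using System.Reflection;

// Hypothetical helpers; the names are illustrative only.
public static class SingleFileSafe
{
    // The IL3000 message recommends AppContext.BaseDirectory when only the
    // application directory is needed (instead of Assembly.Location).
    public static string GetAppDirectory() => AppContext.BaseDirectory;

    // Assumption: the assembly's simple name is an acceptable stand-in
    // wherever Module.Name was only used as a display name.
    public static string GetModuleDisplayName(Module module) =>
        module.Assembly.GetName().Name ?? string.Empty;
}
```

Neither helper yields a file path for the module, so this would not help with the places that actually need to open the assembly.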
One possible solution described here:
... use AssemblyExtensions.TryGetRawMetadata. This method returns just the metadata blob, not the whole assembly. It works well for System.Reflection.Metadata reader and https://github.com/dotnet/runtime/issues/36590#issuecomment-688030287. I am not sure whether it works for Cecil.
Interesting. The errors make sense, but I wonder why I didn't get them and you did. But anyway, they make sense.
Great find on AssemblyExtensions.TryGetRawMetadata. That sounds super promising. In theory, that approach could relieve us from needing to read the assembly from the file in all cases. Please explore that a bit. Thanks.
... though looking closer, I can't tell whether the values we need are actually in that metadata. One can get a MetadataReader from a PEReader, but not the opposite direction, so unless the PEHeader, CoffHeader, and DebugDirectoryEntries are available in the metadata somewhere I'm not seeing, then I don't think that will work.
The Sentry SDK uses reflection to capture information about the stack trace:
Some of that reflection code currently relies on a file path to the assembly for the module being reflected on. This is problematic when the application is published as a single-file executable, because the assembly is embedded in the executable and does not have any file path.
Ultimately our code is trying to assemble the following:
var debugImage = new DebugImage
{
    Type             // "pe_dotnet"
    CodeId           // $"{headers.CoffHeader.TimeDateStamp:X8}{peHeader.SizeOfImage:x}"
    CodeFile         // module.FullyQualifiedName
    DebugId          // $"{codeView.Guid}-{entry.Stamp:x8}" or $"{codeView.Guid}-{codeView.Age}"
    DebugChecksum    // $"{checksum.AlgorithmName}:{checksumHex}"
    DebugFile        // peReader.ReadCodeViewDebugDirectoryData(entry).Path
    ModuleVersionId  // module.ModuleVersionId
};
All of this is presumably just information to help Sentry track down the appropriate debug symbols, when the debug image is uploaded to Sentry (along with the rest of the exception information).
Much of this information comes from the PEReader.PEHeaders... and it doesn't appear to be available anywhere else.
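To make the dependency on the PE headers concrete, here is a minimal sketch of how the CodeId and DebugId values listed above could be derived from a PEReader over any readable, seekable stream. The class name is hypothetical, and choosing between the Stamp-based and Age-based DebugId via IsPortableCodeView is an assumption about when each format applies, not a statement of what the SDK does.

```csharp
using System;
using System.IO;
using System.Reflection.PortableExecutable;

// Hypothetical helper; not part of the Sentry codebase.
public static class DebugImageSketch
{
    // Returns the code id and debug id for a PE image.
    // DebugId is null when the image has no CodeView debug directory entry.
    public static (string CodeId, string DebugId) ReadIds(Stream peStream)
    {
        // LeaveOpen: the caller keeps ownership of the stream.
        using var peReader = new PEReader(peStream, PEStreamOptions.LeaveOpen);
        var headers = peReader.PEHeaders;

        string codeId = headers.PEHeader is { } peHeader
            ? $"{headers.CoffHeader.TimeDateStamp:X8}{peHeader.SizeOfImage:x}"
            : null;

        string debugId = null;
        foreach (var entry in peReader.ReadDebugDirectory())
        {
            if (entry.Type != DebugDirectoryEntryType.CodeView)
            {
                continue;
            }

            var codeView = peReader.ReadCodeViewDebugDirectoryData(entry);
            // Assumption: portable PDBs use the entry stamp, others use the age.
            debugId = entry.IsPortableCodeView
                ? $"{codeView.Guid}-{entry.Stamp:x8}"
                : $"{codeView.Guid}-{codeView.Age}";
            break;
        }

        return (codeId, debugId);
    }
}
```

Because the sketch only needs a stream, the same logic would apply to an assembly extracted from a bundle, provided some way of obtaining that stream exists.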
The most obvious solution would be to use the PEReader to read in the information about a module/assembly that was loaded in memory rather than one that was located on disk... and the PEReader appears to have a constructor that could be used for this purpose:
PEReader(Byte*, Int32) // Creates a Portable Executable reader over a PE image stored in memory.
However, I haven't found any way to work out where embedded modules are located in memory, or their size.
This is done by Ben.Demystifier, which only needs a MetadataReader. We could potentially use something like this to get a MetadataReader for a module, without needing to know where the assembly for the module was located on disk:
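// Must run in an unsafe context: TryGetRawMetadata hands back a raw byte* pointer.
if (!module.Assembly.TryGetRawMetadata(out byte* blob, out int length))
{
    return string.Empty;
}
var moduleMetadata = ModuleMetadata.CreateFromMetadata((IntPtr)blob, length);
// Keep the reader; the original snippet discarded the return value.
var metadataReader = moduleMetadata.GetMetadataReader();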
To use TryGetRawMetadata we'd need to:
- Reference the System.Runtime.Loader NuGet package
- Allow unsafe code in the assembly where that code is used, since it fiddles with pointers
We'd also need to be able to make a pull request to the Ben.Demystifier project (or fork it) to modify the code in that package which assumes assembly modules will be in separate files.
The following are the specific points in our code that are currently problematic when running as a single-file executable.
You can see these for yourself by creating the sample project referenced in the description of the problem and referencing the Sentry.csproj file directly (rather than the nuget package) and then running the following:
dotnet publish -c Release -p:PublishSingleFile=true
But I've summarized them here for convenience.
IL3002: Using member 'System.Reflection.Module.Name' which has 'RequiresAssemblyFilesAttribute' can break functionality when embedded in a single-file app. Returns <Unknown> for modules with no file path.
IL3000: 'System.Reflection.Assembly.Location' always returns an empty string for assemblies embedded in a single-file app. If the path to the app directory is needed, consider calling 'System.AppContext.BaseDirectory'.
IL3002: Using member 'System.Reflection.Module.FullyQualifiedName' which has 'RequiresAssemblyFilesAttribute' can break functionality when embedded in a single-file app. Returns <Unknown> for modules with no file path.
Great job on the research. It seems some of these are more possible than others, but there's no quick-fix.
As for Ben.Demystifier - we are already taking our submodule from a fork at https://github.com/getsentry/Ben.Demystifier - so you can make modifications there. We can then work to merge those changes upstream separately.
Possibly some more progress. ILSpy can open self-contained executables... and ILSpy is open source.
The entry point to the relevant parts of their code, I believe, is the AssemblyTreeNode.LoadChildren() method... when you expand a tree node in the ILSpy UI that represents an assembly (which I think is what our single file executables would be represented as), this is the code that runs to enumerate all the various modules that are bundled into the single file executable.
The remaining challenge then is reverse engineering the ILSpy code to work out how they're doing that.
OK, ILSpy is cunning. Here's what it's doing:
If it can't load the assembly from a file, it then checks to see if it can load it from a bundle
var bundle = LoadedPackage.FromBundle(fileName);
Basically it loads the whole package into a MemoryMappedFile and then it hunts for a bundle signature in that memory stream. If it finds one, it can then return the bundleHeaderOffset, which is what is used to look up all the other bundle entries.
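The signature hunt can be sketched roughly as follows. This is a simplification rather than ILSpy's actual code: it searches an in-memory span instead of a memory-mapped view, the class and method names are hypothetical, and it assumes the 8 bytes immediately before the signature hold the little-endian offset of the bundle header. The signature bytes themselves are not reproduced here and must be supplied by the caller.

```csharp
using System;
using System.Buffers.Binary;

// Hypothetical helper illustrating the marker search; not ILSpy's implementation.
public static class BundleMarkerSketch
{
    // Returns the bundle header offset stored in front of the signature,
    // or -1 when no usable marker is found.
    public static long FindHeaderOffset(ReadOnlySpan<byte> image, ReadOnlySpan<byte> signature)
    {
        int index = image.IndexOf(signature);

        // Covers "not found" (-1), an empty signature (0), and a signature
        // that sits too close to the start to have 8 bytes before it.
        if (index < sizeof(long))
        {
            return -1;
        }

        // Assumption: the offset is an 8-byte little-endian integer
        // placed directly before the signature.
        return BinaryPrimitives.ReadInt64LittleEndian(
            image.Slice(index - sizeof(long), sizeof(long)));
    }
}
```

A return value of 0 would also need to be treated as "not a bundle" if an unpatched host leaves the offset zeroed; that check is left to the caller since it is not confirmed by the discussion above.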
Most of the work there happens in the SingleFileBundle.ReadManifest(Stream stream) method. It's this that enumerates all of the entries (which include resource files bundled with the single file executable, but also any bundled assemblies), along with useful details like the offset within the file/stream where each entry is kept.
In ILSpy at least, this happens when you try to expand the node representing an embedded assembly in the ILSpy UI. That's where the PackageFolderTreeNode.LoadChildrenForFolder method gets invoked, and there's some specific logic in there to handle dlls (I think folders are only relevant for resources, not for embedded assemblies).
This is where the offsets for bundled assemblies that were collected from the package manifest are used to load the bundled assemblies from memory, which happens in LoadedPackage.ResolveFileName(string name).
There's a bit of inception going on at this point. The LoadedAssembly constructor is called. One of the parameters that gets passed in is Task.Run(entry.TryOpenStream)... In the case of bundled assemblies, the concrete implementation of TryOpenStream that eventually gets called is BundleEntry.TryOpenStream. This method is critical, as it's where the logic to decompress bundle entries is implemented, if necessary. Otherwise, if the assembly wasn't compressed when it was bundled, a plain vanilla UnmanagedMemoryStream gets returned, starting at the appropriate entry offset.
Finally, once that Task completes and hands back a stream for the assembly we want, this gets used in the LoadedAssembly constructor in a call to LoadAsync... which is the same method that loaded our single file executable... only this time, the branch of code that gets executed is not the one dealing with bundles but the one that loads vanilla assemblies from a memory stream.
Thankfully, ILSpy also has an MIT License... so it'd be OK to copy/reuse whichever bits of this logic were appropriate.
Sounds like we're on the right path. Cool!
Thankfully, ILSpy also has an MIT License... so it'd be OK to copy/reuse whichever bits of this logic were appropriate.
If we are just learning from ILSpy and using the same approach, that's fine. If you actually need to copy code from the ILSpy project, please put it in its own subdirectory and add an attribution file. For example, see /src/Sentry/Internal/FastSerialization, which is another bit of code we've internalized. Thanks.
Package
Sentry
.NET Flavor
.NET
.NET Version
7.0.2013
OS
Any (not platform specific)
SDK Version
3.31.0
Steps to Reproduce
Create a simple console app:
In the csproj, configure to upload symbols and sources to Sentry (and authenticate with sentry-cli login).
Compile and publish with:
Run the app from its published folder:
Expected Result
The event generated and shown in the console debug output should contain a debug_meta section, including the debug_id.
In Sentry, the source context should be visible, and the debug images section should show the symbols were found.
Actual Result
The event is missing the debug_meta section when -p:PublishSingleFile=true is used. Thus, source context is not shown.
Line numbers will still be shown if the .pdb file is present (which it is by default in the publish folder), but if you delete it, or ship the executable app without the pdb file, then client-side symbolication won't occur. Server-side symbolication also won't occur because the debug_meta section is missing from the event.