ikvmnet / ikvm

A Java Virtual Machine and Bytecode-to-IL Converter for .NET
Other
1.24k stars 116 forks source link

IKVM.Reflection ILGenerator/PDB Rewrite #478

Closed wasabii closed 7 months ago

wasabii commented 8 months ago

Rewrote ILGenerator by taking the .NET Core RuntimeILGenerator code and a bit of code from MethodBuilder and retrofitting it to support ISymbolWriter and the other stuff it needed for saving. The end result is a hodge podge of code from Core and some Framework retrofits that were removed when AssemblyBuilder.Save was removed in Core.

But the end result is an ILGenerator instance that borrows the logic from .NET Core. Including it's more modern ECMA allowances, like backwards branches, etc. It generates IL exactly like .NET Core System.Reflection does since it's no longer our code.

PDB support is implemented through MetadataPdbSymbolWriter, which holds the symbol information until it's requested to write it into a MetadataBuilder. ModuleWriter now takes various options for a PdbStream and/or output file name and checksum hash. and uses PortablePdbBuilder to output the portable PDB metadata.

Removed MDB support and non-Portable PDB support. I'm sure Roslyn wishes they could do this too.

So, this PR should make it possible to generate and use IKVM PDBs on non-Windows.

ikvmc debug options are now more inline with csc.exe: -debug, -debug:full, :portable, :embedded. -debug:portable is the default and enabled on both IKVM.NET.Sdk and IkvmReference when DebugType is set.

wasabii commented 8 months ago

The ikvm.lang.DllExport feature was removed. This was insane. Generating native stubs? No need for this. Removed so we didn't have to fix up the native stub generation stuff.... because yeah.

wasabii commented 8 months ago

I think probably the best path forward for IKVM.Reflection.Emit at this point is to simply copy the .NET Core implementation of System.Reflection.Emit and patch up the missing peices (Saving, Symbols, etc). But, because our ModuleBuilder is pretty tied to our Module, and our AssemblyBuilder is pretty tied to our Assembly, etc, this isn't an easy task without also taking System.Reflection itself. Which might be an option. It might not be too hard to retrofit SRM into the reading side of things. This PR doesn't do that.

wasabii commented 7 months ago

@AaronRobinsonMSFT Heya. I've been working on this PR for a couple weeks now, and have run into an issue that I think you may be positioned to provide assistance on. I figured I'd reach out after having drafted and deleted a issue on dotnet/runtime 3 times, having failed to figure out how to properly explain the context (so much context).

The PDBs I'm generating using SRME work perfectly in .NET Framework. They however crash .NET 6, 7 and 8 with an access violation. The access violation happens when exceptions are thrown and stack traces are collected, and it starts trying to find debug info. At a minimum, I figure no matter what heinous thing I'm doing with the PDB, the runtime shouldn't be crashing.

I've isolated it to the SequencePoints. If I leave everything exactly as is, but remove the sequence points (second column of MethodDebugInfo to nil), Core doesn't crash. Beyond that I haven't made progress.

I built a checked version of corert 6. It doesn't crash! Runs fine there.

I am having a hell of a time making progress on the issue.

Fatal error. System.AccessViolationException: Attempted to read or write protected memory. This is often an indication that other memory is corrupt.
   at java.lang.Object..ctor()
   at IKVM.ConsoleApp.Program.Main(System.String[])
AaronRobinsonMSFT commented 7 months ago

@wasabii Can I reproduce this locally somehow? Happy to help debug it.

The PDBs I'm generating using SRME work perfectly in .NET Framework.

What version of .NET Core are you using? Sharing the assembly and PDB might be sufficient for me to root out the problem.

If you wanted another sanity mechanism for the Portable PDB (PPDB) format, I assume that is the one in question here, you can check out DNMD. It contains a Metadata dumping utility that walks all the tables and decodes the known blobs for the PPDB scenario.

wasabii commented 7 months ago

@AaronRobinsonMSFT

https://alethicdevops.blob.core.windows.net/temp/IKVM.ConsoleApp.net6.0.zip?sv=2023-01-03&st=2024-02-10T15%3A16%3A34Z&se=2024-02-11T15%3A16%3A34Z&sr=b&sp=r&sig=qesk2i3kF28wEUx5JZJKI6sMtsNlaX4cOoceVHOrnMs%3D

Threw this into a zip. It is a simple console app that enters into IKVM by just creating a new java.lang.Object. I use it to do one-off ad-hoc tests.

For me, running it "dotnet IKVM.ConsoleApp.dll" crashes. I believe that'll run it on 6.0.26 (locally installed) since that's the latest I have installed.

If I just remove IKVM.Java.pdb, it does not crash.

I'm taking a look at those tools you mentioned. For awhile I was able to open the PDB in dotPeek. But, it became too large at some point and now mostly just crashes dotPeek too. :)

"corerun.exe IKVM.ConsoleApp.dll" however does not crash from a locally built copy of 6.0.26.

AaronRobinsonMSFT commented 7 months ago

This is a nasty issue, I see the general problem. At first glance at least, this looks like an issue with DIA, which is owned by another team. I will reach out to the appropriate party on Monday (12-FEB) and let you know. Not much you can do to work around this other than deleting the PDB, sorry.

wasabii commented 7 months ago

Awesome. Thanks so much for taking a look.

AaronRobinsonMSFT commented 7 months ago

@wasabii This is a known issue and a fix has been made for it. Still working through when this will ship though.

wasabii commented 7 months ago

@AaronRobinsonMSFT

No workaround though? Can you explain the issue for me, or get me towards where it's described?

Is it planned to be fixed in all runtime versions?

AaronRobinsonMSFT commented 7 months ago

No workaround though? Can you explain the issue for me, or get me towards where it's described?

@wasabii The first indication was at https://github.com/dotnet/fsharp/issues/16399.

Is it planned to be fixed in all runtime versions?

Not at present since using DebuggableAttribute.IgnoreSymbolStoreSequencePoints should address the issue. Once we confirm the assemblies all have that bit set, we will make that kind of decision.

/cc @hoyosjs

hoyosjs commented 7 months ago

IKVM.Java is the first module I see that lacks the attribute and triggers the issue. There may be more - this is just the first one I know triggers the issue.

wasabii commented 7 months ago

Okay. So.... maybe I'm not understanding. SequencePoints then play two pivotale roles, right? One is to provide the debugger with safe spots that can be referred to: but this isn't needed, as the information can also be derived from NOPs that have been inserted. But, the other goal of sequence points, and the one I'm most concerned with, is line/column offsets in the active Document.

Am I correct then that placing this attribute on an assembly causes the Sequence Points to be ignored for the purposes of the primary issue? But, continues to allow them to be used by the debugger to determine where in the source file you are?

hoyosjs commented 7 months ago

Yes, that flag is concerned with the JIT's use of that information vs the implicit sequence points. The PDB still gets used when doing Exception.ToString()

wasabii commented 7 months ago

@AaronRobinsonMSFT

Any idea why the addition of this attribute might result in prolific OutOfMemoryExceptions?

Specifically only on Windows. Linux and OSX seem to work fine.

AaronRobinsonMSFT commented 7 months ago

@AaronRobinsonMSFT

Any idea why the addition of this attribute might result in prolific OutOfMemoryExceptions?

Specifically only on Windows. Linux and OSX seem to work fine.

No. That is unexpected. If you can reproduce this please share or attach WinDBG and collect some of the stacks that are causing the OOMs.

wasabii commented 7 months ago

I haven't been able to locally. Yet. Everything that is failing in our CI seems to work fine locally. Going to try to run one of these failing tests in a loop and see if it hits harder.

wasabii commented 7 months ago

@AaronRobinsonMSFT The OOM errors have gone away after I disabled Code Coverage tracking during tests. So, if there's an issue, it relates to that. I am completely unfamiliar with how the code coverage stuff operates at this point. Gotta dig into that.

wasabii commented 7 months ago

@AaronRobinsonMSFT

Okay. So, the crash is gone with code coverage out. But, I'm still seeing a bunch of internal BadImageFormatExceptions happening during the collection of stack traces.

Exception thrown: 'System.BadImageFormatException' in System.Reflection.Metadata.dll

I can get a debugger into most of this.

Inside of System.Diagnostics.StackFrameHelper:InitializeSourceInfo.

This block of code:


                for (int index = 0; index < iFrameCount; index++)
                {
                    // If there was some reason not to try get the symbols from the portable PDB reader like the module was
                    // ENC or the source/line info was already retrieved, the method token is 0.
                    if (rgiMethodToken![index] != 0)
                    {
                        s_getSourceLineInfo!(rgAssembly![index], rgAssemblyPath![index]!, rgLoadedPeAddress![index], rgiLoadedPeSize![index], rgiIsFileLayout![index],
                            rgInMemoryPdbAddress![index], rgiInMemoryPdbSize![index], rgiMethodToken![index],
                            rgiILOffset![index], out rgFilename![index], out rgiLineNumber![index], out rgiColumnNumber![index]);
                    }
                }

It's calling into s_getSourceLineInfo, which itself is calling System.Reflection.Throw.ThrowOutOfBounds. in System.Reflection.Metadata.

I can't debug into that easily right now. But, what I think is happening is it's failing to load the MethodDebugInfo associated with the MethodDef. Looking into these arrays, it fails on, say, index 9 (just an example). So, in rgiMethodToken, I see index 9 has the value of 100798173. Assuming that's supposed to be a MethodDefHandle...... that value is way too high. So, it turns it into a MethodDebugInformationHandle and tries to read that and it's way beyond the end of the table so it fails.

There are "only" 252925 rows in the MethodDef table. So, a value of 798173 is way outside that. I'm pretty sure these aren't like compressed ints or anything, but just tokens. So, 100798173 should be row id 798173....

So yeah, way off the end of the table.

I can't easily trace this back into the origin of these tokens. So I'm not sure where it's getting these numbers.... but it seems like it is getting the wrong ones.

wasabii commented 7 months ago

Okay. I made a mistake. That number is correct. What I'm actually seeing in this attempt though is 0x06000000. Which is a Nil token.

But I think I understand it now. That's a dynamic method.

Okay. I think there is something I'd consider a bug in here. When PortablePDBs are enabled for an assembly, and there are dynamic methods attached to it, the token makes it through into here as a nil token with a methoddef kind. Thus it doesn't get filtered out by the if (handle.Kind == HandleKind.MethodDefinition) line. And the lookup attempts to go through to the MetadataReader attached to the PDB. Which then throws an OutOfBounds exception. Which I believe itself needs a stacktrace.

wasabii commented 2 months ago

https://github.com/dotnet/runtime/issues/98506