jbevain / cecil

Cecil is a library to inspect, modify and create .NET programs and libraries.
MIT License
2.76k stars 627 forks source link

Emit sequence points in ascending offset order #853

Open Zastai opened 2 years ago

Zastai commented 2 years ago

This is required by the Portable PDB specification, and the sequence point collection in MethodDebugInfo does not guarantee it.

This also tweaks the writing of hidden sequence points to use two writes of 0 as compressed unsigned integer, instead of a single write of 0 as a plain short, to more closely match the specification.

Fixes #840.

Note: It would be extremely useful to get a 0.11.5 soon after this lands.

Zastai commented 2 years ago

@jbevain Any chance of this landing?

jbevain commented 2 years ago

@Zastai I think the assumption here is that it's on the user to provide a properly sorted SequencePoints collection.

Are there cases where we're reading portable pdbs and they end up unsorted in the collection?

Zastai commented 2 years ago

On my phone at the moment (and it's late) so can't check right now, but I assume I couldn't easily get that collection sorted from the client side. Otherwise I expect I would have done that.

Zastai commented 2 years ago

Ah I think I see the issue. My use case was flagging some code I add to a method as hidden, to make for smoother debugging. So I add a sequence point for the first of my newly-created insns. But I don't think such a newly-added insn resolves to an actual offset (yet), so I am unclear on how I could reasonably sort the sequencepoints collection at that point.

I'm not sure it makes sense for Cecil to allow creation of debug data that violates the specification, so applying the sort as part of the writing seems like a reasonable thing to do.

If there is a safe way for me to do the sort myself (maybe preferably by adding a Sort() or SortByOffset() on the sequence point collection), then sure, Cecil doesn't have to sort it for me. But even then it would at least make sense to throw when an out-of-sequence entry is detected while writing (better to throw when trying to create a broken pdb than to break it and cause errors later).

I'd also be fine with a new writer option, to specify whether you want the ensure-valid-sorted-sequence-points, throw-on-out-of-sequence-points, or leave-my-sequence-points-alone behaviour.

Zastai commented 1 year ago

@jbevain Any update on how you want me to proceed on this?

Zastai commented 1 year ago

@jbevain Ping (and slightly disappointed this did not make it into 0.11.5).

Zastai commented 1 year ago

@jbevain Another ping; it's now a year after this initial PR and it's still not clear how you want me to proceed.

I see three main options:

  1. Accept this PR, ensuring that Cecil writes a vaild PDB when sequencepoints are added.
  2. Change this PR to add sort functionality to the sequence points, e.g. adding a Sort[ByOffset]() extension method for Collection<SequencePoint> This assumes that this sort can be done safely (or at all) when newly created insns are referenced, because they might not have fully-resolved offsets yet. That would require clear documentation then that it should be called when finished manipulating a method body (e.g. after calling Optimize() on it.
  3. Add a symbol writer option SequencePointHandling, with an enum of the same name, to choose the behaviour: a. Keep (the current behaviour): emit the sequence point in the order they are found in the collection, even when that results in an invalid PDB file b. ThrowOnInvalidSequence: same as Keep, but throw an exception if an invalid sequence is detected c. EnsureValidSequence: the behaviour of this PR: emit them in the correct sequence
leppie commented 1 month ago

@jbevain I have tested fix as I started getting wrong info in portable PDB's converted from native.

WIth this fix you can roundtrip from native to portable and back to native and sequence points stay consistent.

PersistedAssemblyBuilder in .NET 9 actually has the same bug when emitting a PDB.

leppie commented 1 month ago

Some further investigation indicated that I was NOT creating sequence points out of order (that would be quite hard with ILGenerator), but were creating duplicate sequence points at the same offset.

This is causing the problem.