dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.48k stars 4.76k forks source link

ilasm throws an error when partly-defined .line directives are used #46328

Open ivanpovazan opened 3 years ago

ivanpovazan commented 3 years ago

Description

ILAsm throws the following error:

"Could not create output file, error code=0x80004005"

when the input source file includes a partly-defined .line directive, for example:

.method public hidebysig static int32 Foo(int32 a) cil managed { .line 15:3 'some_external_file.ext' IL_0000: nop IL_0001: ret }

Partly-defined .line directives are those which do not specify all of the optional start/end line/column parameters.

Configuration

.NET 5.0 ilasm

Other information

There are two problems related to this issue in the ilasm implementation:

1. .line directive rules

For .NET Core there is a discrepancy between ILAsm and Portable PDB specification and requirements for .line directive.

= = = 1 = taken from the previous .line directive

... End Line is greater or equal to Start Line. If Start Line is equal to End Line then End Column is greater than Start Column.

As it can be seen Portable PDB requires EndColumn to be greater than StartColumn in cases when StartLine==EndLine, which is different from the ILAsm specification.

Suggested approach: My suggestion is to follow the ILAsm rules with the exception for EndColumn, where Portable PDB rule should be applied.

2. ilasm behavior on invalid .line directives

If a malformed .line directive is detected, ilasm rudely throws an error without generating the assembly file nor the PDB.

Suggested approach: In case of PDB generation errors ilasm should indeed refuse to generate the PDB file, but I think that assemblies should still be generated with a proper warning message indicating that the PDB generation has failed.

ivanpovazan commented 3 years ago

@jkotas I am not sure who should be included from the MS team regarding the above-mentioned problems (and suggested approaches). Could you please tag those who would be interested to comment? Thank you in advance.

jkotas commented 3 years ago

Portable PDB https://github.com/dotnet/runtime/blob/master/docs/design/specs/PortablePdb-Metadata.md The values of non-hidden sequence point must satisfy the following constraints

@tmat Where is this requirement coming from? How does this work for classic PDBs?

I think that assemblies should still be generated with a proper warning message indicating that the PDB generation has failed

I do not think it is a good idea to generate .dll without .pdb. We can either:

In either case, we should print better error message.

jkotas commented 3 years ago

cc @briansull

ivanpovazan commented 3 years ago

An additional note:

In either case, we should print better error message.

Sorry, I did not provide you with the full ilasm output. When I was implementing portable PDB support, I've added reporting of warning messages for these cases, which results with the following:

warning : Sequence point at line: [0xf] and offset: [0x0] in method 'Foo' is not valid! Error: failed to emit body of 'Foo' Could not create output file, error code=0x80004005

Perhaps warnings should be promoted to errors or something like that. Let me know what you think.

jkotas commented 3 years ago

I think we should produce warning, but still emit the invalid sequence point (ie do not treat the warning as error).

It is a feature that ilasm can create invalid binaries. It is very useful for testing.

ivanpovazan commented 3 years ago

I agree. I can do the fix once there is a clarification/agreement on the .line directive rules problem.

jkotas commented 3 years ago

My suggestion is to follow the ILAsm rules with the exception for EndColumn, where Portable PDB rule should be applied.

What do you propose to set EndColumn to by default?

ivanpovazan commented 3 years ago

I would adapt the ILAsm rules for default values in the following way (bolded text refers to the portable PDB requirement):

= = **( == ) ? + 1** : = 1 = taken from the previous .line directive
jkotas commented 3 years ago

The ilasm with classic PDBs produced EndColumn=0. Can we do that for ilasm, for both regular sequence points and as the default for line directives?

The current scheme where ilasm produces endColumn=2 by default produces a bit odd debugging experience:

Classic PDBs: image

Portable PDBs: image

Notice the odd single character yellow highlight with Portable PDBs.

ivanpovazan commented 3 years ago

If that is acceptable by portable pdb standard, I am fine with it. I was just pointing out the rule which states the opposite

jkotas commented 3 years ago

We should update the portable pdb doc as part of this.

tmat commented 3 years ago

Unfortunately, these kinds of sequence points are not supported by Portable PDB.

If Start Line is equal to End Line then End Column is greater than Start Column.

This rule is important for the delta encoding. If Start Line = End Line (ΔLine = 0) and Start Column = End Column (ΔColumn = 0) then such value encodes a hidden sequence point.

The format expects all sequence points to be associated with a text span spanning the corresponding source code.

jkotas commented 3 years ago

Could ILASM determine the precise span of the instruction source?

It does not help with the explicit line directives with missing column information. The length of instruction source is irrelevant for explicit line directives. C# .line directives have the same problem. Would it make sense to emit max column value for these?

tmat commented 3 years ago

C# currently does not allow to specify column. It uses the start and end columns from the current source and sets the line to the one specified with the .line directive.

For some scenarios (like Razor), this is not good enough. We might need to change the language/compiler to allow specifying a column range. In any case, we wouldn't have a scenario when the column is undefined in C#.

If we think it's worth the effort we could use a special value (e.g. the current max+1: 0x10000) to represent an undefined column. The encoder/decoder could interpret this value as undefined column. The metadata reader APIs would need to be updated to allow for unspecified column (e.g. returning -1, which is what SymReader APIs are using for Windows PDBs for undefined columns). The debugger already interprets -1 as undefined column and does not highlight the span (only displays yellow arrow).

jkotas commented 3 years ago

I have done some ad-hoc testing and the max end column value (0xffff) works nicely. There is no odd single character yellow highlight, the whole line is highlighted.

I think we should emit undefined end column as 0xffff in ilasm. Does it sound reasonable?

If we think it's worth the effort we could use a special value (e.g. the current max+1: 0x10000)

I think we would want to consider this only if emitting undefined end column as 0xffff proves to be insufficient for some reason.

In any case, we wouldn't have a scenario when the column is undefined in C#.

I believe that there are quite a few situations out there where the columns are undefined. In particular, anything generated by C or C-like preprocessor does not track columns.

tmat commented 3 years ago

I think 0xffff is reasonable as a workaround.