dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.16k stars 4.71k forks source link

Ordering unimplemented ILVerify errors by their impact in compiler output validation #37393

Open VSadov opened 6 years ago

VSadov commented 6 years ago

This is very subjective. Most unimplemented errors between UnknownOpcode and ThisMismatch in https://github.com/dotnet/corert/blob/master/src/ILVerify/src/VerifierError.cs are interesting.

My approach here is to prioritize the errors by the chances of seeing such violation without triggering some internal asserts in our compilers. Basically - bugs at higher levels of abstraction typically result in logically incorrect and yet valid IL, so IL verification cannot help there. However bugs at lower levels often result in malformed IL and that is where IL verification could be very helpful.

I would start with completing errors related to control flow, stack misbalances and uninitialized data. If IL is malformed due to a bug, it often ends up triggering these. Many look already implemented, but there are still some not yet done.

Then there are bugs where something went wrong with dataflow or casts. In such cases we generally see various stack corruptions and mishaps with type compatibility.

The shape of exception handling regions could be delicate, but the area is not currently in active development so these can be done later.

VSadov commented 6 years ago

CC: @jcouv @ArztSamuel @jkotas

ArztSamuel commented 6 years ago

Thanks, that is very helpful!

If I remember correctly the following ER specific errors are actually already reported, but with different names / messages:

//E_BR_OUTOF_FIN "Branch out of finally block."

Branching out of a finally block currently reports a BranchOutOfHandler error, since finally blocks are also categorized as handlers by our current data structures. It would be possible to change the code to report this more specific error instead quite easily, though. Also: I think it is a bit odd that there is a specific error for branching out of a finally block, but not for branching into one.

//E_BR_TO_EXCEPTION "branch/leave to the beginning of a catch/filter handler"

Branching into a catch/filter currently reports a BranchIntoHandler/Filter or LeaveIntoHandler/Filter error. I am not sure why this was defined as a seperate error by PEVerify, as branching/leaving into a catch/filter should always be invalid, independent of which instruction inside the catch/filter is the target (unless I missed something in the rules of ECMA I.12.4.2.8.2). I also can't find any part in the PEVerify source, which actually reports this error. If this is true, it would make sense to remove this unused error.

//E_FALLTHRU_EXCEP "fallthru the end of an exception block." //E_FALLTHRU_INTO_HND "fallthru into an exception handler." //E_FALLTHRU_INTO_FIL "fallthru into an exception filter."

All three of these cases currently report a corresponding branch error. It shouldn't be too much work to change this to actually report a fallthrough error instead.

//E_LEAVE "Leave from outside a try or catch block."

If I understand the PEVerify source correctly, this error was always reported, whenever an invalid leave target was found. When I implemented the checks for this, I also added a more specific error message for the different possible rule violations: LeaveIntoTry, LeaveIntoHandler, LeaveIntoFilter, LeaveOutOfFilter, LeaveOutOfFinally and LeaveOutOfFault. I thought the error message "Leave from outside a try or catch block." was a bit vague and misleading. If I understood the rules defined in ECMA correctly, using the leave instruction outside an exception region is actually valid and behaves like a normal branch instruction, but this error message sounds like it is prohibited. I also checked PEVerify and as expected it does not report this error for using the leave instruction outside a try/catch. Since this error was split into multiple more specific errors, it would make sense to remove it.

I will be looking into the errors mentioned above and change the code / add test cases if necessary.

I also hope to be able to have a quick look at the other errors soon and will comment here if I find anything worth noting. I think I remember that there are some other errors, which were defined in the PEVerify source, but never actually reported.

ArztSamuel commented 6 years ago

Resolved E_BR_OUTOF_FIN, E_BR_TO_EXCEPTION, E_FALLTHRU_EXCEP, E_FALLTHRU_INTO_HND, E_FALLTHRU_INTO_FIL and E_LEAVE through dotnet/corert#4960.

VSadov commented 6 years ago

Thanks!! I only used error list with assumption that if error is used then the scenario is most likely handled. The opposite is indeed not necessarily true.

Yes I also find it odd that there is an error for leave outside handlers. I think it is defined as goto with clear eval stack. Could even be useful if it really works like that. Perhaps there is some subtle rule that makes it illegal, or it is a JIT implementation thing.

Yes, I think it makes sense to continue using the error list for tracking and remove errors if corresponding scenarios are reported as something else.

ArztSamuel commented 6 years ago

I think it is defined as goto with clear eval stack.

Yes, true. I forgot to mention that the evaluation stack is additionally cleared by the leave instruction.

Perhaps there is some subtle rule that makes it illegal, or it is a JIT implementation thing.

PEVerify actually doesn't report a leave outside exception regions as an error (I have tested that). I have only seen this error being used for other rules of the leave instruction, like leaving out of a finally block. It is just the message that is kind of misleading.

A note in ECMA (I.12.4.2.8.2.8: leave and leave.s) even states:

In effect, a leave from outside of exception handling acts like a branch, with the side-effect of emptying the evaluation stack.

ArztSamuel commented 6 years ago

I have had a quick look at all errors now:

//E_STACK_EMPTY "Stack empty."

I am not sure what this error was defined for in PEVerify. I couldn't find any code that actually reports this error. When the stack is poped even though it is empty, ILVerify reports StackUnderflow.

//E_THIS_UNINIT_BR "Branch back when this is uninitialized."

This error is never reported by PEVerify and I even found a comment in the source code saying: // backwards branch with uninitialized 'this' is actually allowed (https://github.com/lewischeng-ms/sscli/blob/master/clr/src/jit64/newverify.cpp#L3494)

There is also a comment saying; //backwards branch with 'this' uninitialized is illegal but the code actually performing the check seems to have been removed. (https://github.com/lewischeng-ms/sscli/blob/master/clr/src/jit64/newverify.cpp#L1006) I would have to re-read the chapters about branching/this pointer state in order to confirm what ECMA defines in respect to this.

//E_RET_UNINIT "Return uninitialized data."

This error is never thrown inside the PEVerify code. PEVerify defined a flag to mark uninitialized ObjRefs, however this flag was only used to mark the this-pointer as uninitialized. I assume this error was defined for when an uninitialized ObjRef is returned, but was never used, since returning an uninitialized this-pointer was assigned its own error code.

//StackNotEq, // Non-compatible types on the stack.

This was reported by PEVerify when verifying compare instructions. I think IsBinaryComparable is already handling these rules, but always reports StackUnexpected. I would have to investigate this further though.

//E_STACK_NO_R_I8 "unexpected R, R4, R8, or I8 on the stack."

This error is only reported by PEVerify when a bool branch is performed with a value of type TI_DOUBLE. ILVerify simply reports StackUnexpected for this case and all other invalid types of values. I think it makes sense to remove this error, unless there is any benefit from having a more precise error for this case.

I wasn't able to find where quite a lot of errors are reported in PEVerify. I am not sure whether these errors are actually deprecated or simply reported in a code part that I do not have access to. Some of the errors definitiely sound like they still make sense though.

Here is the full list of errors I could not find in the PEVerify source:

//E_STACK_EXCEPTION "Missing stack slot for exception." //E_THIS "Instance variable (this) missing." //E_THIS_UNINIT_V_RET "Return from .ctor before all fields are initialized." //E_SIG_VAR_PARAM "Unrecognized type parameter of enclosing class." //E_SIG_MVAR_PARAM "Unrecognized type parameter of enclosing method." //E_SIG_VAR_ARG "Unrecognized type argument of referenced class instantiation." //E_SIG_MVAR_ARG "Unrecognized type argument of referenced method instantiation." //E_SIG_GENERICINST "Cannot resolve generic type." //E_PATH_LOC "Non-compatible types depending on path." //E_STACK_P_OBJREF "Expected address of an ObjRef on the stack." //E_STACK_ARRAY_SD "Expected single dimension array on the stack." //E_STACK_VALCLASS "Expected value type instance on the stack." //E_STACK_P_VALCLASS "Expected address of value type on the stack." //E_STACK_NO_VALCLASS "Unexpected value type instance on the stack." //E_ARRAY_ACCESS "Illegal array access." //E_ARRAY_V_STORE "Store non Object type into Object array." //E_TRY_EQ_HND_FIL "Try and filter/handler blocks are equivalent." //E_HND_OVERLAP "Handler block overlaps with another block." //E_FIL_OVERLAP "Filter block overlaps with another block." //E_FIL_EQ "Filter block is the same as another block." //E_FIL_START "Filter starts in the middle of an instruction." //E_ENDFILTER_MISSING "Missing Endfilter."

jkotas commented 6 years ago

I am not sure whether these errors are actually deprecated

These errors are deprecated. The implementation of PEVerify morphed a lot over the years - they were likely used in the earlier versions of PEVerify and later removed. Could you please submit PR to cleanup all the unused errors?

I think it makes sense to remove this error

Agree. BTW: I have run into nonsensical errors reported by PEVerify. I have tried to match PEVerify for some time to make side-by-side comparison of errors easier, and added a TODO to convert it to better error message later. I may be worth it to go over TODOs in ILVerify (to fix the error messages, and to find other non-implemented nugets). Opened dotnet/corert#4963 on this.

ArztSamuel commented 6 years ago

Could you please submit PR to cleanup all the unused errors?

Of course, done in dotnet/corert#4965.

VSadov commented 6 years ago

I also found some of the errors unfamiliar. I assumed they cover rare situations. I still ordered them by what looked to be their category - control flow, stack, etc, but some looked pretty strange or redundant.

Removing deprecated ones seems right thing to do

VSadov commented 6 years ago

//E_SIG_VAR_PARAM "Unrecognized type parameter of enclosing class." //E_SIG_MVAR_PARAM "Unrecognized type parameter of enclosing method." //E_SIG_VAR_ARG "Unrecognized type argument of referenced class instantiation." //E_SIG_MVAR_ARG "Unrecognized type argument of referenced method instantiation." //E_SIG_GENERICINST "Cannot resolve generic type."

These seem to refer to situations when, for example, code references a method type parameter while the type parameter does not exist of containing method is not even generic. Such errors are relatively easy to hit in broken generic code.
I am curious how these kind of errors are handled in ILVerify.

//E_PATH_LOC "Non-compatible types depending on path." //E_STACK_P_OBJREF "Expected address of an ObjRef on the stack." //E_STACK_ARRAY_SD "Expected single dimension array on the stack." //E_STACK_VALCLASS "Expected value type instance on the stack." //E_STACK_P_VALCLASS "Expected address of value type on the stack." //E_STACK_NO_VALCLASS "Unexpected value type instance on the stack." //E_ARRAY_ACCESS "Illegal array access." //E_ARRAY_V_STORE "Store non Object type into Object array."

I assume these were replaced by other errors about stack or expected type inconsistencies.

//E_TRY_EQ_HND_FIL "Try and filter/handler blocks are equivalent." //E_HND_OVERLAP "Handler block overlaps with another block." //E_FIL_OVERLAP "Filter block overlaps with another block." //E_FIL_EQ "Filter block is the same as another block." //E_FIL_START "Filter starts in the middle of an instruction." //E_ENDFILTER_MISSING "Missing Endfilter."

Interesting what happens if exception handling regions are incorrect as these errors mention. - other errors reported?

//E_RET_UNINIT "Return uninitialized data."

not sure how this even possible with initlocals present.

//E_THIS_UNINIT_BR "Branch back when this is uninitialized."

Did not know such rule exist(ed).

I would guess, if there are actions that are disallowed when this is uninitialized, then, having to derive in single pass where this is uninitialized could result in prohibiting branching back with uninitialized this.

Curious - why is this ok now?

jkotas commented 6 years ago

code references a method type parameter while the type parameter does not exist of containing method is not even generic

These fall into the metadata validation errors. ILVerify does a basic validation as part of the type loading (you may not always get a pretty error for invalid metadata like this).

here are actions that are disallowed when this is uninitialized, then, having to derive in single pass where this is uninitialized

The ECMA spec has the backward branch rule that requires the verification to be a single pass. The implemented IL verification has not ever been single pass - the backward branch rule has not been ever enforced.

VSadov commented 6 years ago

Being an independent tool it may make sense for ILVerify be a single pass. It might actually be easier for the implementation.

(by 1 pass i mean “logically 1 pass”, - I.E. not the kind that repeats the analysis once/if more data about backwards branches is discovered)

Perhaps the backward branch rule for uninitialized ‘this’ should be enforced. It should not be hard and I doubt there are violations.

jkotas commented 6 years ago

I think we should make ILVerify compatible with PEVerify by default. I do not think there is much value in enforcing stronger rules than what PEVerify and JIT-time verification enforced since forever.

It maybe ok to have a stronger rules in the opt-in warning mode: https://github.com/dotnet/corert/issues/3586

VSadov commented 6 years ago

Talked with @jkotas about the "single pass" requirements in more details.

ECMA335 does specify that

It shall be possible, with a single forward-pass through the CIL instruction stream for any
method, to infer the exact state of the evaluation stack at every instruction (where by “state” we
mean the number and type of each item on the evaluation stack).

However no real implementation seem to be requiring this. We should review whether the requirement should be dropped from the spec.

That said, some of these cases, while ok with the JIT and type-safety, are often indicative of bugs in IL generator. Besides, some rules are still enforced in 32bit JIT/PEVerify.

I think we should keep the rules that result from one-pass requirement, at least those that 32bit PEVerify implements and make them "warnings" as in dotnet/runtime#37390 . Unless it is a serious burden to implement, of course, since this would be going beyond the spec.

In particular the rule described in III.1.7.5 Backward branch constraints seems not hard to do.

VSadov commented 6 years ago

Still wonder about the following. Are different errors reported for these?

//E_TRY_EQ_HND_FIL "Try and filter/handler blocks are equivalent." //E_HND_OVERLAP "Handler block overlaps with another block." //E_FIL_OVERLAP "Filter block overlaps with another block." //E_FIL_EQ "Filter block is the same as another block." //E_FIL_START "Filter starts in the middle of an instruction." //E_ENDFILTER_MISSING "Missing Endfilter."

ArztSamuel commented 6 years ago

As far as I am aware, the validation of the start/end offsets of exception regions is not implemented for ILVerify at all yet. It could well be that these errors are handled by other errors in PEVerify, but I think it would make sense to keep these errors in the list of this issue until we have unit tests for ILVerify confirming that these cases are handled appropriately.

ArztSamuel commented 6 years ago

E_BAD_JMP_TARGET and E_BACKWARD_BRANCH are now implemented via dotnet/corert#5101 and dotnet/corert#5113 respectively.

filipnavara commented 5 years ago

However no real implementation seem to be requiring this. We should review whether the requirement should be dropped from the spec.

For cross reference: Mono currently requires that, see https://github.com/dotnet/corefx/pull/39765.

AndyAyersMS commented 5 years ago

Also see https://github.com/dotnet/coreclr/issues/14492 for an example where different single pass implementations may arrive at different answers for types.

Dotnet-GitSync-Bot commented 4 years ago

I couldn't figure out the best area label to add to this issue. Please help me learn by adding exactly one area label.