microsoft / ConcordExtensibilitySamples

Visual Studio Debug Engine Extensibility Samples

Query expressions and use of temporary locals in real func-eval #49

Closed sae42 closed 5 years ago

sae42 commented 5 years ago

It's difficult to provide a repro for this, but I'm hoping a debugger expert may have some insight into this problem.

Say I have a generated query expression that requires the use of a temporary local. In my case this is a byte[]. I pass this into two methods that exist in the debuggee process. The first method updates the individual bytes in the byte[]. However, it seems that the second method does not get the data updated by the first. This only appears to happen when using a temporary local: my experiments show that if I use a real local, the func eval succeeds. Interestingly, if I force execution by the VIL interpreter (by applying the ,emulator format specifier), the evaluation also returns the correct result. The func eval will also succeed if the query expression and the methods used return the byte[] rather than just pass it by value. Does anyone know whether this is a bug or just something that won't work?

Here's some example IL which I hope illustrates this:

class public '<>x' {
.method public hidebysig static string '<>m0'()
{
.locals init (uint8[] _REAL_LOCAL_0) // Real data in user program
.locals init (object _REAL_LOCAL_1) // Real data in user program
ldc.i4.6
newarr uint8 // Constructed for use by the Query Expression
.locals init (uint8[] _TEMP_LOCAL_2) // Added as a temp for use by query expression
stloc.2
ldarg.0
ldfld int16 'num_00000225'
ldloc.2 // If instead of this I use REAL_LOCAL_0, the query works correctly
ldc.i4.0
ldc.i4.5
call void 'StoreInByteArray'(int32, uint8[], int32, int32) // When doing a real func-eval, the byte[] is modified by this method, but the updated data never returned in this temp byte[]
ldloc.2
ldc.i4.0
ldc.i4.6
call string 'FormatAsString'(uint8[], int32, int32) // This just has a sequence of null bytes
ret
.maxstack 4
}

plnelson commented 5 years ago

Hello Simon, this is a known problem with mixing IL interpretation and real func eval. The quick workaround, as you discovered, is to have a real array value in the process being debugged. This could be a real local that has been initialized in the process. The key thing is that the array value must exist in the process. Whether the variable is a temporary or a real variable doesn't matter. There are two other solutions. One is to call a function that creates an array and then assign this value to your temporary local. You may find a framework Array.Create variant that works, but I wouldn't suggest doing that, because the interpreter usually special-cases these, so such a solution would be fragile. The other solution is to create a pseudo-local and use that value as the parameter to StoreInByteArray.

TL;DR - Here are the details of what's happening, if you're interested: If the IL for the query method has a call instruction, the debugger will do a "real func eval" rather than use the interpreter. This means we set the context of the current thread to the beginning of the function being evaluated, set a guard breakpoint at the return address, then allow the process to run until the guard breakpoint is hit. That's a simplification, but it's the general idea. If the function being evaluated takes parameters, we need to marshal the values from the interpreter to the process being debugged. If the value already exists in the process, this is a simple task. If the value doesn't exist, we need to create it within the process. Due to the architecture of the interpreter, we can't update the interpreter's own value to refer to this marshalled value. This is fine for immutable values. However, if the value is mutable, changes made by the process won't be visible to the interpreter.
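The marshalling behaviour described above can be pictured with a small toy model. This is only a sketch in Python, not the Concord API: the Debuggee class, the integer handles, the marshal helper, and the big-endian behaviour given to store_in_byte_array are all assumptions made for illustration.

```python
class Debuggee:
    """Toy stand-in for the process being debugged (not the Concord API)."""

    def __init__(self):
        self.heap = {}          # handle -> array that lives in the process
        self._next_handle = 1

    def allocate(self, data):
        # Create a fresh array inside the process and return its handle.
        handle = self._next_handle
        self._next_handle += 1
        self.heap[handle] = bytearray(data)
        return handle

    def store_in_byte_array(self, value, handle, offset, length):
        # Hypothetical analogue of StoreInByteArray: assumed here to write
        # `value` big-endian into the array at `offset`.
        self.heap[handle][offset:offset + length] = value.to_bytes(length, "big")

    def format_as_string(self, handle, offset, length):
        # Hypothetical analogue of FormatAsString: hex dump of the range.
        return self.heap[handle][offset:offset + length].hex()


def marshal(debuggee, interpreter_value):
    """Copy an interpreter-only array into the process before a func-eval.

    The interpreter's own value is never pointed at the copy.
    """
    return debuggee.allocate(interpreter_value)


debuggee = Debuggee()
temp_local = bytes(6)                     # newarr evaluated inside the interpreter

first_copy = marshal(debuggee, temp_local)
debuggee.store_in_byte_array(0x0225, first_copy, 0, 5)   # mutates the first copy only

second_copy = marshal(debuggee, temp_local)              # marshalled again from zeros
result = debuggee.format_as_string(second_copy, 0, 6)
print(result)
```

In this model the second call sees a fresh copy of the untouched interpreter value, so the result is all zero bytes even though the first copy in the process was modified.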

Looking at your example (with added comments):

.method public hidebysig static string '<>m0'()
{
<snip>
newarr uint8 // Value is initialized inside the interpreter
.locals init (uint8[] _TEMP_LOCAL_2)
stloc.2 // The value that exists only in the interpreter is assigned to a temporary local

<snip>
           // Before the func eval, parameter 2 (an array value) is marshalled into a
           // real array value in the process being debugged
call void 'StoreInByteArray'(int32, uint8[], int32, int32)
           // StoreInByteArray has modified a value in the marshalled array value.
ldloc.2
<snip>
           // Local 2 is still the value that exists only in the interpreter and hasn't
           // been modified by any interpreted code
           // Parameter 1 is marshalled again to a real array value in the process
           // being debugged.
call string 'FormatAsString'(uint8[], int32, int32) // This just has a sequence of null bytes
ret
}

This is something I want to fix at some point, but in the meantime, you can work around this by ensuring the value is initialized within the process being debugged and then assigning it to a variable the interpreter knows about. Pseudo-locals also force creation of array values in the process being debugged, to work around this same problem.

Hopefully that helps. Let me know if you have any other questions.

sae42 commented 5 years ago

Thanks for the explanation Patrick, that makes sense. I'll have a think about a workaround; I'd forgotten about pseudo-locals, which look like they may be the best bet.

plnelson commented 5 years ago

A problem with pseudo-variables that occurred to me later is that they are visible to the user. There's also no way to delete them once they are created. That leaves calling a function in the process to create the array. If your runtime has a function to create arrays, I would try using that.
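As a rough illustration of that last suggestion, here is a toy Python model (not the Concord API) in which the array is created by a function running in the process, so the temporary local holds a reference to a value that really exists there. The Debuggee class, the integer handles, create_byte_array, and the big-endian write are all hypothetical stand-ins.

```python
class Debuggee:
    """Toy stand-in for the process being debugged (not the Concord API)."""

    def __init__(self):
        self.heap = {}          # handle -> array that lives in the process
        self._next_handle = 1

    def create_byte_array(self, length):
        # Hypothetical runtime helper, invoked through a func-eval, that
        # allocates the array inside the process and returns a reference.
        handle = self._next_handle
        self._next_handle += 1
        self.heap[handle] = bytearray(length)
        return handle

    def store_in_byte_array(self, value, handle, offset, length):
        # Assumed behaviour: write `value` big-endian at `offset`.
        self.heap[handle][offset:offset + length] = value.to_bytes(length, "big")

    def format_as_string(self, handle, offset, length):
        # Assumed behaviour: hex dump of the requested range.
        return self.heap[handle][offset:offset + length].hex()


debuggee = Debuggee()

# The temporary local now refers to an array that exists in the process,
# so no copy has to be made when it is passed as a parameter.
temp_local = debuggee.create_byte_array(6)
debuggee.store_in_byte_array(0x0225, temp_local, 0, 5)
fixed_result = debuggee.format_as_string(temp_local, 0, 6)
print(fixed_result)
```

Because both calls receive the same in-process array, the second call observes the bytes written by the first.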