castleproject / Core

Castle Core, including Castle DynamicProxy, Logging Services and DictionaryAdapter
http://www.castleproject.org/
Other
2.22k stars 471 forks source link

TypeLoadException or BadImageFormatException under heavy multi-threaded proxy generation #193

Open bluerobotch opened 8 years ago

bluerobotch commented 8 years ago

We use Moq 4.5.10 with Castle.Core 3.3.3 and xUnit 2.1.0 for our unit tests. For xUnit we have the option activated to only have one app domain for all tests. We also enabled parallelization of test execution in xUnit.

Sporadically we get TypeLoadExceptions like the following example: System.TypeLoadException : Could not load type 'Castle.Proxies.Invocations.IExecutorAccessor_Prepare' from assembly 'DynamicProxyGenAssembly2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=a621a9e7e5c32e69'. at Castle.Proxies.IExecutorAccessorProxy.Prepare[TExecutionUnit](TExecutionUnit executionUnit) ....

We reviewed the source code of Moq to ensure there is no threading issue with the ProxyGenerator caching. But Moq looks fine.

When disabling xUnit test parallelization we don't get TypeLoadExceptions anymore. For this reason we think it is a race condition in Castle.Core or even in the .Net Framework.

As far as we see, the missing type is the type that is created for the invocation by InterfaceProxyWithoutTargetContributor.GetInvocationType. The TypeLoadException seems to be raised in the code that is emitted for the invocation in MethodWithInvocationGenerator.BuildProxiedMethodBody. The invocation type is built before it is used in the emitted code. Therefore we do not see a reason why the type cannot be loaded.

jonorossi commented 8 years ago

@bluerobotch since you are the only one with a repro and DynamicProxy has been used in a lot of multi-threaded web apps for many years this might not be too easy to track down. Could you pull out fuslogvw to see if you can get any more info on the fusion failure. Thanks.

bluerobotch commented 8 years ago

@jonorossi thanks for the quick reply. We already tried to get more information by enabling the fusion log. But unfortunately the fusion log does not contain any records concerning the above problem. We also enabled the castle core log, but it did not provide further information. It looks like it's a type loading issue and not a problem with locating the assembly. During investigation we had the TypeLoadException in one of two identical test were the first one failed and the second one succeeded (xunit theory). The first test (failing) created the proxy type and the invocation type. The second one (succeeding) took both types from the cache. We tried to reproduce our behavior by generating 1000 unit tests each containing a theory and using one of 1000 interfaces to be proxied by moq. Then we run these 1000 test in parallel, but we never got an TypeLoadException so far. Do you have any idea in which direction we should investigate further? Is there a possibility to get (payed) help from the castle project team for example by reviewing our code during a remote session?

jonorossi commented 8 years ago

@bluerobotch could you include the TypeLoadException's full a stack trace. Looking through the issue trackers of the 3 main mocking libraries I found https://github.com/moq/moq4/issues/246, does that issue sound about the same?

Schaeri commented 8 years ago

Hello @jonorossi

Thanks for the fast answer and support. I created the issue on moq. Yes it is the same problem described in moq/moq4#246. @bluerobotch and I work on the same project. Together we trying to find the root cause of the issue since february without any luck. Locking into the code of castle and moq.

You can use the stack trace of the issue moq/moq4#246. Its never the same test but as an example trance it should work for the TypeLoaderException.

jonorossi commented 8 years ago

@bluerobotch @Schaeri to narrow things down could you provide the exact version of Moq you are using so we are looking at a single version of Castle Core. Also have you got any other use of Castle DynamicProxy via other mocking libraries or something else?

jonorossi commented 8 years ago

@bluerobotch @Schaeri scratch most of that, I see you've included the Moq version in this issue. The question about other usage is still relevant.

Schaeri commented 8 years ago

Yes we have other usage of castle core inside our project. We need castle core for our WCF infrastructure and to extend our dependency inject framework. This infrastructure is also covered in some unit tests. Other frameworks and libraries we use doesn’t depend on castle core. Moq ist the only one.

For a test I eliminate our castle core dependency in our project and run our unit test again. The error currently not occurred. But I have to run the tests over night to see if this is really an issue.

What can cause this error when we also use castle core inside our project? Or what does we do wrong when using castle core?

jonorossi commented 8 years ago

@Schaeri at the moment I have no idea, just trying to guess things that might be different with your project which is somehow unique with no other reports of this problem.

There might be a defect in DynamicProxy when two ProxyGenerators are used at the same time and this is only surfacing because of how they are set up during unit tests.

Let me know the results of that overnight unit test run, and anything else your project is doing that you think might be out of the ordinary (e.g. mocking COM interfaces).

Schaeri commented 8 years ago

Unfortunately the error still occurs. Currently I have no clue what we do special that could provoke such an error. @bluerobotch and I will sit together and rethink the situation. Maybe we have an other idea what could be special about our project.

jonorossi commented 8 years ago

@Schaeri no worries, at least it narrows it down by exclusion. Unfortunately I'm pretty swamped at the moment otherwise I'd have accepted your request for a remote session. I assume since you have been looking at this since February you have a workaround, maybe running your tests without parallelisation?

Schaeri commented 8 years ago

No at the moment we haven’t a workaround. When we switched from xunit 1.x to 2.x we did a lot of performance optimizations to run our tests as fast as possible. Execute our tests synchronous means we have to adapt our whole project structure to reach the same build and integrations times.

But yesterday I was able to reproduce one of our issue we have with castle on a very simple example. The attached example contains 1001 interfaces and 1001 test class. Each test class mocks a single interface and setup the DoSomething method on it. Then it calls the method and make sure it returns the setup value. This is done in a theory with the value true and false. The Run.bat configure the xunit runner with the parameters we use in our project (run test classes in parallel) and repeat the execution of the test until they fail.

They will fail with a BadImageFormatException thrown inside castle.core (see screenshot). To reproduce the error, the test must run 1 to 8 hours. But it will happen every try. The Run.bat can be started more than one time to increase the chance to get the error faster.

We have to deal in most cases with the TypeLoaderException but the BadImageFormatException occurs also sometime. And I think both exception have the same origin. Maybe @jonorossi you find time to run the example and see the error which will occur. And thanks a lot for the support, we really appreciate it.

CastleProblem.zip badimageformatexception

jonorossi commented 8 years ago

@Schaeri great to hear you've been able to put together a repro. I've had it running now on my Windows VM for just over 8 hours and it hasn't failed. I'll run another few copies to try to push it to fail. What is your machine and build server set up, i.e. how many CPUs, are they virtualised.

jonorossi commented 8 years ago

@Schaeri could you also let me know what version of the .NET Framework you are running. Follow this page to find out exactly. My version is 4.6.1 for non-Win10 (394271).

jonorossi commented 8 years ago

@Schaeri just got a BadImageFormatException! Different message which definitely indicates something screwy.

System.BadImageFormatException : Could not load file or assembly 'DynamicProxyGenAssembly2, Version=0.0.0.0, Culture=neutral, PublicKeyToken=a621a9e7e5c32e69' or one of its dependencies. Index not found. (Exception from HRESULT: 0x80131124)

Schaeri commented 8 years ago

Currently we run our tests in a loop on our developer notebooks for analyze the issue and for increasing the chance for getting the error. Intel i7 with 4 cores with hyper threading. The OS is a windows 7 64 bit and the .net framework is 4.6.1 (394271). The errors on our developer notebooks and build servers seams the same. But our build agents running on VMware vSphere 5.5 on Intel Xeon CPUs. The framework is the same of our build agents.

I run the sample a lot and get every time the exception on the screenshot (at least 10 times). But on our build server I saw also the error System.BadImageFormatException : Could not load file or assembly. The problem seems to have different characteristic.

Schaeri commented 8 years ago

@jonorossi we saw that you have labeled our issue as bug. Can we somehow contribute to help resolving the bug? And again, we appreciate your help and fast response.

jonorossi commented 8 years ago

@Schaeri I haven't actually confirmed it is a bug in DP rather than the .NET runtime (i.e. still don't know what causes it), however I could reproduce it. The next week for me is going to be busy so I won't be able to look into it, we need to get to the bottom of the problem obviously before we can fix it.

ghost commented 7 years ago

Guys, just an update on what I found when I tried to parallelise the tests for castle core using win10-x64(host install) doing multi framework targeting.

The modulescope class was kicking out errors because of I/O collisions between threads when saving strong named assemblies. The reason is because it uses fixed file names.

I hacked this locally to do dynamic assembly naming locally. After that multi threaded proxy gens worked and tests started passing.

The trade off unfortunately, is that you cannot leverage friend assemblies easily that are strongly named by using internals visible to attributes.

Unless this has been solved, you might want to create a test that batters ModuleScope in a multi-threaded context calling SaveAssembly(true). My 2 cents anyway...

Schaeri commented 7 years ago

Hello @fir3pho3nixx. Can you attach a patch for your modulescope class fix. I would like to check if this would solve our problem we have. Thanks for the help.

ghost commented 7 years ago

Here is my monkey patch for this:

Edited/Removed for TL;DR

Schaeri commented 7 years ago

Thanks for the patch. I applied your changes and build castle.core and Moq with the modifications. I include the updated package into my previous posted CastleProblem project and tested the change now for over a week. Its sad but I still get the type loader exceptions from time to time. With and without modification the occurrence seems to be the same of the problem. Any other idea what can cause the problem? Thanks for the support.

ghost commented 7 years ago

Will download Castle Problem zip file and start digging around to see if I can replicate this.

Schaeri commented 7 years ago

Thanks for the support. The error is very rar. The test must run for several hours until it will occur. If you need any help let me know.

ghost commented 7 years ago

Just an update on this issue, I raised a new one here: https://github.com/castleproject/Core/issues/253

They could be related or not. Would like to check this out first before I come back to this. Will let you know what the outcome is.

Schaeri commented 7 years ago

Thanks for the support. Yes let me know what the outcome is. Or when you have a patch/pre-realease to do further testing.

ghost commented 7 years ago

@Schaeri - We just fixed #277, are you guys using any modopt/modreq modifiers?

jonorossi commented 7 years ago

We just fixed #277, are you guys using any modopt/modreq modifiers?

I don't think custom modifiers would cause this problem, that was a consistent bug that would occur even with just one proxy type, this painful error occurs under heavy load even with basic proxy types.

I think we need to try to isolate whether the problem is something we can actually find/fix, it is likely a problem in the CLR. Putting together a test case that you can run for hours using just the reflection emit BCL types would probably be the next best step.

Schaeri commented 7 years ago

No we don't use any modopt/modreq modifiers. I think the issue 277 is not related to our problem.

The error 307 seams to go into the same direction. Interesting to know if the people with Nunit run into this problem also on the .Net Framework 4.5.2 or later. If it has something to do with the CLR maybe the error would not occur on the dotnet core.

I could try to bring my initial attached sample project to the dotnet core on a linux machine and run it there for several hours, days and check if this causes also the problem. But I don't know if this helps you analyze the error.

ghost commented 7 years ago

We should elaborate on the 'hammer' test, and create a way of exporting crash memory dumps. This way we can analyse the call stack.

Are you willing to add something like this to you're test? https://stackoverflow.com/questions/1134048/generating-net-crash-dumps-automatically

I would have done it sooner but I have been helping out with the netcore release of Windsor :/

Schaeri commented 7 years ago

I tried to generate a dump when the application crash with the debug diagnostic tools. But I was not lucky. The crash of the application was not automatically detected from the tools and also when I configure the exception by hand I was never able to generate a dump. I don’t know why, I use these tools to rarely. Maybe I have to try it with ProcDump or ADPlus to get a better result. But for this weekend I have to postpone this task.

Maybe somebody else can try it also with the tools. I took my previous attached sample solution for reproducing the error.

What I have attached to this comment are two log files containing the BadImageFormatException collected by the debug diagnostic tools two times during test execution. Maybe this one is interesting. The stack trace should be included in the log file.

xunit.consolePID7120Date07_08_2017__Time_06_25_20PM311Log.txt xunit.consolePID12804Date07_08_2017__Time_08_06_55PM736Log.txt

stakx commented 7 years ago

Hi @Schaeri, I'm visiting from https://github.com/moq/moq4/issues/246. Just letting you know that I'm looking into this, too. I haven't yet managed to reproduce the error with your repro code, despite running it for many hours, but once I do, I'll post any details / memory dumps that I can grab here.

Schaeri commented 7 years ago

@stakx. I run the test on several machine on a windows 7 64-bit or windows 10 64-bit system with 4 cores. I start the test up to 8 times to let it fail faster. I always try to get 100% CPU usage when I run the tests. But it take a lot of time. Normally I run it over a weekend on my machine and it will fail for sure once or twice. But It take really a lot of time. I hope you are also to be able to reproduce the error... at least once.

On 18.07.2017 08:54, stakx wrote:

@Schaeri https://github.com/schaeri, I have been running your test code on another machine, that is, an 8-core 64-bit Windows 7 machine with .NET Framework 4.6.1 (394271). This matches your own setup. After more than 16 hours of running the test 4 times in parallel, the error still hasn't surfaced.

I have however modified your code slightly: Instead of referencing Moq and Castle.Core as NuGet packages, I have included their source in your solution and compiled everything myself (in the hope that if the error happens, I'd get more accurate stack traces). I'm compiling and running a Debug build.

I'll now repeat the test run in the Release configuration and see if that makes a difference.

stakx commented 7 years ago

@Schaeri, I actually managed to repro the error earlier today, 4 times all at once. Unfortunately, the setup with Run.bat and the xUnit.net test runner means that by the time you see the exception message on the console, the process has already terminated, making it impossible to capture a minidump.

It occurred to me that it might be helpful to rule out both Moq and xUnit.net as possible error sources.

These replacements aren't exact equivalents, but that doesn't matter. All that's important is whether it makes the errors go away in the time window that you were previously looking at. Getting rid of xUnit.net and/or Moq will likely make your program more efficient, so if the error is still there (in the framework), it should happen sooner, not later.

I'm now running a modified version of your CastleProblem solution that has no xUnit.net (but still uses Moq). It's a console application with an infinite loop that creates batches of 1,000 tasks each using Parallel.For. The CPU is at 100% constantly. I'll let you know about the result as soon as I get back to the office.

PS: I also witnessed a single chance ExecutionEngineException which was very likely caused by the xUnit.net test runner; that's why I have become a bit suspicious of it. Could be that both Moq and Castle.Core work just fine, but there must be a lot of threading / appdomain-related stuff in the test runner.

Schaeri commented 7 years ago

Your absolutely right to try to get rid of xunit and moq for the test. There is for sure a lot of code which could have a influence that this error can occur.

I already tried a year ago to run tests in a parallel for without xunit. We were never lucky to run into the problem. I think our many problem was that the application startup only runs once. Our guess was that the error occurs only when the dynamic assembly was generated. This was also the main reason for the bat file which run the tests in a loop. To run as many startups as possible. And do this on several cores the same time.

When the error not occurs in the parallel for test. My fist adaptation of the program would be to do the loop also outside to get more application startups and generation of dynamic assembly. And decrease the loop inside to make the test run as fast as possible.

I will also invest some time this weekend with my sample program. I will start removing moq, then xunit and see when the error will no longer occur.

Thanks for the help. I am feeling no longer so alone with this problem.

stakx commented 7 years ago

Thanks for the additional pointers, I'll take that into account. Perhaps I'll start by throwing away my Parallel.For experiment, restart with your code, and take Moq out of the equation, just to have one fewer moving part. Should be fairly easy. Next, I'll try to get rid of the xUnit.net test runner, perhaps with an infinite-loop console app that makes use of AppDomains (even though you disabled them with the test runner!).

BrunoJuchli commented 7 years ago

@jonorossi @jonorossi I work with @Schaeri - I'm new on his team - and today I ran some more tests. First, I ran the original repro from @Schaeri: running the batch 12x concurrently for 30 minutes, 5 ended in a failure.

Next I created a second project - same source code but with updated nuget dependencies. Running 16x batches concurrently for 4 hours, only one died - this time the console reported an ExecutionEngineException. I've configured WIndows Error Reporting to create a (full) dump for xunit.console.exe, and this reports the following exception:

Unhandled exception at 0x00007FFAE2D3899C (clr.dll) in xunit.console.exe.9116.dmp: 0xC0000005: Access violation reading location 0x0000000000000010.

[Managed to Native Transition] Moq.dll!Moq.Mock.Setup(System.Linq.Expressions.Expression<System.Func<CastleProblem.IClass755,bool>> expression) ReproVS2017.dll!CastleProblem.Test755.DoSomething(bool value) Line 20

Now, of course, I don't know whether this is actually caused by the same issue...

I'll upload the "new" repro code and full dump ASAP:

stakx commented 7 years ago

@BrunoJuchli: One thing I noticed is that Moq versions newer than the one originally used here have significantly worse overall performance. (I'm looking into ways to reverse this trend, but that's a different story.) This could explain, at least in part, why you're seeing a much lower error rate after an upgrade to a later NuGet package versions... it's likely Moq spends much more time compiling LINQ expression trees rather than calling into DynamicProxy.

I've also run several different experiments but almost given up on it. I think it would be helpful to compile the repro code against Moq's and Castle Core's source (full debug symbols), then try to grab a minidump when a crash happens. Problem is, due to executing the process from a batch file, by the time you see the error message, the process is already gone. So one would possibly have to look more closely at what the xUnit test runner is doing in detail.

I haven't gone that far in my experiments. What I tried instead is to take the xUnit test runner completely out of the picture and have an infinite restart loop right in the repro code solution (with & without creating a new app-domain for each iteration) but then the problem was no longer reproducible.

I've also taken Moq out of the picture, that also made the problem go away. So we're in an unfortunate situation where the problem is only easily reproducible when three components (xUnit, DynamicProxy, and Moq) interact.

So in conclusion I think what's really needed is a minidump, perhaps it would be possible to spawn the xUnit test runner from a parent process that automatically captures a minidump if the xUnit child process terminates abnormally.

BrunoJuchli commented 7 years ago

@stakx I've got a dump. Just had some problems uploading it - the file is 60+ MiB. But here it is:

https://drive.google.com/open?id=0Bw_nvzYyrfFSeHJLaDZfSlRCclk


One can configure WER to create the dump on app crash: https://msdn.microsoft.com/en-us/library/windows/desktop/bb787181(v=vs.85).aspx

(you need a "full" dump with heap memory to be able to look into .net objects).

stakx commented 7 years ago

@BrunoJuchli: Great to hear! I also managed to get some dumps but mine weren't helpful at all. I didn't try the WER thingy back then because I was under the impression that it didn't actually support .NET processes (the documentation mentions something in that direction).

Looking forward to those 60 MiB. :+1

BrunoJuchli commented 7 years ago

@stakx I've uploaded the dump (see here).

I let the tests run overnight. 8 "new references", 8 "old references". All of the "new" ones survived, no issues. However, 8 out of 8 of the old ones, died. All of them with the BadImageFormatException. None of the captured by WER. I was being stupid, anyway. The exception get's "handled" by the xunit runner which reports it as a test result. So the runner doesn't actually crash - nothing for WER to do. So i'll have a look into getting a dump at that time - should be easy with Debugger.Launch().

BrunoJuchli commented 7 years ago

Re the "old" assemblies, i've got a dump, but there's nothing new that I personally can see. with these lines of code:

this.aClass = new Mock<IClass55>();
this.aClass.Setup(x => x.DoSomething()).Returns(value);
Assert.Equal(this.aClass.Object.DoSomething(), value);

It is the last line which produces a failure - the Setup works, but actually executing this.aClass.Object.DoSomething() (sporadically) fails.

The BadImageFormatException has a message "The signature is incorrect." and it says Source = "DynamicProxyGenAssembly2".

I can't seem to find any more useful information - I'm happy to provide the dump if someone else wants to have a look at it.

My take is, that it sporadically there's a hick-up with code generation. For further investigation, since the exception is not very helpful, we should get the "DynamicProxyGenAssembly2" dll - contains the generated code, right? Any pointers on accessing it? Does it exist on the disk? Where is it written to?...

Thanks.

stakx commented 7 years ago

@BrunoJuchli - re:

It is the last line which produces a failure - the Setup works, but actually executing this.aClass.Object.DoSomething() (sporadically) fails.

That's because the first call to a mock's .Object property will trigger the proxy generation, i.e. the call to Castle DynamicProxy's ProxyGenerator.CreateClassProxy. (Pretty much every other Moq call stays inside Moq, i.e. configures / queries Moq's own proxy interceptor objects.)

For further investigation, since the exception is not very helpful, we should get the "DynamicProxyGenAssembly2" dll - contains the generated code, right?

I'd be happy to provide a debug-only version of Moq that tries to catch a BadImageFormatException (or whatever other exceptions might get thrown) and tells DynamicProxy's ProxyGenerator to save its dynamic proxy DLL. I just haven't checked yet whether both of these things are in fact possible. (Some exceptions cannot be caught, and perhaps the framework will resist attempts to save a truly corrupt DLL to disk.) I could possibly look into this sometime later today.

BrunoJuchli commented 7 years ago

@stakx As I'm not sure whether the failure I got with the new versions of castle.core is the same I'm not sure whether it's worth it to invest more time (yours and mine) into this potentially old bug. If, however, we find that in the new version the problem still remains we could potentially use the old version to pinpoint the bug (as with the old version the problem occurs much more often).

At the moment I would suggest to leave it be. At the moment we believe the issue has been fixed, hence we'll be looking into parallelizing the tests again. If we would see that there's still sporadic failures we would re-start investigation into this issue.

Do you concur that this is a good approach? If you strongly believe the issue still remains but has been masked by new factors then I'd be willing to dedicate some of my non-work to help finding the issue.


Nonetheless, if this is ever going to be further analyzed: Do I understand correctly, that the generated assembly remains in memory and is loaded from there? If so, as we've got a full memory dump, It'd potentially be possible to extract the "object" to disk. Might result in missing headers or similarly, though.

If that's not feasible, I believe I'm probably capable of creating my own debug-version of Moq which writes the DLL to disk. The 'BadImageFormatException' can be cought. Once on disk, PEVerify might be used to verify the assembly. Maybe also try out the new ILVerify (not done yet).

Btw., if the framework resists to save a corrupt DLL to disk, then the exception provided might provide more insight :-)

stakx commented 7 years ago

If, however, we find that in the new version the problem still remains [...]

The originally reported error can still be reproduced with the latest versions of Moq and DynamicProxy. (I haven't uploaded my own test code since the NuGet update is trivial, but I have verified it.)

[We] could potentially use the old version to pinpoint the bug (as with the old version the problem occurs much more often).

Agreed.

I'm not sure whether it's worth it to invest more time (yours and mine) into this potentially old bug. [...] At the moment I would suggest to leave it be.

Agreed. I am intrigued by puzzles such as this issue, but the potential payoff is probably too small compared to the work we're putting into it.

If you do stop investigating this issue because you've got your own test code to work reliably, then I'd suggest that both this issue here and the one over at the Moq repo be closed... at least for the time being. Noone else has claimed to have the same problem. The issues can always be reopened if needed.

I believe I'm probably capable of creating my own debug-version of Moq which writes the DLL to disk. The 'BadImageFormatException' can be cought.

... unless the JITter has done some unfortunate inlining. Probably safer to recompile everything in a non-optimized debug configuration.

Some quick pointers if you do want to give it a go:

jonorossi commented 7 years ago

@stakx @BrunoJuchli it's great to see some more work on this issue, I don't think I'll be much help on this problem but let me know if there is anything I can assist with.

Yes, you can use ModuleScope.SaveAssembly.

BrunoJuchli commented 7 years ago

I'm currently running the tests. 16 starts for 30+ minutes did not lead to a single failure. Before they were happening more often.

I guess the file should be written to the "current directory" / executing directory, right?

We'll see what happens when 10 fail and all of them try to write the same file :D

IFluentInterface was missing from the source code. As I just downloaded the Zip file no PR. Here the missing file, though:

namespace Moq
{
    /// <summary>
    /// 
    /// </summary>
    [EditorBrowsable(EditorBrowsableState.Never)]
    public interface IFluentInterface
    {
        /// <summary>
        /// 
        /// </summary>
        /// <returns></returns>
        [EditorBrowsable(EditorBrowsableState.Never)]
        Type GetType();
        /// <summary>
        /// 
        /// </summary>
        /// <returns></returns>
        [EditorBrowsable(EditorBrowsableState.Never)]
        int GetHashCode();
        /// <summary>
        /// 
        /// </summary>
        /// <returns></returns>
        [EditorBrowsable(EditorBrowsableState.Never)]
        string ToString();
        /// <summary>
        /// 
        /// </summary>
        /// <param name="obj"></param>
        /// <returns></returns>
        [EditorBrowsable(EditorBrowsableState.Never)]
        bool Equals(object obj);
    }
}
BrunoJuchli commented 7 years ago

I ran the modified repro concurrently with 16x runs for for 16 hours.

Sidenote: Of those 7, when terminating with "CTRL+C", 2 of those failed to stop the xunit console runner. I haven't had time to analyze the results, yet.

  System.InvalidOperationException : Unable to alter assembly information.
  Stack Trace:
       at System.Reflection.Emit.AssemblyBuilderData.AddPublicComType(Type type)
       at System.Reflection.Emit.TypeBuilder.Init(String fullname, TypeAttributes attr, Type parent, Type[] interfaces, ModuleBuilder module, PackingSize iPackingSize, Int32 iTypeSize, TypeBuilder enclosingType)
       at System.Reflection.Emit.ModuleBuilder.DefineType(String name, TypeAttributes attr)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\ModuleScope.cs(569,0): at Castle.DynamicProxy.ModuleScope.DefineType(Boolean inSignedModulePreferably, String name, TypeAttributes flags)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\Emitters\ClassEmitter.cs(97,0): at Castle.DynamicProxy.Generators.Emitters.ClassEmitter.CreateTypeBuilder(ModuleScope modulescope, String name, Type baseType, IEnumerable`1 interfaces, TypeAttributes flags, Boolean forceUnsigned)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\Emitters\ClassEmitter.cs(43,0): at Castle.DynamicProxy.Generators.Emitters.ClassEmitter..ctor(ModuleScope modulescope, String name, Type baseType, IEnumerable`1 interfaces, TypeAttributes flags, Boolean forceUnsigned)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\Emitters\ClassEmitter.cs(30,0): at Castle.DynamicProxy.Generators.Emitters.ClassEmitter..ctor(ModuleScope modulescope, String name, Type baseType, IEnumerable`1 interfaces)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\BaseProxyGenerator.cs(129,0): at Castle.DynamicProxy.Generators.BaseProxyGenerator.BuildClassEmitter(String typeName, Type parentType, IEnumerable`1 interfaces)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\InterfaceProxyWithTargetGenerator.cs(246,0): at Castle.DynamicProxy.Generators.InterfaceProxyWithTargetGenerator.Init(String typeName, ClassEmitter& emitter, Type proxyTargetType, FieldReference& interceptorsField, IEnumerable`1 interfaces)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\InterfaceProxyWithoutTargetGenerator.cs(67,0): at Castle.DynamicProxy.Generators.InterfaceProxyWithoutTargetGenerator.GenerateType(String typeName, Type proxyTargetType, Type[] interfaces, INamingScope namingScope)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\InterfaceProxyWithTargetGenerator.cs(64,0): at Castle.DynamicProxy.Generators.InterfaceProxyWithTargetGenerator.<>c__DisplayClass6_0.<GenerateCode>b__0(String n, INamingScope s)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\BaseProxyGenerator.cs(418,0): at Castle.DynamicProxy.Generators.BaseProxyGenerator.ObtainProxyType(CacheKey cacheKey, Func`3 factory)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\Generators\InterfaceProxyWithTargetGenerator.cs(64,0): at Castle.DynamicProxy.Generators.InterfaceProxyWithTargetGenerator.GenerateCode(Type proxyTargetType, Type[] interfaces, ProxyGenerationOptions options)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\DefaultProxyBuilder.cs(112,0): at Castle.DynamicProxy.DefaultProxyBuilder.CreateInterfaceProxyTypeWithoutTarget(Type interfaceToProxy, Type[] additionalInterfacesToProxy, ProxyGenerationOptions options)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\ProxyGenerator.cs(1593,0): at Castle.DynamicProxy.ProxyGenerator.CreateInterfaceProxyTypeWithoutTarget(Type interfaceToProxy, Type[] additionalInterfacesToProxy, ProxyGenerationOptions options)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\ProxyGenerator.cs(871,0): at Castle.DynamicProxy.ProxyGenerator.CreateInterfaceProxyWithoutTarget(Type interfaceToProxy, Type[] additionalInterfacesToProxy, ProxyGenerationOptions options, IInterceptor[] interceptors)
    Moq-4.5.10\Source\Proxy\CastleProxyFactory.cs(85,0): at Moq.Proxy.CastleProxyFactory.CreateProxy(Type mockType, ICallInterceptor interceptor, Type[] interfaces, Object[] arguments)
    Moq-4.5.10\Source\Mock.Generic.cs(210,0): at Moq.Mock`1.<InitializeInstance>b__24_0()
    Moq-4.5.10\Source\PexProtector.cs(56,0): at Moq.PexProtector.Invoke(Action action)
    Moq-4.5.10\Source\Mock.Generic.cs(188,0): at Moq.Mock`1.InitializeInstance()
    Moq-4.5.10\Source\Mock.Generic.cs(242,0): at Moq.Mock`1.OnGetObject()
    Moq-4.5.10\Source\Mock.cs(179,0): at Moq.Mock.GetObject()
    Moq-4.5.10\Source\Mock.cs(174,0): at Moq.Mock.get_Object()
    Moq-4.5.10\Source\Mock.Generic.cs(169,0): at Moq.Mock`1.get_Object()
    CastleProblem\Generated\Test245.cs(22,0): at CastleProblem.Test245.DoSomething(Boolean value)

  System.NotSupportedException : Type 'Castle.Proxies.IClass330Proxy' was not completed.
  Stack Trace:
       at System.Reflection.Emit.ModuleBuilder.PreSave(String fileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine)
       at System.Reflection.Emit.AssemblyBuilder.SaveNoLock(String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine)
       at System.Reflection.Emit.AssemblyBuilder.Save(String assemblyFileName, PortableExecutableKinds portableExecutableKind, ImageFileMachine imageFileMachine)
    Castle.Core-3.3.3\src\Castle.Core\DynamicProxy\ModuleScope.cs(497,0): at Castle.DynamicProxy.ModuleScope.SaveAssembly(Boolean strongNamed)
    Moq-4.5.10\Source\Proxy\CastleProxyFactory.cs(122,0): at Moq.Proxy.CastleProxyFactory.<CreateProxy>g__SaveDynamicProxyAssemblyToDisk2_0()
    Moq-4.5.10\Source\Proxy\CastleProxyFactory.cs(107,0): at Moq.Proxy.CastleProxyFactory.CreateProxy(Type mockType, ICallInterceptor interceptor, Type[] interfaces, Object[] arguments)
    Moq-4.5.10\Source\Mock.Generic.cs(210,0): at Moq.Mock`1.<InitializeInstance>b__24_0()
    Moq-4.5.10\Source\PexProtector.cs(56,0): at Moq.PexProtector.Invoke(Action action)
    Moq-4.5.10\Source\Mock.Generic.cs(188,0): at Moq.Mock`1.InitializeInstance()
    Moq-4.5.10\Source\Mock.Generic.cs(242,0): at Moq.Mock`1.OnGetObject()
    Moq-4.5.10\Source\Mock.cs(179,0): at Moq.Mock.GetObject()
    Moq-4.5.10\Source\Mock.cs(174,0): at Moq.Mock.get_Object()
    Moq-4.5.10\Source\Mock.Generic.cs(169,0): at Moq.Mock`1.get_Object()
    CastleProblem\Generated\Test170.cs(22,0): at CastleProblem.Test170.DoSomething(Boolean value)
stakx commented 7 years ago

1 Runner process "died" Get dump here

Seems that this time around, the TargetInvocationException was triggered by the xUnit.NET test runner, and not by Moq or DynamicProxy2. Execution didn't actually get that far.

Though only 1 wrote a DLL [...]

Great to hear it worked at least this far, though it's a pity that it wasn't possible to also get a minidump for the same process. Also, it seems my test code has some shortcomings. I'll get back to that below.

The written DLL looks OK at first glance, peverify has only one (repeated) complaint: The only issue with it is that it references some strong-named DynamicProxyGenAssembly2 DLL (which the test code doesn't write to disk). Not sure what is going wrong here.


@jonorossi, perhaps you could shed some light on this: When saving a generated proxy DLL to disk using ModuleScope.SaveAssembly, does my test code (here and here) need to save both the weak-named and the strong-named DLL? Why does the weak-named DLL that I'm saving reference a strong-named DynamicProxyGenAssembly2 DLL?


Further evidence that the written DLL is fine is that I can reference it in another test code solution in VS (but of course at run-time execution fails with a FileNotFoundException due to the missing DLL reference issue).

I'm currently running the tests. 16 starts for 30+ minutes did not lead to a single failure. Before they were happening more often.

I was wondering if that's maybe because we're executing code that has been compiled in debug mode, which might disable some JIT optimizations. I'll give it a go, too, today.

Though only 1 wrote a DLL, even though the PID seems to be attached to the file name - meaning there should not be multiple processes writing the same file.

Yes, I figured that adding the PID to the file name would make it far less likely that one test run overrides another's written DLL. It's not completely bullet-proof, but probably good enough for now.

1 Test failed with InvalidOperationException: Unable to alter assembly information. (see below) 1 Test failed with: NotSupportedException: Type Çastle.Proxies.IClass330Proxy' was not completed. (see below)

Yeah, when developing the test code, I noticed that such exceptions get thrown when DynamicProxy generates more proxy types after the DLL has already been written to disk... which is likely to happen, since several test executions run in parallel. (I didn't consider that, this needs improvement in the test code.)

jonorossi commented 7 years ago

When saving a generated proxy DLL to disk using ModuleScope.SaveAssembly, does my test code (here and here) need to save both the weak-named and the strong-named DLL?

Generally no, the proxy types will either get generated into the strong or weak named dynamic assemblies based on whether the types are in a strong named assembly. If you get rid of any strong naming and set disableSignedModule to true you'll be guaranteed that all types will be generated in the weak named assembly.

Why does the weak-named DLL that I'm saving reference a strong-named DynamicProxyGenAssembly2 DLL?

No idea, that seems odd. Do you have a link to one of those assemblies I can take a look at?

System.InvalidOperationException : Unable to alter assembly information.

Basically nothing online for this error, except that it a few people were getting it with debug builds of DynamicProxy a decade ago. Looking at the reference source this is thrown if the assembly has already been saved, saving seems to make it readonly. This is likely caused by your test code.

System.NotSupportedException : Type 'Castle.Proxies.IClass330Proxy' was not completed.

Nothing much online, but the reference source throws "NotSupported_NotAllTypesAreBaked" with the comment "cannot save to PE file without creating all of the types first", seems to indicate you are trying to save the assembly while in the middle of creating a new type in that assembly.

Sounds like some threading problems in your test code as @stakx mentioned in the last paragraph of last comment.

Are you guys using the newest .NET Framework 4.7? Just wondering if there might have been a JIT fix that might resolve the TypeLoadException/BadImageFormatException.

.NET's ModuleBuilder, TypeBuilder, etc classes have no guarantee of thread safety of instance methods like most of the BCL. Skimming the DP generators and emitters it appears that BaseProxyGenerator.ObtainProxyType is properly preventing problems emitting the same or multiple types concurrently. However ModuleScope has two locks, one just used for saving the assembly so looks like it incorrectly allows you to save an assembly while in the middle of generating a type, not really a problem for most applications.

Has anyone tried reproducing this on .NET Core? I'm asking because it appears the TypeLoadException/BadImageFormatExceptions are being thrown when the types are getting used not during type generation, which would be much easier to debug if we could debug the runtime and work out why it thinks there is a problem with the type.

stakx commented 7 years ago

@jonorossi:

Why does the weak-named DLL that I'm saving reference a strong-named DynamicProxyGenAssembly2 DLL? No idea, that seems odd. Do you have a link to one of those assemblies I can take a look at?

You can take a look at the DLL referenced in https://github.com/castleproject/Core/issues/193#issuecomment-329695743. AFAIK it's been generated by the test code mentioned (and linked to) in https://github.com/castleproject/Core/issues/193#issuecomment-329760632.

System.InvalidOperationException : Unable to alter assembly information. System.NotSupportedException : Type 'Castle.Proxies.IClass330Proxy' was not completed.

I wouldn't worry too much about these, my test code definitely doesn't explicitly deal with several tests running in parallel, and several of them generating more dynamic types when a failed test has already written the dynamic assembly to disk.

.NET's ModuleBuilder, TypeBuilder, etc classes have no guarantee of thread safety of instance methods like most of the BCL.

... and I guess the only way for DynamicProxy (and all that's built on top of it) to make any stronger thread safety guarantees than System.Reflection.Emit's is to have exclusive locks around all accesses to ModuleBuilder, TypeBuilder, etc. (I suppose BaseProxyGenerator or ModuleScope already do this?)

It probably wouldn't hurt to put an exclusive lock around the proxy generation call in Moq and see if that has any effect.

Has anyone tried reproducing this on .NET Core?

I don't think any of us have. But it's a good idea if we really want to solve this, I'm pretty sure we'll end up having to look into framework code to find out what exactly is going on here. I'm still hoping though that it won't be necessary. :)