Closed veikkoeeva closed 9 years ago
:+1: Great investigative work! I had suspected that exact line, but was unable to find the fix - well done! This should fix a bunch of gotchas.
@ReubenBond You might get a temporary fix by using AppDomain.AssemblyResolve and upon hitting the handler, use ApplyPolicy
and then return the new, redirected assembly. I'm not sure if this works, but it's worth a try. Also, if you happen to know how to test this particular fix, feel free to do a PR. It may take a day or two I'm back. I used my programming quota, kids are complaining they are hungry etc. :)
A link to Codeplex discussion about this.
Some quick further debugging on this. Putting AppDomain.AssemblyResolve
handler to one's own code doesn't get called. The reason is probably that it's already handler in AssemblyLoader line 141. As can be seen, the calls end up in CachedReflectionOnlyTypeResolver.
I did a quick try with the following (awful code, misleading printing, buggy etc., so take it with these warnings) and I could see there were assemblies with their old versions and they'd hit a breakpoint in the first return Assembly.LoadFrom(afterPolicy);
, but loading the assembly fails with a message there's already one loaded with the same name. I didn't investigate further, but the exception messages were about conflicting assemblies (an assembly with the same name but different version had been loaded already?).
I don't know the exact purpose with these classes and so I don't have a clear clue on where and how should one do the policy resolution. Perhaps @gabikliot, @jthelin or @sergeybykov could weigh in here. Or someone doing some brute force debugging. :) In any event, it looks like assembly redirections won't work before this is fixed (with a note to Nuget dependency definitions).
public static Assembly OnReflectionOnlyAssemblyResolve(object sender, ResolveEventArgs args)
{
// loading into the reflection-only context doesn't resolve dependencies automatically.
// we're faced with the choice of ignoring arguably false dependency-missing exceptions or
// loading the dependent assemblies into the reflection-only context.
//
// i opted to load dependencies (by implementing OnReflectionOnlyAssemblyResolve)
// because it's an opportunity to quickly identify assemblies that wouldn't load under
// normal circumstances.
var appDomain = (AppDomain)sender;
var afterPolicy = appDomain.ApplyPolicy(args.Name);
try
{
if(afterPolicy != args.Name)
{
return Assembly.LoadFrom(afterPolicy);
}
return Assembly.ReflectionOnlyLoad(args.Name);
}
catch (FileNotFoundException)
{
if (logger.IsVerbose2)
{
logger.Verbose2(FormatReflectionOnlyAssemblyResolveFailureMessage(sender, args));
}
var dirName = Path.GetDirectoryName(args.RequestingAssembly.Location);
var assemblyName = new AssemblyName(afterPolicy != args.Name ? afterPolicy : args.Name);
var fileName = string.Format("{0}.dll", assemblyName.Name);
var pathName = Path.Combine(dirName, fileName);
if (logger.IsVerbose2)
{
logger.Verbose2("failed to find assembly {0} in {1}; searching for {2} instead.",
assemblyName.FullName, dirName, pathName);
}
try
{
var afterPolicy2 = appDomain.ApplyPolicy(pathName);
if(afterPolicy2 != pathName)
{
return Assembly.LoadFrom(pathName);
}
return Assembly.ReflectionOnlyLoadFrom(pathName);
}
catch (FileNotFoundException)
{
if(logger.IsVerbose2)
{
logger.Verbose(FormatReflectionOnlyAssemblyResolveFailureMessage(sender, args));
}
throw;
}
}
}
Ah, as a tangential note, it feels this reflection code could do some refactoring provided its purpose is made more clear. <edit 1: With refactoring I was thinking something like
.AppyPolicy
, processing could be done, say, like in style of "file processing in LINQ", maybe even in parallel, caching and all applied.AppDomain
code could be simpler, perhaps there are even libraries for itApplyPolicy
should be tried in AssemblyLoader
as a first thing, caching moved to there and CachedReflectionOnlyTypeResolver
removed?).@veikkoeeva Great in-depth analysis. Thank you! I hope your kids didn't starve too badly. :-)
These are some of the murkiest code paths with assembly resolution not always working the way you expect. It doesn't help that Assembly.LoadFrom() has been recommended against for quite some time. We'll dig deeper into this.
No kids harmed in the process of debugging, it didn't take that a long time. :) As for general interest and reference Best Practices for Assembly Loading [MSDN].
The AssemblyLoader is also used for finding / loading the different types of provider plugins and not just grain classes, which may help to explain why it is more general-purpose than it might need to be.
When debugging these types of type loading problems, i have always found it important to check not just when .NET thinks is the "Name" of the assembly, but also the "Location" [aka Java "codebase"] that it is coming from. http://stackoverflow.com/questions/3623358/two-types-not-equal-that-should-be/3624044#3624044
Assembly.LoadFrom() could quite possibly be the most evil function in the whole .NET Framework! :(
@jthelin I suspect the reason for redirects to fail is that LoadFrom
loads the already loaded assembly even if a redirected one is provided. That is, it has already loaded one without applying any policies and later if the same assembly comes along elsewhere by some method (e.g. by applying a policy), an exception will be thrown that the assembly has been loaded already.
I saw this behaviour documented in MSDN, and various blogs. I see from the documentation that LoadFrom
should respect policies, so perhaps the fix is just call ApplyPolicy
as the first thing before doing the first LoadFrom
. If my memory serves me correctly, I remember seeing also loading to reflection only context, which has a different story. Junfeng's blog on 2004 has many posts regarding assembly loading, mentioning also AssemblyResolve
(using the provided Bing search engine to the blog may be useful).
From #200.
I found a related bug in the AssemblyLoader in reflection-only inspection. Assembly.GetTypes() throws if dependencies aren't already loaded. I suspect it is blocking some of these scenarios, and in a non-deterministic way because it depends on the order of loading. I'll have a PR with a fix for it.
I am still confused with assembly binding redirects. As I mentioned in #197, I see redirects working if they are in OrleansHost.exe.config in the local case. What's the gap here?
@sergeybykov It's good to hear you found something, especially since it looks like being a Heisenbug.
I didn't the mention of redirects working. Perhaps it was in other thread? This feels like a valuable piece of information, but how do I use it? I just tried to deploy OrleansHost.exe.config
to Azure Compute Emulator and I see OrleansHost.exe
, OrleansHost.exe.config
and, say, Microsoft.Data.OData
version 5.6.3 there and when the silo is starting, I get an exception Microsoft.Data.OData
or one of its dependencies could not be loaded due to it being either missing or not corresponding to the defined reference (*). In the exe.config
I have a redirect defined as follows
<dependentAssembly>
<assemblyIdentity name="Microsoft.Data.OData" publicKeyToken="31bf3856ad364e35" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
</dependentAssembly>
(*)
WaWorkerHost.exe Error: 0 : [2015-03-16 19:52:47.670 GMT 6 ERROR 100833 OrleansSiloInstanceManager ] !!!!!!!!!! INTERNAL FAILURE! About to crash! Fail message is: Exception trying to create or connect to the Azure table: Microsoft.WindowsAzure.Storage.StorageException: Tiedostoa tai kokoonpanoa Microsoft.Data.OData, Version=5.6.2.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35 tai jotakin sen riippuvuutta ei voi ladata. Löydetty kokoonpanon luettelomääritys ei vastaa kokoonpanon viittausta. (HRESULT-poikkeus: 0x80131040) ---> System.IO.FileLoadException: Tiedostoa tai kokoonpanoa Microsoft.Data.OData, Version=5.6.2.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35 tai jotakin sen riippuvuutta ei voi ladata. Löydetty kokoonpanon luettelomääritys ei vastaa kokoonpanon viittausta. (HRESULT-poikkeus: 0x80131040)
It's starting to be late, so perhaps I've glared over something obvious. When debuggin this, admitedly just a short while, I did see that also the newer versions were tried, but at that point the older ones were already loaded an the loader library threw an exception that the same assembly couldn't be loaded to the same context.
@veikkoeeva Let me put together the reflection-only fix. I have a crude one. I want to make it less ugly. Let's see if this error will go away with it. Or not.
@sergeybykov It will fix some problems, that's for sure, so it makes world a bit better place. :) I forgot to mention I'm using the Nuget packages currently if that factors into this. Also, I wonder, how big a task would it be to redo the loader so as to make it to a set of Funcs
that get a list of assemblies with some rules and then provide that list to a separate loader? I've understood Orleankka works like this. That feels like teh solution, though requires probably more planning.
From #200
I am still confused with assembly binding redirects. As I mentioned in #197, I see redirects working if they are in OrleansHost.exe.config in the local case. What's the gap here?
I've tested this with (95375c8). I've built the SDK from source and installed it. It works (in my scenario, at least), both in the Azure emulator and when deployed to Azure. I have not tested this using the Orleans nuget packages.
Context:
Orleans.dll
, OrleansRuntime.dll
and OrleansAzureUtils.dll
app.config
file (which, at deployment time gets renamed to `Microsoft.WindowsAzure.Storage
4.3, etc.). The following binding redirect section is what works in my the above scenario:
<assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
<dependentAssembly>
<assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="6.0.0.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="Microsoft.Data.Edm" publicKeyToken="31bf3856ad364e35" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="Microsoft.Data.Services.Client" publicKeyToken="31bf3856ad364e35" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="System.Spatial" publicKeyToken="31bf3856ad364e35" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="Microsoft.Data.OData" publicKeyToken="31bf3856ad364e35" culture="neutral" />
<bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="Microsoft.WindowsAzure.Storage" culture="neutral" publicKeyToken="31bf3856ad364e35" />
<bindingRedirect oldVersion="0.0.0.0-4.3.0.0" newVersion="4.3.0.0" />
</dependentAssembly>
<dependentAssembly>
<assemblyIdentity name="Microsoft.WindowsAzure.ServiceRuntime" culture="neutral" publicKeyToken="31bf3856ad364e35" />
<bindingRedirect oldVersion="0.0.0.0-2.5.0.0" newVersion="2.5.0.0" />
</dependentAssembly>
@sergeybykov I am not sure why this has not worked for me before - I'm glad it works now and I think it means case 1) in #200 works as intended.
I am sure I was able to reproduce the binding redirect issue but I was testing the case where the grain assembly and its dependencies were in a subfolder. https://github.com/dotnet/orleans/pull/236 fixed it for me. In the process of investigating this I also found a strange issue with dependency resolution in the reflection-only context. I included a workaround for it in https://github.com/dotnet/orleans/pull/236.
@sergeybykov, @ca-ta Ah, I didn't have a build from the latest source, but the latest Nuget packages. Perhaps that the issue for me. If it works for @ca-ta, it works for me, I believe. <edit: would have had use for a fresh brain yesterday, after all. :) Thanks @sergeybykov and @ca-ta.
Now that I've downloaded the latest binaries from Nuget, I see I bumped the same issue @ReubenBond mentioned here: Orleans does not respect binding redirects. I should have paid more attention to that, sorry for being slow with this. :)
I see AssemblyLoader makes use of Assembly.ReflectionOnlyLoadFrom at line 277 and I suspect this is the reason (without actually debugging). According to the documentation this does not respect binding redirects. It would look like one needs to use AppDomain.ApplyPolicy to get this behavior. The fix would be simply following:
Currently the Nuget packages are marked to indicate either the packages Orleans references or newer are accepted, so this looks like a rather nasty bug. There are two solutions: 1) Fix Nuget specification so that only exact versions are allowed. 2) Allow applying policies.
<edit: I see there's also Assembly.LoadFile used in
AssemblyLoader
. I'm not sure if it respects redirects, but probably no harm would be done if policies would be applied.<edit 2: All right, this little fix may not work. I hunt down one prominent example: XRE Runtime issue #563 ([line 178](https://github.com/aspnet/XRE/blob/bb3cdbdb67ca04366fcb9a04e56d0dc05752debb/src/klr.hosting.shared/RuntimeBootstrapper.cs#L178)) And discussion touching this very issue at XRE #819.
<edit 3: And it looks like the version specification has a bug too: https://docs.nuget.org/create/versioning, and elaboration: http://weblog.west-wind.com/posts/2014/Jun/19/Nuget-Dependencies-and-latest-Versions and even more elaboration http://docs.nuget.org/release-notes/nuget-2.8 (it looks like this changes from release to release :)). The best fix, I suppose, would be to fix the version specification and the reflection loading bits (if there's a bug there, it's still somewhat a hypotesis, but I'm travelling at the moment).
Absent of better information, I'd take 2). I can supply the PR, but unfortunately I wasn't able to generate the Nuget packages and test my fix. The error messages was