dotnet / orleans

Cloud Native application framework for .NET
https://docs.microsoft.com/dotnet/orleans
MIT License
10.1k stars 2.04k forks source link

Orleans does not respect policies (binding redirects) #174

Closed veikkoeeva closed 9 years ago

veikkoeeva commented 9 years ago

Now that I've downloaded the latest binaries from Nuget, I see I bumped the same issue @ReubenBond mentioned here: Orleans does not respect binding redirects. I should have paid more attention to that, sorry for being slow with this. :)

I see AssemblyLoader makes use of Assembly.ReflectionOnlyLoadFrom at line 277 and I suspect this is the reason (without actually debugging). According to the documentation this does not respect binding redirects. It would look like one needs to use AppDomain.ApplyPolicy to get this behavior. The fix would be simply following:

pathName = AppDomain.CurrentDomain.ApplyPolicy(pathName); //This line is added as a fix.
assembly = Assembly.ReflectionOnlyLoadFrom(pathName); //And from here onwards it'd be as it were.

Currently the Nuget packages are marked to indicate either the packages Orleans references or newer are accepted, so this looks like a rather nasty bug. There are two solutions: 1) Fix Nuget specification so that only exact versions are allowed. 2) Allow applying policies.

<edit: I see there's also Assembly.LoadFile used in AssemblyLoader. I'm not sure if it respects redirects, but probably no harm would be done if policies would be applied.

<edit 2: All right, this little fix may not work. I hunt down one prominent example: XRE Runtime issue #563 ([line 178](https://github.com/aspnet/XRE/blob/bb3cdbdb67ca04366fcb9a04e56d0dc05752debb/src/klr.hosting.shared/RuntimeBootstrapper.cs#L178)) And discussion touching this very issue at XRE #819.

<edit 3: And it looks like the version specification has a bug too: https://docs.nuget.org/create/versioning, and elaboration: http://weblog.west-wind.com/posts/2014/Jun/19/Nuget-Dependencies-and-latest-Versions and even more elaboration http://docs.nuget.org/release-notes/nuget-2.8 (it looks like this changes from release to release :)). The best fix, I suppose, would be to fix the version specification and the reflection loading bits (if there's a bug there, it's still somewhat a hypotesis, but I'm travelling at the moment).

Absent of better information, I'd take 2). I can supply the PR, but unfortunately I wasn't able to generate the Nuget packages and test my fix. The error messages was

C:\projektit\orleans\src\nuget> .\CreateOrleansPackages.bat C:\projektit\Orleans\Binaries\Release C: \projektit\Orleans\Binaries\Release\Version.txt CreateOrleansNugetPackages running in directory = C:\projektit\orleans\src\nuget CreateOrleansNugetPackages version file = C:\projektit\Orleans\Binaries\Release\Version.txt from base dir = using nuget location = C:\projektit\Orleans\src\NuGet\..\.nuget\nuget.exe Using binary drop location directory C:\projektit\Orleans\Binaries\Release Using version number from file C:\projektit\Orleans\Binaries\Release\Version.txt CreateOrleansNugetPackages: Version = 1.0.3 -- Drop location = C:\projektit\Orleans\Binaries\Release Attempting to build package from 'Microsoft.Orleans.Client.nuspec'. System.IO.FileNotFoundException: File not found: 'ClientInstall.ps1'. kohteessa NuGet.PackageBuilder.AddFiles(String basePath, String source, String destination, String exclude) kohteessa NuGet.PackageBuilder.PopulateFiles(String basePath, IEnumerable`1 files) kohteessa NuGet.PackageBuilder.ReadManifest(Stream stream, String basePath, IPropertyProvider propertyProvider) kohteessa NuGet.PackageBuilder..ctor(String path, String basePath, IPropertyProvider propertyProvider, Boolean includ eEmptyDirectories) kohteessa NuGet.Commands.PackCommand.CreatePackageBuilderFromNuspec(String path) kohteessa NuGet.Commands.PackCommand.BuildFromNuspec(String path) kohteessa NuGet.Commands.PackCommand.BuildPackage(String path) kohteessa NuGet.Commands.PackCommand.ExecuteCommand() kohteessa NuGet.Commands.Command.Execute() kohteessa NuGet.Program.Main(String[] args)
ReubenBond commented 9 years ago

:+1: Great investigative work! I had suspected that exact line, but was unable to find the fix - well done! This should fix a bunch of gotchas.

veikkoeeva commented 9 years ago

@ReubenBond You might get a temporary fix by using AppDomain.AssemblyResolve and upon hitting the handler, use ApplyPolicy and then return the new, redirected assembly. I'm not sure if this works, but it's worth a try. Also, if you happen to know how to test this particular fix, feel free to do a PR. It may take a day or two I'm back. I used my programming quota, kids are complaining they are hungry etc. :)

veikkoeeva commented 9 years ago

A link to Codeplex discussion about this.

veikkoeeva commented 9 years ago

Some quick further debugging on this. Putting AppDomain.AssemblyResolve handler to one's own code doesn't get called. The reason is probably that it's already handler in AssemblyLoader line 141. As can be seen, the calls end up in CachedReflectionOnlyTypeResolver.

I did a quick try with the following (awful code, misleading printing, buggy etc., so take it with these warnings) and I could see there were assemblies with their old versions and they'd hit a breakpoint in the first return Assembly.LoadFrom(afterPolicy);, but loading the assembly fails with a message there's already one loaded with the same name. I didn't investigate further, but the exception messages were about conflicting assemblies (an assembly with the same name but different version had been loaded already?).

I don't know the exact purpose with these classes and so I don't have a clear clue on where and how should one do the policy resolution. Perhaps @gabikliot, @jthelin or @sergeybykov could weigh in here. Or someone doing some brute force debugging. :) In any event, it looks like assembly redirections won't work before this is fixed (with a note to Nuget dependency definitions).

 public static Assembly OnReflectionOnlyAssemblyResolve(object sender, ResolveEventArgs args)
    {
        // loading into the reflection-only context doesn't resolve dependencies automatically.
        // we're faced with the choice of ignoring arguably false dependency-missing exceptions or
        // loading the dependent assemblies into the reflection-only context. 
        //
        // i opted to load dependencies (by implementing OnReflectionOnlyAssemblyResolve)
        // because it's an opportunity to quickly identify assemblies that wouldn't load under
        // normal circumstances.
        var appDomain = (AppDomain)sender;
        var afterPolicy = appDomain.ApplyPolicy(args.Name);
        try
        {                
            if(afterPolicy != args.Name)
            {
                return Assembly.LoadFrom(afterPolicy);
            }

            return Assembly.ReflectionOnlyLoad(args.Name);
        }
        catch (FileNotFoundException)
        {
            if (logger.IsVerbose2)
            {
                logger.Verbose2(FormatReflectionOnlyAssemblyResolveFailureMessage(sender, args));
            }

            var dirName = Path.GetDirectoryName(args.RequestingAssembly.Location);
            var assemblyName = new AssemblyName(afterPolicy != args.Name ? afterPolicy : args.Name);
            var fileName = string.Format("{0}.dll", assemblyName.Name);
            var pathName = Path.Combine(dirName, fileName);
            if (logger.IsVerbose2)
            {
                logger.Verbose2("failed to find assembly {0} in {1}; searching for {2} instead.",
                    assemblyName.FullName, dirName, pathName);
            }

            try
            {
                var afterPolicy2 = appDomain.ApplyPolicy(pathName);
                if(afterPolicy2 != pathName)
                {
                    return Assembly.LoadFrom(pathName);
                }

                return Assembly.ReflectionOnlyLoadFrom(pathName);
            }
            catch (FileNotFoundException)
            {
                if(logger.IsVerbose2)
                {
                    logger.Verbose(FormatReflectionOnlyAssemblyResolveFailureMessage(sender, args));
                }

                throw;
            }
        }
    }

Ah, as a tangential note, it feels this reflection code could do some refactoring provided its purpose is made more clear. <edit 1: With refactoring I was thinking something like

  1. Get the initial list of assemblies
  2. Process each of them, perhaps straight away with .AppyPolicy, processing could be done, say, like in style of "file processing in LINQ", maybe even in parallel, caching and all applied.
  3. It feels some of the AppDomain code could be simpler, perhaps there are even libraries for it
  4. Perhaps some optimizations like not probing the GAC if public key token is null
  5. Perhaps the list of assemblies could be provided by the application developer, further reducing load time. It looks like the code is for collecting grain assemblies (loading them for inspection to reflection only context, though that is one the reasons I'm a bit puzzled if the probing and touching the assemblies marked for redirection should fail or if something else is afoot, in fact, perhaps the ApplyPolicy should be tried in AssemblyLoader as a first thing, caching moved to there and CachedReflectionOnlyTypeResolver removed?).
sergeybykov commented 9 years ago

@veikkoeeva Great in-depth analysis. Thank you! I hope your kids didn't starve too badly. :-)

These are some of the murkiest code paths with assembly resolution not always working the way you expect. It doesn't help that Assembly.LoadFrom() has been recommended against for quite some time. We'll dig deeper into this.

veikkoeeva commented 9 years ago

No kids harmed in the process of debugging, it didn't take that a long time. :) As for general interest and reference Best Practices for Assembly Loading [MSDN].

jthelin commented 9 years ago

The AssemblyLoader is also used for finding / loading the different types of provider plugins and not just grain classes, which may help to explain why it is more general-purpose than it might need to be.

When debugging these types of type loading problems, i have always found it important to check not just when .NET thinks is the "Name" of the assembly, but also the "Location" [aka Java "codebase"] that it is coming from. http://stackoverflow.com/questions/3623358/two-types-not-equal-that-should-be/3624044#3624044

Assembly.LoadFrom() could quite possibly be the most evil function in the whole .NET Framework! :(

veikkoeeva commented 9 years ago

@jthelin I suspect the reason for redirects to fail is that LoadFrom loads the already loaded assembly even if a redirected one is provided. That is, it has already loaded one without applying any policies and later if the same assembly comes along elsewhere by some method (e.g. by applying a policy), an exception will be thrown that the assembly has been loaded already.

I saw this behaviour documented in MSDN, and various blogs. I see from the documentation that LoadFrom should respect policies, so perhaps the fix is just call ApplyPolicy as the first thing before doing the first LoadFrom. If my memory serves me correctly, I remember seeing also loading to reflection only context, which has a different story. Junfeng's blog on 2004 has many posts regarding assembly loading, mentioning also AssemblyResolve (using the provided Bing search engine to the blog may be useful).

veikkoeeva commented 9 years ago

From #200.

I found a related bug in the AssemblyLoader in reflection-only inspection. Assembly.GetTypes() throws if dependencies aren't already loaded. I suspect it is blocking some of these scenarios, and in a non-deterministic way because it depends on the order of loading. I'll have a PR with a fix for it.

I am still confused with assembly binding redirects. As I mentioned in #197, I see redirects working if they are in OrleansHost.exe.config in the local case. What's the gap here?

@sergeybykov It's good to hear you found something, especially since it looks like being a Heisenbug.

I didn't the mention of redirects working. Perhaps it was in other thread? This feels like a valuable piece of information, but how do I use it? I just tried to deploy OrleansHost.exe.config to Azure Compute Emulator and I see OrleansHost.exe, OrleansHost.exe.config and, say, Microsoft.Data.OData version 5.6.3 there and when the silo is starting, I get an exception Microsoft.Data.OData or one of its dependencies could not be loaded due to it being either missing or not corresponding to the defined reference (*). In the exe.config I have a redirect defined as follows

<dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.OData" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
 </dependentAssembly>

(*)

WaWorkerHost.exe Error: 0 : [2015-03-16 19:52:47.670 GMT 6 ERROR 100833 OrleansSiloInstanceManager ] !!!!!!!!!! INTERNAL FAILURE! About to crash! Fail message is: Exception trying to create or connect to the Azure table: Microsoft.WindowsAzure.Storage.StorageException: Tiedostoa tai kokoonpanoa Microsoft.Data.OData, Version=5.6.2.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35 tai jotakin sen riippuvuutta ei voi ladata. Löydetty kokoonpanon luettelomääritys ei vastaa kokoonpanon viittausta. (HRESULT-poikkeus: 0x80131040) ---> System.IO.FileLoadException: Tiedostoa tai kokoonpanoa Microsoft.Data.OData, Version=5.6.2.0, Culture=neutral, PublicKeyToken=31bf3856ad364e35 tai jotakin sen riippuvuutta ei voi ladata. Löydetty kokoonpanon luettelomääritys ei vastaa kokoonpanon viittausta. (HRESULT-poikkeus: 0x80131040)

It's starting to be late, so perhaps I've glared over something obvious. When debuggin this, admitedly just a short while, I did see that also the newer versions were tried, but at that point the older ones were already loaded an the loader library threw an exception that the same assembly couldn't be loaded to the same context.

sergeybykov commented 9 years ago

@veikkoeeva Let me put together the reflection-only fix. I have a crude one. I want to make it less ugly. Let's see if this error will go away with it. Or not.

veikkoeeva commented 9 years ago

@sergeybykov It will fix some problems, that's for sure, so it makes world a bit better place. :) I forgot to mention I'm using the Nuget packages currently if that factors into this. Also, I wonder, how big a task would it be to redo the loader so as to make it to a set of Funcs that get a list of assemblies with some rules and then provide that list to a separate loader? I've understood Orleankka works like this. That feels like teh solution, though requires probably more planning.

cata commented 9 years ago

From #200

I am still confused with assembly binding redirects. As I mentioned in #197, I see redirects working if they are in OrleansHost.exe.config in the local case. What's the gap here?

I've tested this with (95375c8). I've built the SDK from source and installed it. It works (in my scenario, at least), both in the Azure emulator and when deployed to Azure. I have not tested this using the Orleans nuget packages.

Context:

The following binding redirect section is what works in my the above scenario:

    <assemblyBinding xmlns="urn:schemas-microsoft-com:asm.v1">
      <dependentAssembly>
        <assemblyIdentity name="Newtonsoft.Json" publicKeyToken="30ad4fe6b2a6aeed" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-6.0.0.0" newVersion="6.0.0.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.Edm" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.Services.Client" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="System.Spatial" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.Data.OData" publicKeyToken="31bf3856ad364e35" culture="neutral" />
        <bindingRedirect oldVersion="0.0.0.0-5.6.3.0" newVersion="5.6.3.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.WindowsAzure.Storage" culture="neutral" publicKeyToken="31bf3856ad364e35" />
        <bindingRedirect oldVersion="0.0.0.0-4.3.0.0" newVersion="4.3.0.0" />
      </dependentAssembly>
      <dependentAssembly>
        <assemblyIdentity name="Microsoft.WindowsAzure.ServiceRuntime" culture="neutral" publicKeyToken="31bf3856ad364e35" />
        <bindingRedirect oldVersion="0.0.0.0-2.5.0.0" newVersion="2.5.0.0" />
      </dependentAssembly>

@sergeybykov I am not sure why this has not worked for me before - I'm glad it works now and I think it means case 1) in #200 works as intended.

sergeybykov commented 9 years ago

I am sure I was able to reproduce the binding redirect issue but I was testing the case where the grain assembly and its dependencies were in a subfolder. https://github.com/dotnet/orleans/pull/236 fixed it for me. In the process of investigating this I also found a strange issue with dependency resolution in the reflection-only context. I included a workaround for it in https://github.com/dotnet/orleans/pull/236.

veikkoeeva commented 9 years ago

@sergeybykov, @ca-ta Ah, I didn't have a build from the latest source, but the latest Nuget packages. Perhaps that the issue for me. If it works for @ca-ta, it works for me, I believe. <edit: would have had use for a fresh brain yesterday, after all. :) Thanks @sergeybykov and @ca-ta.