dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
14.97k stars 4.66k forks source link

Unoptimal hash code combine in TypeHashingAlgorithms.ComputeMethodHashCode #103070

Closed filipnavara closed 3 months ago

filipnavara commented 3 months ago

There's a TODO in the code:

https://github.com/dotnet/runtime/blob/b438888c5504c8dafe3746393ab76f3e5e1eff6d/src/coreclr/tools/Common/TypeSystem/Common/TypeHashingAlgorithms.cs#L229-L234

Notably, this comes up as a sore spot in my profiles:

image

Changing to HashCode.Combine yields about >20% less hash collisions in this method alone:

image

This compilation speed improvement is quite noticable.

Unfortunately, there seems to be some dependency on the hash combining algorithm somewhere in the code because this produces an executable that fails to run at runtime:

Process terminated. Failed to create generic virtual method implementation

Declaring type: Microsoft.Extensions.Options.OptionsBuilder`1<Microsoft.Extensions.Options.StartupValidatorOptions>
Method name: Configure
Instantiation:
  Argument 00000000: Microsoft.Extensions.Options.IOptionsMonitor`1<Microsoft.Extensions.DependencyInjection.MetricsServiceExtensions+NoOpOptions>

   at System.RuntimeExceptionHelpers.FailFast(String, Exception, String, RhFailFastReason, IntPtr, IntPtr) + 0x2b0
   at Internal.Runtime.TypeLoader.TypeLoaderEnvironment.ResolveGenericVirtualMethodTarget(RuntimeTypeHandle, RuntimeMethodHandle) + 0x210
   at System.Runtime.TypeLoaderExports.<>c.<GVMLookupForSlotSlow>b__8_0(IntPtr context, IntPtr signature, Object contextObject, IntPtr& auxResult) + 0x31
   at System.Runtime.TypeLoaderExports.CacheMiss(IntPtr, IntPtr, RuntimeObjectFactory, Object) + 0x28
   at System.Runtime.TypeLoaderExports.GVMLookupForSlotSlow(Object, RuntimeMethodHandle) + 0x4e
   at System.Runtime.TypeLoaderExports.GVMLookupForSlot(Object, RuntimeMethodHandle) + 0x9a
   at Microsoft.Extensions.DependencyInjection.OptionsBuilderExtensions.ValidateOnStart[TOptions](OptionsBuilder`1) + 0x6e
   at Microsoft.Extensions.DependencyInjection.MetricsServiceExtensions.AddMetrics(IServiceCollection) + 0x47
   at Microsoft.Extensions.DependencyInjection.HttpClientFactoryServiceCollectionExtensions.AddHttpClient(IServiceCollection) + 0x26
   at MailClient.Program.SetUpDependencyInjection() + 0x2a9
   at MailClient.Program.Main(String[] args) + 0xd2
dotnet-policy-service[bot] commented 3 months ago

Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.

huoyaoyuan commented 3 months ago

At which level are you replacing with HashCode.Combine? I can't imagine anything faster than XOR if you only replace the ComputeMethodHashCode method.

filipnavara commented 3 months ago

I can't imagine anything faster than XOR if you only replace the ComputeMethodHashCode method.

The problem is not the speed of the hashing itself. That's done only once per each method. The problem are the collisions that are caused by matching hashes and the simple XOR is extremely bad at that.

The profile above is with TypeHashingAlgorithms.ComputeMethodHashCode replaced to use HashCode.Combine. There are places in unmanaged CLR/R2R code that depend on the stable hash algorithm at the moment. I expected these places not to exist on NativeAOT since the relevant part of the code links to System.Private.TypeLoader.dll which uses the shared TypeHashingAlgorithms implementation, but apparently there is still some dependency.

jkotas commented 3 months ago

HashCode.Combine is randomized. It does not produce a stable hashcode.

NativeAOT file format depends on the stable hashcodes for these things. It should be fine to improve this combining function, but it needs be stable.

filipnavara commented 3 months ago

HashCode.Combine is randomized. It does not produce a stable hashcode.

Huh, that explains why it didn't work. I guess I should file a documentation issue since the current documentation doesn't mention this.

filipnavara commented 3 months ago

I tried using stable combination function:

        public static int ComputeMethodHashCode(int typeHashCode, int nameOrNameAndGenericArgumentsHashCode)
        {
            return unchecked((typeHashCode * (int)0xA5555529) + nameOrNameAndGenericArgumentsHashCode);
        }

There's likely still some dependency on the older algorithm. The app crashes at runtime at different place though:

System.NotSupportedException: 'Microsoft.Extensions.Options.OptionsMonitor`1[Microsoft.Extensions.Http.HttpClientFactoryOptions]' is missing native code or metadata. This can happen for code that is not compatible with trimming or AOT. Inspect and fix trimming and AOT related warnings that were generated when the app was published. For more information see https://aka.ms/nativeaot-compatibility
   at System.Reflection.Runtime.General.TypeUnifier.WithVerifiedTypeHandle(RuntimeConstructedGenericTypeInfo, RuntimeTypeInfo[]) + 0x7f
   at System.Reflection.Runtime.TypeInfos.RuntimeTypeInfo.MakeGenericType(Type[]) + 0x218
   at System.RuntimeType.MakeGenericType(Type[]) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32, Boolean) + 0xc0
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceIdentifier, CallSiteChain) + 0xf9
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xb5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(ServiceIdentifier, Type, CallSiteChain, ParameterInfo[], Boolean) + 0x143
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache, ServiceIdentifier, Type, CallSiteChain) + 0x224
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32) + 0xe9
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceIdentifier, CallSiteChain) + 0xd8
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xa0
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(ServiceIdentifier, CallSiteChain) + 0x5d
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x5a
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at Microsoft.Extensions.DependencyInjection.HttpClientFactoryServiceCollectionExtensions.<>c.<AddHttpClient>b__0_0(IServiceProvider serviceProvider) + 0xc
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at MailClient.Program.<>c.<SetUpDependencyInjection>b__290_9(IServiceProvider sp) + 0x1f
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
   at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
   at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
   at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type) + 0xb
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
   at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
   at MailClient.Program.Main(String[] args) + 0x4f3

The hash performs mildly better than the XOR, worse than the first attempt with HashCode.Combine:

image
vcsjones commented 3 months ago

. I guess I should file a documentation issue since the current documentation doesn't mention this.

It is, but in the class remarks. https://learn.microsoft.com/en-us/dotnet/api/system.hashcode?view=net-8.0#remarks

filipnavara commented 3 months ago

@vcsjones Thanks! I just searched for "HashCode.Combine" directly.

I went a bit more down the rabbit hole analyzing this and I am not necessarily sure that changing the hash function is a good enough fix.

In this particular instance we are running into a huge instance of LockFreeReaderHashtable that is particularly big and also quite full. It uses an 2^n sized hashtable with open addressing and double hashing in case of collision. The table is resized when a predetermined fill threshhold is reached. Notably, this is different from regular Dictionary<K, V> that uses prime-sized hash table, open addressing, linear probing, and detects high collision rates and resizes.

I don't have the numbers off-hand, but in the pessimistic case where the hashtable is nearly full, and you search for an element that is NOT in the hashtable (yet), you can end up searching linearly up to 60% (resize threshhold) of the table before finding out that the element is not there.

NodeFactory overwhelmingly uses the LockFreeReaderHashtable in a GetOrAdd/GetOrCreateValue fashion which lazily creates the values if they are not found. That ends up hitting the pessimistic lookup case rather often and badly no matter the quality of the hash code functions.

Overall, this data structure with these parameters doesn't scale up well. We can likely tweak some parameters of the Hashtable algorithm to get significant improvements but maybe a different approach is in order. Ideas that come to mind:

Both of these algorithms may be difficult to implement with the required concurrency requirements. That said, my benchmarks show the bottlenecks overwhelmingly happening in parts of the execution which are single-threaded. I am not necessarily good at algorithms, so any additional input is welcome.

jkotas commented 3 months ago

you can end up searching linearly up to 60% (resize threshhold)

This can only happen when the hashcodes are poorly distributed or when you are really unlucky.

I think fixing the hashcodes should take care of this problem.

filipnavara commented 3 months ago

This can only happen when the hashcodes are poorly distributed or when you are really unlucky.

They are definitely poorly distributed. There's no question about that. It may not be the only issue though and I don't have enough data to confirm/refute that yet.

I started tracking the collision numbers on Add (on a fairly large app build):

For scale, the regular Dictionary uses a 100 as collision threshold for rehashing.

filipnavara commented 3 months ago

I may have found one of the root causes. WinForms generates a large number of [System.Windows.Forms]SourceGenerated.EnumValidator Validate methods. They differ with the parameters signature but that's never taken into account in the MethodDesc hash, so they all cause a collision.

jkotas commented 3 months ago

For scale, the regular Dictionary uses a 100 as collision threshold for rehashing.

Nit: This threshold is used to switch to more expensive randomized string hash code and only for string keys. It is done for security reasons (protection against DoS attacks).

filipnavara commented 3 months ago

Few other notable collisions (same hash, but also same type and name):

    104 [System.Windows.Forms]SourceGenerated.EnumValidator Validate = 165151430
    107 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe Add = -821482047
    119 [S.P.CoreLib]Internal.Metadata.NativeFormat.MdBinaryReader Read = -456562824
    132 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = 1267830177
    174 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe AsRef = -693195293
    196 [S.P.CoreLib]Internal.Runtime.MethodTable Of = -396193247
    232 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = -740372691
    392 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> CalliMarshallingMethodThunk = 1205279480
  21913 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> DynamicInvoke = 292316103

First column is the number of such MethodDescs created.

jkotas commented 3 months ago

For MethodEntrypointHashtable and similar hashtables that use MethodDesc as the key, the problem can be worked around by switching to object hashcode. Change these two lines https://github.com/dotnet/runtime/blob/fb1fe7f3481fc88fa477268c18d60d824b83c314/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs#L922-L923 to use RuntimeHelpers.GetHashCode. It is not the proper fix, but it would be better than nothing.

filipnavara commented 3 months ago

Is there runtime dependency on stability of the DynamicInvokeMethodThunk hash code?

This nearly removes the bottleneck:

--- a/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
+++ b/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
@@ -112,6 +112,11 @@ public override MethodSignature Signature
             }
         }

+        protected override int ComputeHashCode()
+        {
+            return base.ComputeHashCode() ^ _targetSignature.GetHashCode();
+        }
+
         public override string Name => "DynamicInvoke";

         public override string DiagnosticName => "DynamicInvoke";
image

I tried using RuntimeHelpers.GetHashCode but it seemingly didn't have the expected impact (measured by wall clock). I'll run it through profiler later on.