Closed filipnavara closed 3 months ago
Tagging subscribers to this area: @agocke, @MichalStrehovsky, @jkotas See info in area-owners.md if you want to be subscribed.
At which level are you replacing with HashCode.Combine
? I can't imagine anything faster than XOR if you only replace the ComputeMethodHashCode
method.
I can't imagine anything faster than XOR if you only replace the
ComputeMethodHashCode
method.
The problem is not the speed of the hashing itself. That's done only once per each method. The problem are the collisions that are caused by matching hashes and the simple XOR is extremely bad at that.
The profile above is with TypeHashingAlgorithms.ComputeMethodHashCode
replaced to use HashCode.Combine
. There are places in unmanaged CLR/R2R code that depend on the stable hash algorithm at the moment. I expected these places not to exist on NativeAOT since the relevant part of the code links to System.Private.TypeLoader.dll which uses the shared TypeHashingAlgorithms
implementation, but apparently there is still some dependency.
HashCode.Combine
is randomized. It does not produce a stable hashcode.
NativeAOT file format depends on the stable hashcodes for these things. It should be fine to improve this combining function, but it needs be stable.
HashCode.Combine
is randomized. It does not produce a stable hashcode.
Huh, that explains why it didn't work. I guess I should file a documentation issue since the current documentation doesn't mention this.
I tried using stable combination function:
public static int ComputeMethodHashCode(int typeHashCode, int nameOrNameAndGenericArgumentsHashCode)
{
return unchecked((typeHashCode * (int)0xA5555529) + nameOrNameAndGenericArgumentsHashCode);
}
There's likely still some dependency on the older algorithm. The app crashes at runtime at different place though:
System.NotSupportedException: 'Microsoft.Extensions.Options.OptionsMonitor`1[Microsoft.Extensions.Http.HttpClientFactoryOptions]' is missing native code or metadata. This can happen for code that is not compatible with trimming or AOT. Inspect and fix trimming and AOT related warnings that were generated when the app was published. For more information see https://aka.ms/nativeaot-compatibility
at System.Reflection.Runtime.General.TypeUnifier.WithVerifiedTypeHandle(RuntimeConstructedGenericTypeInfo, RuntimeTypeInfo[]) + 0x7f
at System.Reflection.Runtime.TypeInfos.RuntimeTypeInfo.MakeGenericType(Type[]) + 0x218
at System.RuntimeType.MakeGenericType(Type[]) + 0x28
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32, Boolean) + 0xc0
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateOpenGeneric(ServiceIdentifier, CallSiteChain) + 0xf9
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xb5
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateArgumentCallSites(ServiceIdentifier, Type, CallSiteChain, ParameterInfo[], Boolean) + 0x143
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateConstructorCallSite(ResultCache, ServiceIdentifier, Type, CallSiteChain) + 0x224
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceDescriptor, ServiceIdentifier, CallSiteChain, Int32) + 0xe9
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.TryCreateExact(ServiceIdentifier, CallSiteChain) + 0xd8
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.CreateCallSite(ServiceIdentifier serviceIdentifier, CallSiteChain callSiteChain) + 0xa0
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteFactory.GetCallSite(ServiceIdentifier, CallSiteChain) + 0x5d
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x5a
at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
at Microsoft.Extensions.DependencyInjection.HttpClientFactoryServiceCollectionExtensions.<>c.<AddHttpClient>b__0_0(IServiceProvider serviceProvider) + 0xc
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
at Microsoft.Extensions.DependencyInjection.ServiceLookup.ServiceProviderEngineScope.GetService(Type) + 0x23
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
at MailClient.Program.<>c.<SetUpDependencyInjection>b__290_9(IServiceProvider sp) + 0x1f
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitFactory(FactoryCallSite, RuntimeResolverContext) + 0xd
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSiteMain(ServiceCallSite, TArgument) + 0xd5
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.VisitRootCache(ServiceCallSite, RuntimeResolverContext) + 0x4b
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteVisitor`2.VisitCallSite(ServiceCallSite callSite, TArgument argument) + 0x91
at Microsoft.Extensions.DependencyInjection.ServiceLookup.CallSiteRuntimeResolver.Resolve(ServiceCallSite, ServiceProviderEngineScope) + 0x1c
at Microsoft.Extensions.DependencyInjection.ServiceProvider.CreateServiceAccessor(ServiceIdentifier serviceIdentifier) + 0x151
at System.Collections.Concurrent.ConcurrentDictionary`2.GetOrAdd(TKey, Func`2) + 0xd4
at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(ServiceIdentifier, ServiceProviderEngineScope) + 0x28
at Microsoft.Extensions.DependencyInjection.ServiceProvider.GetService(Type) + 0xb
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService(IServiceProvider, Type) + 0x30
at Microsoft.Extensions.DependencyInjection.ServiceProviderServiceExtensions.GetRequiredService[T](IServiceProvider) + 0x21
at MailClient.Program.Main(String[] args) + 0x4f3
The hash performs mildly better than the XOR, worse than the first attempt with HashCode.Combine
:
. I guess I should file a documentation issue since the current documentation doesn't mention this.
It is, but in the class remarks. https://learn.microsoft.com/en-us/dotnet/api/system.hashcode?view=net-8.0#remarks
@vcsjones Thanks! I just searched for "HashCode.Combine" directly.
I went a bit more down the rabbit hole analyzing this and I am not necessarily sure that changing the hash function is a good enough fix.
In this particular instance we are running into a huge instance of LockFreeReaderHashtable
that is particularly big and also quite full. It uses an 2^n sized hashtable with open addressing and double hashing in case of collision. The table is resized when a predetermined fill threshhold is reached. Notably, this is different from regular Dictionary<K, V>
that uses prime-sized hash table, open addressing, linear probing, and detects high collision rates and resizes.
I don't have the numbers off-hand, but in the pessimistic case where the hashtable is nearly full, and you search for an element that is NOT in the hashtable (yet), you can end up searching linearly up to 60% (resize threshhold) of the table before finding out that the element is not there.
NodeFactory
overwhelmingly uses the LockFreeReaderHashtable
in a GetOrAdd
/GetOrCreateValue
fashion which lazily creates the values if they are not found. That ends up hitting the pessimistic lookup case rather often and badly no matter the quality of the hash code functions.
Overall, this data structure with these parameters doesn't scale up well. We can likely tweak some parameters of the Hashtable algorithm to get significant improvements but maybe a different approach is in order. Ideas that come to mind:
Both of these algorithms may be difficult to implement with the required concurrency requirements. That said, my benchmarks show the bottlenecks overwhelmingly happening in parts of the execution which are single-threaded. I am not necessarily good at algorithms, so any additional input is welcome.
you can end up searching linearly up to 60% (resize threshhold)
This can only happen when the hashcodes are poorly distributed or when you are really unlucky.
I think fixing the hashcodes should take care of this problem.
This can only happen when the hashcodes are poorly distributed or when you are really unlucky.
They are definitely poorly distributed. There's no question about that. It may not be the only issue though and I don't have enough data to confirm/refute that yet.
I started tracking the collision numbers on Add (on a fairly large app build):
RuntimeDeterminedTypeKeyHashtable
gets over 2000 collisions but they do go down during expansion which suggests that they are not identical hash codes.MethodEntrypointHashtable
get over 14000 collisions and they only go up.For scale, the regular Dictionary
uses a 100 as collision threshold for rehashing.
I may have found one of the root causes. WinForms generates a large number of [System.Windows.Forms]SourceGenerated.EnumValidator Validate
methods. They differ with the parameters signature but that's never taken into account in the MethodDesc
hash, so they all cause a collision.
For scale, the regular Dictionary uses a 100 as collision threshold for rehashing.
Nit: This threshold is used to switch to more expensive randomized string hash code and only for string keys. It is done for security reasons (protection against DoS attacks).
Few other notable collisions (same hash, but also same type and name):
104 [System.Windows.Forms]SourceGenerated.EnumValidator Validate = 165151430
107 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe Add = -821482047
119 [S.P.CoreLib]Internal.Metadata.NativeFormat.MdBinaryReader Read = -456562824
132 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = 1267830177
174 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe AsRef = -693195293
196 [S.P.CoreLib]Internal.Runtime.MethodTable Of = -396193247
232 [S.P.CoreLib]System.Runtime.CompilerServices.Unsafe As = -740372691
392 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> CalliMarshallingMethodThunk = 1205279480
21913 [S.P.CompilerGenerated]Internal.CompilerGenerated.<Module> DynamicInvoke = 292316103
First column is the number of such MethodDesc
s created.
For MethodEntrypointHashtable
and similar hashtables that use MethodDesc as the key, the problem can be worked around by switching to object hashcode. Change these two lines https://github.com/dotnet/runtime/blob/fb1fe7f3481fc88fa477268c18d60d824b83c314/src/coreclr/tools/aot/ILCompiler.Compiler/Compiler/DependencyAnalysis/NodeFactory.cs#L922-L923 to use RuntimeHelpers.GetHashCode. It is not the proper fix, but it would be better than nothing.
Is there runtime dependency on stability of the DynamicInvokeMethodThunk
hash code?
This nearly removes the bottleneck:
--- a/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
+++ b/src/coreclr/tools/Common/TypeSystem/IL/Stubs/DynamicInvokeMethodThunk.cs
@@ -112,6 +112,11 @@ public override MethodSignature Signature
}
}
+ protected override int ComputeHashCode()
+ {
+ return base.ComputeHashCode() ^ _targetSignature.GetHashCode();
+ }
+
public override string Name => "DynamicInvoke";
public override string DiagnosticName => "DynamicInvoke";
I tried using RuntimeHelpers.GetHashCode
but it seemingly didn't have the expected impact (measured by wall clock). I'll run it through profiler later on.
There's a TODO in the code:
https://github.com/dotnet/runtime/blob/b438888c5504c8dafe3746393ab76f3e5e1eff6d/src/coreclr/tools/Common/TypeSystem/Common/TypeHashingAlgorithms.cs#L229-L234
Notably, this comes up as a sore spot in my profiles:
Changing to
HashCode.Combine
yields about >20% less hash collisions in this method alone:This compilation speed improvement is quite noticable.
Unfortunately, there seems to be some dependency on the hash combining algorithm somewhere in the code because this produces an executable that fails to run at runtime: