dotnet / runtime

.NET is a cross-platform runtime for cloud, mobile, desktop, and IoT apps.
https://docs.microsoft.com/dotnet/core/
MIT License
15.37k stars 4.75k forks source link

Hashtable has lower hash flooding resistance than Dictionary #108923

Open MihaZupan opened 1 month ago

MihaZupan commented 1 month ago

Description

Dictionary<string, ...> and some similar types have built-in defense-in-depth functionality that uses per-instance hash code randomization to defend against hash flooding attacks.

While Hashtable has similar logic on .NET Framework when dealing with string keys, it does not on modern .NET. While hash codes may be randomized if the instance was created with a comparer like Ordinal, they won't be randomized per-instance.

See "Instantiations known safe against hash flooding attacks" section of the Dictionary threat model being published in #108864 for more background. cc: @GrabYourPitchforks

Known Workarounds

Use a Dictionary instead :)

dotnet-policy-service[bot] commented 1 month ago

Tagging subscribers to this area: @dotnet/area-system-collections See info in area-owners.md if you want to be subscribed.

benaadams commented 1 month ago

Known Workarounds

Use a Dictionary instead :)

Use HashSet<string>?

Though string always uses randomized hashing and dictionary has optimization to use non-randomized hashing until high collisions?

MihaZupan commented 1 month ago

Hashtable stores values as well, hence Dictionry instead of HashSet.

Yes, it uses non-randomized hashes as an optimization, but when it does switch to randomized, it uses a different random seed for each dictionary instance instead of the shared per-process one.

karakasa commented 1 month ago

IIRC Hashtable is thread-safe while others are not, right?

PranavSenthilnathan commented 2 days ago

The work here would be to implement a similar mechanism as Dictionary<string,TValue> for built-in ordinal comparers (Ordinal and OrdinalIgnoreCase) to switch from a process wide random seed to an instance specific seed when the number of collisions is very high. As @karakasa mentioned, this data structure is thread-safe for readers when there is at most one writer. Currently, this is ensured by making sure the underlying array isn't modified in a thread-unsafe way as well as swapping the array during resizing/rehashing. With the addition of a mutable comparer, this resizing/rehashing step needs to ensure both the array and the comparer are swapped atomically as a unit (and on the reader side, they should be read together).