cloudflare / quiche

đŸ¥§ Savoury implementation of the QUIC transport protocol and HTTP/3
https://docs.quic.tech/quiche/
BSD 2-Clause "Simplified" License
9.51k stars 724 forks source link

Use of identity hash function may lead to suboptimal performance #1239

Open withoutboats opened 2 years ago

withoutboats commented 2 years ago

Looking at the code, I noticed that stream IDs are hashed using the identity hash function (added in #1149).

Stream IDs are unique u64s, but I believe they are not necessarily evenly distributed across the space of possible values. This makes the identity hash function a suboptimal hash function for this case; if the hash values are not evenly distributed, the rate of hash collisions will be higher than necessary.

I'm not sure the decision to use the identity hash function in this case was well considered. I would consider using a different hash function.

In #996, it is also claimed that using the default hasher is not necessary because stream IDs are just numbers. But the key factor in choosing a hasher like FxHash instead of the standard hasher is that the standard hasher performs worse because it is DDoS resistant. The problem is that if the key value is chosen by the adversary, they can choose keys that increase the probably of a collision, resulting in degraded service. My understanding of QUIC is that stream IDs can be chosen by the other party in the connection, and therefore an adversary that has connected to you could choose stream IDs that cause collisions, increasing the CPU time serving their connection takes by creating collisions in these tables. Note this is also true for non-DDoS resistant hashers that are not the identity function, because if the adversary knows the hash function used they can still choose stream IDs which collide.

I could be missing something, but I would encourage going back to a DDoS resistant hash function for stream tables.

ljluestc commented 2 months ago
// Import the FxHash hasher
use fxhash::FxHasher;
use std::collections::HashMap;
use std::hash::{BuildHasherDefault, Hash, Hasher};

// Define a type alias for the FxHash hasher
type FxBuildHasher = BuildHasherDefault<FxHasher>;

// Define a HashMap using FxHash
type StreamIdHashMap<T> = HashMap<u64, T, FxBuildHasher>;

// Example function using the new hash function
fn process_stream_ids(stream_ids: &[u64]) {
    let mut map: StreamIdHashMap<()> = StreamIdHashMap::default();

    for id in stream_ids {
        map.insert(*id, ());
    }

    // Further processing...
}