FuelLabs / sway

🌴 Empowering everyone to build reliable and efficient smart contracts.
https://docs.fuel.network/docs/sway/
Apache License 2.0
62.59k stars 5.37k forks source link

Consider using `FxHashMap` for performance #3162

Open emilyaherbert opened 2 years ago

emilyaherbert commented 2 years ago
    > > > Note that a copy only occurs if the data is shared with another instance, (i.e. if the OrdMap was previously cloned elsewhere) and only the shared parts are copied.

Ahhh I see. OrdMap it is!

Sorry for the late input. If determinism is the only thing that matter and not the actual order, then FxHashMap could be a more efficient option.

I'm already using this in the compiler. FxHashMap is faster too if the key type is small sized. If we want exactly HashMap's performance, but with determinism, then we should just instantiate HashMap with a non-random hasher, similar to what FxHashMap does (https://docs.rs/rustc-hash/latest/src/rustc_hash/lib.rs.html#43).

So something like pub type SwayHashMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>; would give us exactly what HashMap gives, but with determinism.

Originally posted by @vaivaswatha in https://github.com/FuelLabs/sway/issues/3154#issuecomment-1292840895

vaivaswatha commented 2 years ago

I suggest the following (applies to HashSet too equally).

  1. Define pub type SwayHashMap<K, V> = HashMap<K, V, BuildHasherDefault<DefaultHasher>>;
  2. Use SwayHashMap wherever ordering doesn't matter, but just needs to be deterministic. Replace all uses of HashMap in the compiler with this.
  3. Use FxHashMap whenever the key size is just an integer (or something equal in size to that or smaller).
vaivaswatha commented 2 years ago

Generally fxhash is faster than fnv on u32, u64, or any byte sequence with length >= 5.

This is from a fork of FxHash (I think). (I somehow remembered the opposite, that for smaller keys, FxHash is better, but I think I was wrong).

This means we could just blindly use FxHashMap (and FxHashSet) everywhere as we don't typically have smaller than int size keys in the compiler.

Official GitHub page: https://github.com/rust-lang/rustc-hash

Edit: This page that FxHashMap is faster for integer keys, but doesn't mention other key types. Perhaps it would be best to benchmark for some common non-integer keys and then decide the best deterministic hash table we should use (i.e., choose b/w SwayHashMap that I defined above, or FxHashMap).