NVIDIA-Merlin / HugeCTR

HugeCTR is a high efficiency GPU framework designed for Click-Through-Rate (CTR) estimating training
Apache License 2.0

support lock-free hashmap backend #436

Open ZhuYuJin opened 6 months ago

ZhuYuJin commented 6 months ago

This is a draft implementation that removes the global mutex in HashMapBackend. The PR tries to avoid locking the whole hashmap backend while updating or inserting payloads, because our online service performs reads and writes at the same time. We need to preserve read performance while writes are in progress.

My changes are as follows:

  1. I introduce an atomic variable in Payload to implement a row lock.
  2. I introduce a mutex in Partition to protect the insertion of a new payload. In addition, parallel_flat_hash_map ensures the atomicity of adding, removing, or updating an entry in the flat hash map.
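
To make the two ideas above concrete, here is a minimal sketch, not the code in this PR. The `Payload` and `Partition` layouts, the `upsert` and `fetch` names, and the `float` vector value type are all hypothetical. The sketch also swaps in `std::unordered_map` for parallel_flat_hash_map; because `std::unordered_map` has no internal locking, the partition mutex here is a `std::shared_mutex` that readers take in shared mode, which is an assumption of the sketch rather than something the PR states.

```cpp
#include <atomic>
#include <cstdint>
#include <mutex>
#include <shared_mutex>
#include <unordered_map>
#include <vector>

// Hypothetical payload: one row guarded by its own atomic spin lock.
struct Payload {
  std::atomic<bool> locked{false};
  std::vector<float> value;

  // lock()/unlock() make Payload usable with std::lock_guard.
  void lock() {
    while (locked.exchange(true, std::memory_order_acquire)) {
      // Spin until the current holder releases the row.
    }
  }
  void unlock() { locked.store(false, std::memory_order_release); }
};

// Hypothetical partition: the table mutex is held exclusively only
// while a new entry is inserted; everything else uses the row lock.
class Partition {
 public:
  void upsert(std::uint64_t key, const std::vector<float>& value) {
    {
      // Fast path: the key exists, so only its row is locked.
      std::shared_lock<std::shared_mutex> table_guard(table_mutex_);
      auto it = entries_.find(key);
      if (it != entries_.end()) {
        std::lock_guard<Payload> row_guard(it->second);
        it->second.value = value;
        return;
      }
    }
    // Slow path: inserting may rehash, so exclude all other threads.
    // operator[] returns the existing entry if another thread
    // inserted the same key between the two critical sections.
    std::unique_lock<std::shared_mutex> table_guard(table_mutex_);
    Payload& payload = entries_[key];
    std::lock_guard<Payload> row_guard(payload);
    payload.value = value;
  }

  // Copies the row into `out`; returns false if the key is absent.
  bool fetch(std::uint64_t key, std::vector<float>& out) {
    std::shared_lock<std::shared_mutex> table_guard(table_mutex_);
    auto it = entries_.find(key);
    if (it == entries_.end()) return false;
    std::lock_guard<Payload> row_guard(it->second);
    out = it->second.value;
    return true;
  }

 private:
  std::shared_mutex table_mutex_;
  std::unordered_map<std::uint64_t, Payload> entries_;
};
```

In this sketch, readers and updaters of different existing keys never block each other; they contend only on the per-row spin lock when they touch the same key, and only an insertion of a new key briefly blocks the whole partition.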