onflow / flow-go

A fast, secure, and developer-friendly blockchain built to support the next generation of games, apps, and the digital assets that power them.
GNU Affero General Public License v3.0
531 stars 175 forks source link

Implement segmented map #2285

Open synzhu opened 2 years ago

synzhu commented 2 years ago

There are many places in the codebase where we could use a concurrency safe map.

Using a single lock for a map may have poor performance, because all writes would need to be linearized. On the other hand, the sync.Map implementation will have worse performance for many usage patterns.

A middle ground is to implement a custom segmented map, which is a slice of 256 regular maps, each with its own RWLock.

To write to this map, we pick the first byte of the hashed key (for example) to determine the index into the slice. Then we use the lock to insert the item into that map.

We can optionally specify a key hashing function, which is by default the identity function. This way, if the keys come in a format where they are already randomized (e.g Flow ID, libp2p peer ID, md5sum of script data), then we just use it as is and don't need to waste effort hashing it.

Finally, we should benchmark the implementation for various different usage patterns, comparing it against the single-lock implementation and the sync.Map.

It would be great to implement this using interface{} values, so that it can be reused in different places, and once we migrate to Go 1.18 we would update this to use generics. However, some performance impact is unavoidable with the interface{} approach, and so we should also benchmark to compare performance vs an implementation which explicitly defines thr value type (e.g int).

synzhu commented 2 years ago

On second thought, the key hashing function should just be a "slice index picker" function. The default implementation picks the first (or last) byte of the key as the index