If the cache entry is a value type larger than the native pointer size (e.g. a Guid), writes are not atomic and if the value is updated and read concurrently, readers may see a torn value.
There are at least two ways to solve this:

1. Lock the item containing the value on read.
2. Make a new LruItem and update the dictionary. See PR #545.

This PR implements option 1 using a SeqLock. Option 1 is preferred, because option 2 can make the cache size unstable (stale values consume queue slots, pushing out live cache entries).
SeqLock pros and cons:

Pros:
- Reads are lock free, and reads cannot starve writes.
- Throughput impact for both Read and Read+Write workloads is low, and lookup latency is almost unchanged.
- Good choice when reads outnumber writes, or when writes are infrequent.

Cons:
- Can live lock. This is reproducible via the unit test with a very tight loop on LruItem, but when exercised in the context of the cache code path the impact is smaller (see ConcurrentLruSoakTests.WhenValueIsBigStructNoLiveLock vs LruItemSoakTests.DetectTornStruct). Updating the same item in an extremely tight loop is not the common case. Live lock is mitigated in this improved version, but the mitigation requires more memory and returns a stale result.
- LruItem size increases by 4 bytes. For the common case (x64, object/Guid key with an object reference), this fits inside the existing padding, but in other cases there is no way for a user to remove the overhead.
Atomic/Scoped etc.
The update code paths for atomic/scoped caches generate new wrapper class instances and call cache update to replace the object. They are therefore not susceptible to torn reads - the structs inside them are not changed after the wrapper is created.
Using a lock statement (https://github.com/bitfaster/BitFaster.Caching/pull/593/commits/be8f0ed835b24bf6303adb470ec531d4c3415ddb)

A naive implementation that adds a C# lock statement to the read path makes LRU roughly the same latency as LFU with a Guid value. Since LruItem is already locked on update, the lock prevents torn reads, but it comes with the lock's overhead and causes lock contention between concurrent reads. Read throughput is significantly reduced.
Using SeqLock (https://github.com/bitfaster/BitFaster.Caching/pull/593/commits/87fcd06c3e991314ed3ca25387403b19b62b7850)

See SeqLock. This is a good fit for our scenario, because updates are already single threaded, and reads can stay lock free and fast.

We continue to lock on update (to synchronize between two writers) and add a counter that gives readers a consistency check.

LruItem now contains an extra 32-bit integer field for the SeqLock counter. This fits inside the padding of LruItem<object,object>, so for the common case it incurs no memory overhead.

Reads are lock free, and retry if the sequence number changed while the data was being read.