triton-inference-server / local_cache

Implementation of a local in-memory cache for Triton Inference Server's TRITONCACHE API
BSD 3-Clause "New" or "Revised" License

Update LRU in constant time #7

Closed · rmccorm4 closed this 1 year ago

rmccorm4 commented 1 year ago

Track the validity of the LRU iterator, and use it directly when updating an existing key, for constant-time updates.

Previously, the LRU logic was mistakenly doing an O(N) search (std::find(list.begin(), list.end(), key)) through the LRU keys on every update, which significantly slowed down cache operations as the number of keys in the cache grew.

The LRU iterator logic was already there, but it wasn't being used. We need a way to ensure the iterator is valid before using it, to avoid undefined behavior. Rather than use a std::pair, I made a small struct so the field can be named valid, which reads better than pair.first / pair.second.
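To make the idea concrete, here is a minimal sketch of the approach, assuming a std::list of keys for recency order and a per-key struct holding the list iterator plus a valid flag (all names below are illustrative, not the actual local_cache code):

```cpp
#include <list>
#include <string>
#include <unordered_map>

// Illustrative sketch only: a key's LRU position, tracked alongside a flag
// that says whether the iterator currently points into the list.
struct LruPosition {
  std::list<std::string>::iterator it;  // position of this key in the LRU list
  bool valid = false;                   // true once 'it' points into the list
};

class LruTracker {
 public:
  // Mark 'key' as most recently used. O(1) whether the key is new or existing.
  void Update(const std::string& key) {
    LruPosition& pos = positions_[key];
    if (pos.valid) {
      // Old approach: std::find(lru_.begin(), lru_.end(), key) -- O(N).
      // New approach: splice the node we already hold an iterator to -- O(1).
      lru_.splice(lru_.begin(), lru_, pos.it);
    } else {
      lru_.push_front(key);  // first time this key is tracked
      pos.valid = true;
    }
    pos.it = lru_.begin();
  }

  // Evict the least recently used key in O(1).
  void EvictOne() {
    if (lru_.empty()) return;
    positions_.erase(lru_.back());
    lru_.pop_back();
  }

 private:
  std::list<std::string> lru_;  // front = most recently used
  std::unordered_map<std::string, LruPosition> positions_;
};
```

Because splice only relinks an existing node, it neither copies the key nor invalidates the stored iterator, so the update cost stays flat as the cache grows.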

Refs on std::list iterator validity after modifying the list:

References and iterators to the erased elements are invalidated. Other references and iterators are not affected.

Lists have the important property that insertion and splicing do not invalidate iterators to list elements, and that even removal invalidates only the iterators that point to the elements that are removed

I will add perf tests at increasingly large cache sizes to verify that update time stays relatively flat when I add the other e2e Python unit tests. For this release, the performance has been verified locally and the existing functional tests all pass.

rmccorm4 commented 1 year ago

> To make sure I'm understanding correctly: the problem was that if the number of keys was large, std::find would take a long time, since it had to iterate over all of the keys when updating the LRU order.
>
> The std::find was required to update the LRU position of the current key being modified.
>
> The new logic stores the location of the CacheEntry in the LRU list as part of the struct to make updating the LRU faster (if the key already exists in the LRU, we know where it is stored and can go directly to that location).

@Tabrizian exactly.

nnshah1 commented 1 year ago

Question: when we evict, we directly pop from the LRU. If we did the same on insert, could we guarantee that the iterator is always valid and avoid the need for a separate flag? If I understand correctly, the flag is only needed once, on insert.
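For context, a hypothetical snippet (not from the repo) showing the one state the flag guards against: before a key's first insert, its iterator is default-constructed and does not point into any list, so using it would be undefined behavior.

```cpp
#include <list>
#include <string>

// Hypothetical names for illustration.
struct Entry {
  std::list<std::string>::iterator lru_it;  // singular until first insert
  bool in_lru = false;                      // the 'valid' flag in question
};

void Example() {
  std::list<std::string> lru;
  Entry e;
  // lru.splice(lru.begin(), lru, e.lru_it);  // UB: e.lru_it is singular here
  lru.push_front("key");
  e.lru_it = lru.begin();  // iterator now points at a live element
  e.in_lru = true;
  lru.splice(lru.begin(), lru, e.lru_it);  // safe and O(1) from now on
}
```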

guptap11 commented 1 year ago

Thanks @rmccorm4 for putting in the fix so quickly. Will this be available in the next release, 23.06?

rmccorm4 commented 1 year ago

Hi @guptap11, happy to help! Yes, this fix should be available in the 23.06 release.