LouisJenkinsCS / Persistent-Memory-Analysis-Tool

Persistent Memory Analysis Tool (PMAT) - Valgrind Plugin that provides an online sampling approach towards performing verification on persistent memory.
GNU General Public License v2.0
1 stars 0 forks source link

Explicit Data Structure for Simulated Processor Cache (SPC) #37

Open LouisJenkinsCS opened 4 years ago

LouisJenkinsCS commented 4 years ago

The SPC needs a more specialized and specific data structure than what is currently available. Right now it uses Valgrind's OSet, which is an ordered-set, implemented as an AVL Tree. As Michael has pointed out, it is easily possible to trim each store to a cache-line and then check in O(log(N)) (or even O(1)) time to check if it exists in the set. I am of the mindset that a Tree-based data structure which maintains the number of children it has, would be a much better choice. It has the following properties...

1) All access is O(log(N)) which is O(1) given that the size of the cache is bounded. 2) Randomly sampling can be performed by weighting the left and right children and performing randomized/stochastic traversal to find a child to evict. 3) Optimizations that are specific to the SPC can be leveraged to speedup performance, which could be essential, especially for large-scale applications.

LouisJenkinsCS commented 4 years ago

Idea: Use a Splay Tree...

This doubles as an LRU Cache as the most-recently used nodes reach the root of the tree, and the random-eviction algorithm will only sample from the leaf nodes, meaning you have a gigantic stochastic LRU cache. Performance is going to be excellent when cache-lines are accessed more often than others as well, thanks to the properties of the splay tree.

LouisJenkinsCS commented 4 years ago

Okay, so it is definitely more advantageous to just use the VGHashTable instead.

https://github.com/LouisJenkinsCS/Persistent-Memory-Analysis-Tool/blob/4dcb03f360b9e8ff7c2612bd5a79cb61f8ab08ae/coregrind/m_hashtable.c#L45-L53

LouisJenkinsCS commented 4 years ago

In the end, it would be best to implement a generic way of selecting a cache-eviction mechanism...

1) MRU 2) LRU 3) Stochastic LRU 4) Random Replacement