The AbstractCache object which has the common APIs as follows and there can be "custom implementations" with different data storage like file, database etc. with structural equivalent to the abstract object.
public type AbstractCache abstract object {
public function put(string key, any value, int maxAgeInSeconds) returns Error?;
public function get(string key) returns any|Error;
public function invalidate(string key) returns Error?;
public function invalidateAll() returns Error?;
public function hasKey(string key) returns boolean;
public function keys() returns string[];
public function size() returns int;
public function capacity() returns int;
};
The AbstractEvictionPolicy object which has the common APIs as follows and there can be "custom implementations" with different eviction algorithms with structural equivalent to the abstract object. That custom implementation has to maintain the LinkedList data structure according to the eviction algorithm.
public type AbstractEvictionPolicy abstract object {
public function get(LinkedList list, Node node);
public function put(LinkedList list, Node node);
public function remove(LinkedList list, Node node);
public function replace(LinkedList list, Node newNode, Node oldNode);
public function clear(LinkedList list);
public function evict(LinkedList list) returns Node?;
};
The Ballerina Cache module provides a Cache object, which is a map data structure based implementation of the AbstractCache object. It is not recommended to insert () as the value of the cache since it doesn't make any sense to cache a nil. Also, it provides the LruEvictionPolicy object, which is based on the LRU eviction algorithm.
While initializing the Cache, the developer have to pass the following parameters as the cache configurations.
capacity - Max number of entries allowed for the cache
evictionPolicy - The policy which defines the cache eviction algorithm
evictionFactor - The factor which the entries will be evicted once the cache full
defaultMaxAgeInSeconds - Freshness time of all the cache entries in seconds. This value can be overwritten by the maxAgeInSeconds property, when inserting entry to cache. '-1' means, the entries are valid forever.
cleanupIntervalInSeconds - The interval time of the timer task which cleans the cache entries. This is an optional parameter.
For the better user experience the above config is initialized with default values as follows:
public type CacheConfig record {|
int capacity = 100;
AbstractEvictionPolicy evictionPolicy = new LruEvictionPolicy();
float evictionFactor = 0.25;
int defaultMaxAgeInSeconds = -1;
int cleanupIntervalInSeconds?;
|};
Discussion
There are 2 mandatory scenarios + 1 optional scenario where a cache entry get removed from the cache and maintain the freshness of the cache entries. The 2 independent factors which are eviction policy & freshness time of the cache entry governs the 3 scenarios.
When using the get API, if the return cache entry has expired, it get removed.
When using the put API, if the cache size has reached its capacity, number of entries get removed based on the 'eviction policy' and 'eviction factor'.
If cleanupIntervalInSeconds (optional property) is configured, the timer task will remove the expired cache entries based on the configured interval.
The main benefit of using the cleanupIntervalInSeconds optional property is that the developer can optimize memory usage while adding some additional CPU costs and vice versa. The default behavior is the CPU optimized method.
The concept of the Cache object is purely based on the Ballerina map data structure and a linked list data structure. Please refer the mail https://groups.google.com/d/msg/ballerina-dev/DOsQyUqrCe0/ii-vkSXHAAAJ. The key of the map entry would be a string and the value of the map entry would be a node of the linked list.
public type Node record {|
any value;
Node? prev = ();
Node? next = ();
|};
While using the cache, a CacheEntry record will be created and added as the value of the Node record. Node record will be inserted into the map data structure against the provided string key.
type CacheEntry record {|
string key;
any data;
int expTime;
|};
The linked list data structure is purely used for the eviction of the cache. According to the user configured eviction policy, when inserting / updating / retrieving cache entries, the linked list data structure should get updated. Therefore, when eviction happens, cache entries can be removed efficiently, without iterating the complete map data structure.
Example: If the eviction policy is LRU, always the MRU item will be the head of the linked list. When eviction happens, nodes from the tail will be deleted without iterating the map.
Further, developers can implement a custom caching implementations based on different cache storage mechanisms (file, database etc.) and different eviction policies (MRU, FIFO etc.) as they wish. Ballerina provides a "map-based cache" as the default cache implementation.
Implementation
This implementation introduces the v2.0.0. of the Ballerina cache API and it replaces all the current usages of the cache. This will be released with Ballerina v1.2.0.
With this cache v2.0.0 we were able to gain average 3x performance improvement compared to cache v1.0.0.
Samples
Cache Initialization
A basic sample cache of 100 capacity, which uses LRU as the eviction policy and eviction factor is 0.25 is as follows:
cache:Cache cache = new;
A basic sample cache of 1000 capacity, the eviction factor is 0.2, cache entry default freshness time as 1 hour, and clean up timer configured with 5 seconds interval is as follows:
An advanced sample cache which uses a custom eviction policy along with the default capacity, eviction factor, max age and cleanup interval is as follows:
public type CustomEvictionPolicy object {
*cache:AbstractEvictionPolicy;
public function get(LinkedList list, Node node) { // custom implementation }
public function put(LinkedList list, Node node) { // custom implementation }
public function remove(LinkedList list, Node node) { // custom implementation }
public function replace(LinkedList list, Node newNode, Node oldNode) { // custom implementation }
public function clear(LinkedList list) { // custom implementation }
public function evict(LinkedList list) returns Node? { // custom implementation }
}
cache:Cache cache = new({
capacity: 1000,
evictionPolicy: new CustomEvictionPolicy()
});
Cache Usage
The simple way of using the initialized cache without handling errors is as follows:
The advanced way of using the initialized cache with error handling is as follows:
cache:Error? result = cache.put("key1", "value1");
if (result is cache:Error) {
// implement what to do, if any error happen when inserting item to cache
}
any|cache:Error result = cache.get("key1");
if (result is cache:Error) {
// implement what to do, if any error happen when retrieving item from the cache
}
string value = <string>result;
cache:Error? result = check cache.invalidate("key1");
if (result is cache:Error) {
// implement what to do, if any error happen when discarding item from the cache
}
cache:Error? result = check cache.invalidateAll();
if (result is cache:Error) {
// implement what to do, if any error happen when discarding item from the cache
}
boolean hasKey = cache.hasKey("key1");
string[] keys = cache.keys();
int size = cache.size();
int capacity = cache.capacity();
Description: This issue is created to track the implementation step of https://github.com/ballerina-platform/ballerina-lang/issues/20794
Discussion related to the design: https://groups.google.com/d/msg/ballerina-dev/Im0BpGqwBhU/zuIdkZ9FBQAJ
Summary
The
AbstractCache
object which has the common APIs as follows and there can be "custom implementations" with different data storage like file, database etc. with structural equivalent to the abstract object.The
AbstractEvictionPolicy
object which has the common APIs as follows and there can be "custom implementations" with different eviction algorithms with structural equivalent to the abstract object. That custom implementation has to maintain theLinkedList
data structure according to the eviction algorithm.The Ballerina Cache module provides a
Cache
object, which is a map data structure based implementation of theAbstractCache
object. It is not recommended to insert()
as the value of the cache since it doesn't make any sense to cache a nil. Also, it provides theLruEvictionPolicy
object, which is based on the LRU eviction algorithm.While initializing the
Cache
, the developer have to pass the following parameters as the cache configurations.capacity
- Max number of entries allowed for the cacheevictionPolicy
- The policy which defines the cache eviction algorithmevictionFactor
- The factor which the entries will be evicted once the cache fulldefaultMaxAgeInSeconds
- Freshness time of all the cache entries in seconds. This value can be overwritten by themaxAgeInSeconds
property, when inserting entry to cache. '-1' means, the entries are valid forever.cleanupIntervalInSeconds
- The interval time of the timer task which cleans the cache entries. This is an optional parameter.For the better user experience the above config is initialized with default values as follows:
Discussion
There are 2 mandatory scenarios + 1 optional scenario where a cache entry get removed from the cache and maintain the freshness of the cache entries. The 2 independent factors which are eviction policy & freshness time of the cache entry governs the 3 scenarios.
get
API, if the return cache entry has expired, it get removed.put
API, if the cache size has reached its capacity, number of entries get removed based on the 'eviction policy' and 'eviction factor'.cleanupIntervalInSeconds
(optional property) is configured, the timer task will remove the expired cache entries based on the configured interval.The main benefit of using the
cleanupIntervalInSeconds
optional property is that the developer can optimize memory usage while adding some additional CPU costs and vice versa. The default behavior is the CPU optimized method.The concept of the
Cache
object is purely based on the Ballerina map data structure and a linked list data structure. Please refer the mail https://groups.google.com/d/msg/ballerina-dev/DOsQyUqrCe0/ii-vkSXHAAAJ. The key of the map entry would be a string and the value of the map entry would be a node of the linked list.While using the cache, a
CacheEntry
record will be created and added as the value of theNode
record.Node
record will be inserted into the map data structure against the provided string key.The linked list data structure is purely used for the eviction of the cache. According to the user configured eviction policy, when inserting / updating / retrieving cache entries, the linked list data structure should get updated. Therefore, when eviction happens, cache entries can be removed efficiently, without iterating the complete map data structure.
Example: If the eviction policy is LRU, always the MRU item will be the head of the linked list. When eviction happens, nodes from the tail will be deleted without iterating the map.
Further, developers can implement a custom caching implementations based on different cache storage mechanisms (file, database etc.) and different eviction policies (MRU, FIFO etc.) as they wish. Ballerina provides a "map-based cache" as the default cache implementation.
Implementation
This implementation introduces the v2.0.0. of the Ballerina cache API and it replaces all the current usages of the cache. This will be released with Ballerina v1.2.0.
With this cache v2.0.0 we were able to gain average 3x performance improvement compared to cache v1.0.0.
Samples
Cache Initialization
A basic sample cache of 100 capacity, which uses LRU as the eviction policy and eviction factor is 0.25 is as follows:
A basic sample cache of 1000 capacity, the eviction factor is 0.2, cache entry default freshness time as 1 hour, and clean up timer configured with 5 seconds interval is as follows:
An advanced sample cache which uses a custom eviction policy along with the default capacity, eviction factor, max age and cleanup interval is as follows:
Cache Usage
The simple way of using the initialized cache without handling errors is as follows:
The advanced way of using the initialized cache with error handling is as follows:
References
The design was influenced by https://github.com/chethiya/ballerina-cache by @chethiya
Related Discussions:
https://groups.google.com/d/msg/ballerina-dev/2aECiDPnL6A/5s9XmoG2AAAJ https://groups.google.com/d/msg/ballerina-dev/4VkPqATJ1tI/63PW3MR0DgAJ
Related Resource:
https://en.wikipedia.org/wiki/Cache_replacement_policies https://guava.dev/releases/21.0/api/docs/com/google/common/cache/Cache.html http://chethiya.github.io/ballerina-lru-cache.html https://github.com/chethiya/ballerina-cache/blob/master/src/cache/lru-cache.bal https://www.geeksforgeeks.org/lru-cache-implementation/ https://www.interviewcake.com/concept/java/lru-cache https://stackoverflow.com/questions/2504178/lru-cache-design
Fixes https://github.com/ballerina-platform/ballerina-lang/issues/21051 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/19187 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/19557 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/19487 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/20614 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/21545 Fixes https://github.com/ballerina-platform/ballerina-lang/issues/21302