To kick off the project, the first step is to document the key names and patterns (which may as well go into the markdown documentation). For example, for a payload entry:
```
KeyName:TenantId:EntityAnalysisModelId:Key:Value
```
where the hash field (the key within the HSET) is composed of:
```
CreatedDateTimestamp:ReferenceDateTimestamp:EntityAnalysisModelInstanceEntryGuid:Bytes
```
So, selecting in an indexed manner might be:
```
Payload:1:1:AccountId:123456
```
The hash field carries several helpful values, so the HSET can be iterated and filtered for the most common cases without going to the MessagePack byte array:
```
1713689193:1713689193:dc77732f-c211-435e-82eb-5211d2631105:690
```
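As a minimal sketch of how a payload entry might be written under this key model, assuming StackExchange.Redis and MessagePack-CSharp as the client libraries (the `PayloadEntry` type and its fields are hypothetical; the issue does not show Jube's actual payload model):

```csharp
using System;
using MessagePack;
using StackExchange.Redis;

public static class PayloadCacheWriter
{
    // Hypothetical payload shape, for illustration only.
    [MessagePackObject]
    public class PayloadEntry
    {
        [Key(0)] public string AccountId { get; set; } = "";
        [Key(1)] public double Amount { get; set; }
    }

    public static void Write(IDatabase db, int tenantId, int modelId, PayloadEntry entry)
    {
        // Key pattern: KeyName:TenantId:EntityAnalysisModelId:Key:Value
        var key = $"Payload:{tenantId}:{modelId}:AccountId:{entry.AccountId}";

        // The value stored against the hash field is the MessagePack byte array.
        byte[] bytes = MessagePackSerializer.Serialize(entry);

        // Hash field: CreatedDateTimestamp:ReferenceDateTimestamp:EntityAnalysisModelInstanceEntryGuid:Bytes
        long now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();
        var field = $"{now}:{now}:{Guid.NewGuid()}:{bytes.Length}";

        db.HashSet(key, field, bytes);
    }
}
```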
The above values are promoted largely because subsequent filtering will take place by iteration. Bytes are held so that it is possible to iterate keys and identify the tenants and models using the most space, which might be helpful if we ever get around to creating a cloud offering, in which case the Redis and PostgreSQL overhead will need to be charged for.
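As a hedged sketch of that iteration, filtering on the promoted Unix timestamps (plain `long` comparisons, which is the efficiency supposed below) and summing the promoted byte counts per key, without touching the MessagePack values (the method name is illustrative; an `IServer` is obtained from `ConnectionMultiplexer.GetServer`):

```csharp
using System.Collections.Generic;
using StackExchange.Redis;

public static class CacheUsage
{
    // Sum stored bytes per Payload key, counting only entries whose
    // CreatedDateTimestamp falls inside [fromUnix, toUnix]. The promoted
    // values in the hash field mean the MessagePack bytes never need
    // to be deserialised for this kind of housekeeping.
    public static Dictionary<string, long> BytesByKey(
        IServer server, IDatabase db, long fromUnix, long toUnix)
    {
        var totals = new Dictionary<string, long>();
        foreach (var key in server.Keys(pattern: "Payload:*"))
        {
            long subtotal = 0;
            foreach (var entry in db.HashScan(key))
            {
                // Field layout: Created:Reference:Guid:Bytes
                var parts = ((string)entry.Name).Split(':');
                long created = long.Parse(parts[0]);
                if (created < fromUnix || created > toUnix) continue;
                subtotal += long.Parse(parts[3]);
            }
            totals[key] = subtotal;
        }
        return totals;
    }
}
```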
For every Cache class in the application, start by creating such a key model.
It is supposed that using Unix timestamps will be more efficient given the amount of filtering based on date, but this will be tested.
PostgreSQL is currently used for caching transaction data. While it cannot be considered an in-memory database, its shared buffers mean that read performance, although slower than an in-memory cache, is not slower to an extent that materially affects response time when traded off against the durability guarantees PostgreSQL provides. Read performance aside, in-memory databases are extremely expensive to run, a nightmare to administer, and demand a degree of duplication - in Jube at least - given its key-value-pair access paths (where PostgreSQL queries are indexed on multiple keys, those keys would instead be duplicated into the Redis key, with the transaction history stored in its value HSET).
On writes there is no contest, and Jube response times are severely impacted by them. For example, reading the Abstraction keys overall might take 3 ms to 6 ms, while writing might take 17 ms, which is hard to defend in a real-time system. Currently, writes to the cache are made in the transaction flow, which is important, as serialisation across requests is required. Ideally all writes would be moved out to background threads performing bulk inserts, but that would not provide the serialisation guarantees from transaction to transaction (consider a velocity rule made up of counts on the PostgreSQL cache). Turning asynchronous commit on brings some relief, but without moving to UNLOGGED tables (which attract their own problems) it still does not come close to the write performance desired.
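To make the serialisation requirement concrete, here is a hypothetical inline flow (the names are placeholders, not Jube's actual pipeline): the write completes before the response returns, so a velocity-style count taken by the next request for the same account already includes this transaction, whereas a background bulk insert would leave that count stale for a window:

```csharp
using System;
using StackExchange.Redis;

public static class VelocityExample
{
    // Write the current transaction inline, then count entries for the
    // account. Because the HSET write happens synchronously within the
    // transaction flow, the count observed by the next request is never
    // missing the transaction that immediately preceded it.
    public static long ProcessAndCount(IDatabase db, string accountId, byte[] packedTransaction)
    {
        var key = $"Payload:1:1:AccountId:{accountId}";
        long now = DateTimeOffset.UtcNow.ToUnixTimeSeconds();
        var field = $"{now}:{now}:{Guid.NewGuid()}:{packedTransaction.Length}";

        db.HashSet(key, field, packedTransaction); // inline write, serialised with the flow
        return db.HashLength(key);                 // velocity-style count includes this entry
    }
}
```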
Redis will be implemented as an in-memory database as follows.
In respect of the Abstraction cache:
In respect of the TTL Counter:
In respect of cached Abstraction values:
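The per-area details are not reproduced in this text. Purely as a hedged illustration of the TTL Counter case, a common Redis pattern is an increment paired with an expiry; the key shape and the sliding window here are assumptions, not the issue's design:

```csharp
using System;
using StackExchange.Redis;

public static class TtlCounter
{
    // Increment a windowed counter and let Redis expire it, rather than
    // counting rows filtered by date in PostgreSQL. Illustrative only.
    public static long Increment(IDatabase db, int tenantId, int modelId,
        string counterName, TimeSpan window)
    {
        var key = $"TtlCounter:{tenantId}:{modelId}:{counterName}";
        long value = db.StringIncrement(key);
        db.KeyExpire(key, window); // refresh the TTL on each increment (sliding window)
        return value;
    }
}
```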
The functionality will be optional; in the absence of an available Redis cache, the existing functionality will prevail.
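A minimal sketch of that optionality, assuming a simple connectivity check (the class and member names are placeholders):

```csharp
using StackExchange.Redis;

public class CacheRouter
{
    private readonly ConnectionMultiplexer _redis; // null when Redis is not configured

    public CacheRouter(ConnectionMultiplexer redis) => _redis = redis;

    // Prefer Redis when configured and connected; otherwise fall back to
    // the existing PostgreSQL cache path.
    public bool UseRedis => _redis != null && _redis.IsConnected;
}
```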
Connection strings to Redis should be maintained at the application level and should fully support multiple endpoints, such that FAILOVER can be invoked to resize Redis instances.
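For example, assuming StackExchange.Redis (the endpoint names are placeholders), a single application-level configuration string can name multiple endpoints, so the client can follow the promoted node after FAILOVER is issued while resizing:

```csharp
using StackExchange.Redis;

public static class RedisConnection
{
    // Both endpoints appear in one application-level connection string;
    // abortConnect=false lets the client keep retrying while one is down.
    public static ConnectionMultiplexer Connect() =>
        ConnectionMultiplexer.Connect(
            "redis-a.example.com:6379,redis-b.example.com:6379,abortConnect=false");
}
```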