Closed Dzeranov closed 1 month ago
After discussing the different options and their trade-offs, we decided to go with the following set of changes:
- Change the key from jobs:assigned:<user_wallet_address> to jobs:assigned<user_wallet_address>:<oracle_address>, which reduces the size of data stored per item (the access pattern on the UI is bound to the oracle address, so that's fine); we still store all items there, however.
- Filter out items older than the retention period (configurable via env; default is 45 days) before saving the new value in Redis; filtering should be based on the updated_at field, since it indicates that a job entered its final status.

A retention period of 45 days should let us reduce the number of stored items to a reasonable value. We could expect up to 10K assignments per key, but with our inputs that means working with up to 3MB of data in memory, and filtering + sorting such an array takes up to 20ms, which is affordable. If in the future we start to see performance degradation with many concurrent users, we can reduce the retention period and migrate to another storage engine, but only if it's really needed.
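To make the retention idea more concrete, here is a minimal sketch of pruning by updated_at before the value is written back, followed by the in-memory filter + sort + paging step. It is not the actual human-app code: the Assignment shape, the status field, and the function names pruneByRetention and getPage are assumptions made for illustration.

```typescript
// Hypothetical assignment shape: only `assignment_id` and `updated_at` appear in
// the discussion; `status` is assumed here for illustration.
interface Assignment {
  assignment_id: string;
  status: string;
  updated_at: string; // ISO 8601 timestamp
}

const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Keep only items whose `updated_at` falls within the retention window.
// Intended to run before the JSON array is saved back to Redis.
function pruneByRetention(
  assignments: Assignment[],
  retentionDays: number,
  now: Date = new Date(),
): Assignment[] {
  const cutoff = now.getTime() - retentionDays * MS_PER_DAY;
  return assignments.filter((a) => Date.parse(a.updated_at) >= cutoff);
}

// Filter by status, sort newest first, and return one page (0-based index).
// `filter` returns a new array, so the input is not mutated by `sort`.
function getPage(
  assignments: Assignment[],
  status: string,
  page: number,
  pageSize: number,
): Assignment[] {
  return assignments
    .filter((a) => a.status === status)
    .sort((a, b) => Date.parse(b.updated_at) - Date.parse(a.updated_at))
    .slice(page * pageSize, (page + 1) * pageSize);
}
```

With this shape, the retention period only needs to be read from configuration once and passed in as retentionDays, so lowering it later would not require touching the filtering logic.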
We could use Redis hashes with jobs:assigned<user_wallet_address>:<oracle_address> as the key, assignment_id as the fields, and the assignment JSON as the values, and set expiration per field. However, per-field expiration is only available starting with Redis 7.4.0, and we can't deploy that version on Render.
We've considered different ways of changing the data model, but none of them works well.
1. Using "Sorted Sets"
By using sorted sets (with e.g. expires_at as the score) we could reduce the amount of data we retrieve into app memory, but we would still need to implement some expiration mechanism so the set doesn't grow without bound.
This approach also increases the complexity of the filtering/sorting implementation (or requires removing some functionality from the UI).
2. Grouping items also by some extra field, e.g. by job status: jobs:assigned<user_wallet_address>:<oracle_address>:<job_status>
We would still have to think about an expiration mechanism, but for more than one key. In addition, the implementation is unnecessarily complex.
3. Storing "flat" items, i.e. jobs:assigned<user_wallet_address>:<oracle_address>:<assignment_id>
With this approach we could store and expire data easily, but retrieving pages of data is tricky: we would need to use the SCAN command with a key pattern, which iterates over all keys. That is not recommended in production for performance reasons.
It would be better for us to use a different storage engine that can store all assignments in a "flat" format (i.e. one item/record per assignment object) and perform all the necessary filtering/sorting operations at the engine level, so that our application only needs to send optimized queries.
Pros
Cons
@dnechay just wanted to point out that the assignment_id is unique per oracle, so for example we might have several oracles with assignment_id=1
Merged, not yet in production
Description
In order to avoid unnecessary stress on Exchange Oracle, we keep users' job assignment data in human-app using Redis. The current implementation uses the "String" data type (a basic K-V structure), where the key is jobs:assigned:<user_wallet_address> and the value is a JSON array of all of the user's assignments in all statuses (i.e. active, expired, completed). When we need to display some portion of the data for the "My Jobs" tab on the UI, human-app/server reads the whole JSON from Redis by its key into memory, then does in-memory filtering (assumed, not implemented yet) and sorting, and responds to the client with a prepared page of data.
Assigned jobs have different statuses. While we don't expect to have many items for "active" jobs, for statuses like "completed" and others the size of the stored data (i.e. the history log) will grow over time, and if we keep storing the full history under a single Redis key, then at some point in the future we will have different issues:
Motivation
The current approach will become inefficient over time: the size of the data stored under a single key keeps growing, which causes performance issues.
Implementation details
We want to change the way we store assignments in order to eliminate the aforementioned issues, or at least alleviate them for a long period of time, with the option of revisiting the decision in the future if needed.
However, there are potential pitfalls that need to be clarified first, so we will check different options.