Open lajarre opened 2 years ago
Hey @lajarre, to your point about
Whenever a new Treasury is added, create a key with the address of the treasury (like treasury_{addr}, with empty string value and an expiry of, eg, 14 days.
What exactly do you mean by an empty string value?
Also, for covalent_transfer_items
, covalent_treasury
, bitquery_eth
and covalent_prices
hashes, wouldn't defining an expiry of one day on those hashes work just as well?
What exactly do you mean by an empty string value?
Our code just need to look up a list of treasury addresses, which it would do with db.keys()
. No need for any specfic value, empty string is good enough.
db.keys()
would look like:
whitelist
treasury_0x0001
treasury_0x1111
treasury_0x1222
...
Also, for covalent_transfer_items, covalent_treasury, bitquery_eth and covalent_prices hashes, wouldn't defining an expiry of one day on those hashes work just as well?
As mentioned, our code expects that all pieces of data need an expiry of 1 day and that they get evicted one by one whenever the instance is short on memory. What I mean by "piece of data", is for example, any of the Redis hash values (think dict value) for the covalent_prices
hash (and not the whole hash). You don't want to evict the whole hash whenever memory runs low.
As mentioned, our code expects that all pieces of data need an expiry of 1 day and that they get evicted one by one whenever the instance is short on memory.
I get that.... but if our scheduler runs everyday on each treasury in our treasury list, meaning that each of the "data" keys are set each day at the same time, then wouldn't it make sense to do something like:
for treasury in treasuries
# get `transfers_for_today` for `asset` which `treasury.address` holds
db.hset("covalent_transfer_items", f"{treasury.address}_{asset}_1_{today}", transfers_for_today)
# do this for all `treasuries`...
tomorrow_at_midnight = (datetime.now() + timedelta(day=1)).replace(hours=0, minutes=0, seconds=0)
db.expireat("covalent_transfer_items", tomorrow_at_midnight)
This would essentially flush the covalent_transfer_items
key everyday at midnight way before the reload_treasuries_stats
task runs (right now this task runs daily at 00:30 UTC).
Note, obviously the code block above is not how its being done in whip, but hopefully it illustrates the point well enough.
By the way, balance data has timestamps that are always set to midnight on a given date.
This would essentially flush the covalent_transfer_items key everyday at midnight way before the reload_treasuries_stats task runs (right now this task runs daily at 00:30 UTC).
This piece of code is a great approach, as it defines an unambiguous and optimized expiry time. That's better than defaulting to the Redis instance eviction rule.
On the other hand, it doesn't help with granularity of eviction (and that's apparent in the code: you are setting expiries after the for loop is done). Our code still expects that each (treasury, asset, day) tuple is be evicted alone when memory runs low. The expectation doesn't play well with the hash structure.
I think I understand what you mean now. I also see what you meant for the solution to the data keys 😅.
From what I understand, the keys for covalent_transfer_items_{treasury_address}_{contract_address}_{chain_id}_{date}
will still have a value containing the "transfer items" and these keys will have an independent expiry time. That works.
There is also another set of data keys that are not addressed in the original post for this issue:
asset_hist_performance
asset_correlations
balances
These keys I think can get away with having a blanket key expiry. Also, on that note, shouldn't we be using these keys???From a high-level perspective I believe all the bugs and improvements listed in this issue are not fulfilled yet, so I'm reopening.
Agreed, forgot about the auto-close 😅
We have the following problems:
treasuries
is often evicted because of memory limits and we are just not making use of it.whitelist
is often evicted and thus whitelists are probably fetch too many times for nothing.covalent_transfer_items
,covalent_treasury
,bitquery_eth
andcovalent_prices
hashmaps are getting expired often, because of memory limits. Whenever one of these hashes is picked by the eviction algorithm, the entirety of the hash is evicted (and not the oldest keys).Let's break things down by kind of data:
Treasuries
Expectations:
Solution:
treasuries
key in a Redis instance that doesn't have a broad eviction rule.treasury_{addr}
, with empty string value and an expiry of, eg, 14 days.Whitelist
Expectations:
whitelist
object regularly, once a day. But not more often, as this would be useless use of bandwidth and create I/O delays.Solution:
whitelist
key in a Redis instance that doesn't have a broad eviction rule (same astreasuries
).sadd
, we could also just use a simple raw text key).Data
For all other pieces of data.
Expectations:
Solution:
covalent_transfer_items
,covalent_treasury
,bitquery_eth
andcovalent_prices
hashes in single entries, likecovalent_transfer_items_{treasury_address}_{contract_address}_{chain_id}_{date}
, etc.