Consider registering individual hook methods (instead of all-or-nothing)

holic commented 1 year ago

frolic — Today at 1:40 PM thinking about store hooks and we should prob move to either 1) separate registration array per hook method (eg set record) or 2) the registration includes a bitmap of all the hook methods used registering a hook right now means every operation goes through that hook, even if it’s a no op?

alvarius — Today at 1:42 PM we could say we can expect a store hook to implement all functions because data can be modified via each of them, it's similar to how an off-chain indexer has to support all store events and can't just ignore one

frolic — Today at 1:43 PM I’m just thinking this is the worst approach for gas in the hot code path

alvarius — Today at 1:44 PM we could do a bitmap if it doesn't add much gas (which might be reasonable since we can store it in the same storage slot) rn i can't think of a use case of only implementing one store hook function but not another though

frolic — Today at 1:45 PM yeah, there’s a trade off of my two proposals: iterate through all hooks but skip some (per store op) vs more registration gas cost (one time)

alvarius — Today at 1:45 PM do you have something in mind?

frolic — Today at 1:45 PM maybe I’m thinking more about system hooks where you may only want before or after could see potentially doing a before and after for store events too but can’t think of a specific use case right now one thought, prob not a great example: a hook that makes an array field append only

alvarius — Today at 1:48 PM we currently do that for field but not for record, because for handling a field update you might want to just read the entire record from storage before and after the update instead of manually reconstructing the new record with the field update applied in the hook could allow this for setRecord too but the usecase is less clear

frolic — Today at 1:50 PM this feels like one of those “I can’t think of a good example but maybe someone will think of one” and we get some potential gas benefits

curious if there’s a strong case for store hooks implementing all methods

alvarius — Today at 1:51 PM curious if there’s a strong case for store hooks implementing all methods just that if you want to react to data changes you probably want to react to all possible ways the data can change i see the point for system hooks though a use case for afterSetRecord might be if you want to actually change the storage value of what has been set

frolic — Today at 2:01 PM I think my main point is: first order effect is better gas (in the hot code path), second order effect is more granularity

alvarius — Today at 2:05 PM yeah i understand the argument for one array in terms of gas would be: if we expect that every hook implements all functions, one array would be cheaper if a table is modified twice in a transaction with different methods (setRecord, setField, deleteRecord, ...) because the storage is hot for every access after the first one but i see advantages for system hooks, and the potential disadvantages for store hooks are small enough to justify making it consistent for both imo

frolic — Today at 2:17 PM this makes sense, at the very least we could couple a “which hooks” bitmap with the address into the same slot

qbzzt commented 1 year ago

@alvrs : a use case for afterSetRecord might be if you want to actually change the storage value of what has been set.

Ori: I agree. I don't like the situation where we can fix some changes, but only revert out of others.

alvrs commented 1 year ago

Action items here:

[ ] Split onSetRecord into onBeforeSetRecord and onAfterSetRecord
[ ] Remove onBeforeSetField and onAfterSetField
[ ] Add onBeforeSpliceRecord and onAfterSpliceRecord
[ ] Split onDeleteRecord into onBeforeDeleteRecord and onAfterDeleteRecord
[ ] Add a bitmap to the store hooks and system hooks: One bit for each operation, so we can skip unnecessary calls.
[ ] Add ability to remove hooks (by filtering the array). Since it's possible to upgrade systems now it seems useful to be able to remove hooks and register new ones.

alvrs commented 1 year ago

Approach A: a single Hooks table with flags for enabling/disabling hooks Approach B: individual lists for each type of hook (i.e. the hooks table changes its schema from { hooks: address[] } to { beforeSetRecord: address[], afterSetRecord: [], ... })

Tradeoffs between approach A and B

Note the cost estimates are disregarding the cost to read the dynamic data length because it's the same for both approaches.

Reading one hook from cold storage costs approximately 2100 gas (cost for cold SLOAD)
Reading three hooks from cold storage costs approximately 4100 gas (cost for two SLOAD since it can be packed in two slots)
Reading one hook from warm storage cost approximately 100 (cold for warm SLOAD)
Reading three hooks from cold storage costs approximately 200 gas (cost for two SLOAD since it can be packed in two slots)
Checking the bitmap for whether a hook is enabled or not costs approximately 150 gas (based on first measurements in #1399, could potentially be optimized)
When spreading the hooks over different arrays based on their type, we always use cold loads because each array is saved in a different storage location
When saving all hooks to a single array we use cold loads for the first time it is loaded, then either only memory cost if it's in a single storage operation or warm storage load cost if it's in separate storage operations but the same transaction.

Example calculations

Scenario 1: We have three subscribers, all of which only care about the before events of the respective operation.
- In approach A we read a the three hooks from storage (4100 gas), and perform the bitmap checks on each of them twice (once for before, once for after, 6x150) = 5000 gas
- In approach B we read the three hooks from storage from the before array (4100 gas), don't need to perform bitmap checks, and don't need to access storage for the after array = 4100 gas
- In this scenario approach B would always be cheaper because there is less overhead of bitmap checks and we don't access unnecessary storage.
Scenario 2: We have three subscribers, all of which care about the before and after event of the respective operation
- In approach A we read the three hooks from storage (4100 gas), and perform the bitmap checks on each of them twice (once for before, once for after, 6x150) = 4100 gas
- In approach B we read the three hooks from storage from two different storage locations (8200 gas) and don't need to perform bitmap checks
- In this scenario we approach A is cheaper in any case (because checking the bitmap twice is much cheaper than loading from a cold storage slot)
Scenario 3: We have six subscribers, three care about the before event and three care about the after event.
- In approach A we read six hooks from storage (8200 gas) and have to check the bitmap twice for each of them (12x150) = 10000 gas
- In approach B we read three hooks from the first array (4100 gas) and three hooks from the second array (4100 gas) and don't check any hooks (8100 gas)
- In this scenario approach B would be always be cheaper because of the missing overhead for checking the bitmap
Scenario 4: We have three subscribers that only care about the before event of the respective operation, but we have a transaction with two storage operations (eg one setRecord and one setField).
- In approach A we load three hooks from storage (4100 gas) and perform the bitmap checks on each of them twice (once for before, once for after, 6x150). In the second storage operation we again load three hooks but now from warm storage (200 gas) and perform the same bitmap checks as before (6x150) = 6100 gas
- In approach B we load three hooks from cold storage in the first operation (4100 gas) and three from still cold storage in the second operation (since it's a different storage slot, 4100) = 8200 gas
- In this scenario approach A would always be cheaper due to the advantage of warm storage slots over cold slots.

Summary

In general, for each hook that cares about both the before and after event approach A is about 1800 gas cheaper than approach B (due to the missing SLOAD, minus the bitmap overhead), while for each hook that only cares about one of before or after approach B is about 300 gas cheaper than approach A (due to the missing bitmap overhead)

If there is a second storage operation on the same table in the same transaction, approach A is around 1700 gas cheaper than approach B (due to a warm SLOAD instead of a cold SLOAD, minus the bitmap overhead).

While I do think in most cases hooks would only care about either before or after and I estimate it to be rather uncommon to have multiple storage operations on the same table in a single transaction, the savings for approach A in those scenarios are much more significant (~6x), and 300 gas overhead is also not that significant for operations that likely cost 20k+ (because they're writing storage), so I would go with approach A.

latticexyz / mud

Consider registering individual hook methods (instead of all-or-nothing) #1159

Tradeoffs between approach A and B

Summary