CVF-50, CVF-22: storage packing and other gas optimizations

kmbarry1 commented 3 years ago

This PR is somewhat non-trivial and should therefore get at least two SC team member approvals.

Versus the diffbase, the gas cost of the if (orcl.pass()) orcl.poke; hot path is reduced by 1632 units under the Istanbul gas schedule. This may not seem like a huge amount (the overall cost for the operation is around 90K gas), but it should be worthwhile because:

this adds up over time as there will be multiple copies of this contract being called hourly;
this PR eliminates at least 2 SLOAD operations--the Berlin hardfork increases the cost of SLOAD from 800 to 2100, so after Berlin, the savings is actually significantly greater, and SLOAD costs could increase even more in the future;
the average dollar cost of block space (determined by ETH's USD value and the gas price) is unlikely to decrease but likely to increase based on historical precedent; and,
oracles are critical infrastructure that must work even under conditions of extreme network congestion--even a few hundred gas might be the difference between getting an update through in a timely fashion and negatively affecting all dependent contracts due to stale data.

I believe the Feed structure will be touched by future PRs focused on numerical improvements (particularly increasing the number of bits in the price value), so I have held off on doing any optimizations that rely on the details of that structure.

e18r commented 3 years ago

I don't know what to think about this PR.

On one hand, I think gas improvements, even if small, are increasingly important. I also think every smart contracts engineer should be proficient in Yul.

On the other hand, I wonder if the amount of gas saved makes up for the amount of obscurity this brings to the code. I wonder if the maintainability burden this introduces could have safety and/or security consequences.

This also makes me think more broadly about our way forward with smart contracts development in Maker. So far, we've used Yul for very specific cases. Is this a practice we'd like to start applying in every new contract? Making Yul part of our everyday development activities would help maintaining it. Also, I wonder if there are more effective ways of saving gas, such as the upcoming access lists.

In any case, I'm not opposed to merging this and I think it's safe.

kmbarry1 commented 3 years ago

I don't know what to think about this PR.

On one hand, I think gas improvements, even if small, are increasingly important. I also think every smart contracts engineer should be proficient in Yul.

On the other hand, I wonder if the amount of gas saved makes up for the amount of obscurity this brings to the code. I wonder if the maintainability burden this introduces could have safety and/or security consequences.

This also makes me think more broadly about our way forward with smart contracts development in Maker. So far, we've used Yul for very specific cases. Is this a practice we'd like to start applying in every new contract? Making Yul part of our everyday development activities would help maintaining it. Also, I wonder if there are more effective ways of saving gas, such as the upcoming access lists.

In any case, I'm not opposed to merging this and I think it's safe.

My thoughts:

It's a valid concern that starting to use Yul adds complexity/maintenance burden. And also a valid point that the more it's used, eventually you pass a threshold and it becomes part of the normal workflow.
IMO we are far past the point where any code path that will be heavily used needs ruthless gas optimization, and in particular, "complexity/maintainability" is no longer a valid reason to avoid inline assembly entirely.
Less-frequently called paths can still afford to make a tradeoff between simplicity and efficiency; this is why I only tried to optimize the hot path in this PR, and ignored everything else.
The greater risk from complexity must be balanced by better processes and tools for assurance.
It would be interesting to investigate removing most of the manual optimizations (except for adjusting variable widths and avoiding extra SLOADs for event emission arguments) and see how well solc can do with a high optimize-runs value...if it gets most or all of the benefits added by manual optimization, that's an argument that maybe we can avoid them and just lean on the optimizer.
BUT I trust my explicit optimizations more than whatever solc does under the hood...for each one, I clearly understand why it was done and why it is safe and correct. The solc optimizer has historically been a frequent location of compiler bugs. And you can only optimize a whole contract at a time--so let's say you only care about optimizing foo(); by turning up optimize-runs, you also risk the compiler introducing bugs into bar() and baz(). I also expect that explicit optimizations are more likely to be kept and not accidentally erased by a compiler upgrade or by someone changing the code--optimizers operate on heuristics, and if the heuristics or high-level structure changes, gas cost regressions can occur.

I think I will try the experiment I mentioned above--do the minimal set of hand optimizations and see what the optimizer can do in that case.

kmbarry1 commented 3 years ago

I split the gas test up into two since we'll soon stop calling pass() from the MegaPoker. If Nik confirms that nothing else calls pass() on-chain, I might undo those gas optimizations in favor of readability and just keep the optimizations in poke().

kmbarry1 commented 3 years ago

Regarding turning on the optimizer: it's a fairly significant savings, especially on a contract like this that gets called a lot.

Gas costs for poke (code from commit bcfabf7, Istanbul schedule): No optimizations: 90568 200 runs: 85058 100000 runs: 84874

200 runs seems like a good middle ground--this captures most of the savings with less risk from more aggressive optimization at higher runs values. Concretely, it's a 6% savings (5510 gas). At 200 Gwei gas and $2000 ETH, that equates to a dollar cost reduction of $2.2 per poke. Assuming 10 LP oracles (the current amount), poked once per hour, the yearly savings comes out to $192,720.

NiklasKunkel commented 3 years ago

I split the gas test up into two since we'll soon stop calling pass() from the MegaPoker. If Nik confirms that nothing else calls pass() on-chain, I might undo those gas optimizations in favor of readability and just keep the optimizations in poke().

nothing else calls pass on-chain

NiklasKunkel commented 3 years ago

I understand the trade-off between optimization and readability/maintainability. I agree with the general sentiment that we should prefer explicit optimizations over leaning on the optimizer too much for both already cited reasons:

turning on the optimizer affects the whole contract, not isolated functions, and risks inserting bugs.
the optimizer has a colorful history of bugs.

That being said, the frequency with which poke() is called is only going to increase over time as we onboard more collateral assets into the protocol, and it is unlikely that gas costs decrease don't increase over time. The cumulative gas costs are already in the millions of dollars a year, and likely to hit the tens of millions of dollars a year in the future. Any optimizations we make can lead to huge savings in terms of costs incurred to the protocol. Additionally it enables us to price our Oracles services more competitively. I think Kurt has the right idea here that conservative use of the optimizer w/ 200 runs gives us most of the gas savings without being too aggressive.

kmbarry1 commented 3 years ago

If I remove all inline assembly and keep the optimizer at 200 runs, the gas cost of poke() increases by 1647. At 200 Gwei gas and $2000 ETH, that's $0.65 per poke(); assuming 10 oracles poked hourly for a year, the yearly savings is $57,710. To me, that's enough to keep the hand optimizations in poke.

pass only increases by 74 gas, and since it won't be called on-chain by the newer MegaPoker, I think I'll remove its inline assembly for readability.

kmbarry1 commented 3 years ago

Needs two SC approvals, then can merge.

See previous discussion for rationale behind settling on 200 runs for the optimization setting, and keeping in-line assembly optimizations in poke().

kmbarry1 commented 3 years ago

Rebased on the latest master. The seek precision optimization actually improved the poke gas cost by a nontrivial amount.

makerdao / univ2-lp-oracle

CVF-50, CVF-22: storage packing and other gas optimizations #41