Forge gas reporting is misleading due to obfuscation of transaction overhead

emo-eth commented 11 months ago

Component

Forge

Have you ensured that all of these are up to date?

[X] Foundry
[ ] Foundryup

What version of Foundry are you on?

forge 0.2.0 (0ae39ea 2023-12-11T00:27:32.487222000Z)

What command(s) is the bug in?

forge test

Operating System

macOS (Apple Silicon)

Describe the bug

Forge incorrectly (or, depending on your point of view, inconsistently) reports gas usage whenever the EVM incurs a gas refund.

This is due to the fact that gas refunds are capped at 1/5 of total transaction gas usage and credited at the end of the transaction. However, Forge simply subtracts the 21000+calldata fee overhead from the total gas usage (post-refund) when reporting.

For complicated tests that incur gas refunds (ie, tests that use a lot of gas compared to the tx overhead), reported numbers should be ~accurate. Simpler tests, however, can see very "inaccurate" reported numbers. See the comments in the included examples.

My recommendation: Forge should subtract the tx overhead from the Gas context before calculating the total refund, or else completely rethink how gas is reported. IMHO, using the tx-level gas usage for refund calculations, when Forge tests may contain many logical "transactions" is extremely confusing. Quietly including tx-overhead in refund calculations also makes manual gas accounting much more confusing.

// SPDX-License-Identifier: MIT
pragma solidity ^0.8.17;

import {Test} from "forge-std/Test.sol";

contract RefundTest is Test {
    uint256 i;
    // use low-level calls to avoid solc-injected EXTCODESIZE checks
    uint256 constant pauseSelector = 0xd1a5b36f;
    uint256 constant resumeSelector = 0x2bcd50e0;

    function testNoRefundWithPause() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        // reported: ~13000 (VM_ADDRESS is not warm, and two calls are made)
    }

    function testBadRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            mstore(0, pauseSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
            sstore(i.slot, 5)
            mstore(0, resumeSelector)
            pop(staticcall(gas(), VM, 0x1c, 4, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~13000
        // tx overhead: ~21000
        // total: ~34000
        // gas refund: 19900 > (34000 / 5 ) ? 34000 / 5 : 19900
        // reported: ~6200 = ~13000 - ~34000 / 5
    }

    function testNormalRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        assembly {
            sstore(i.slot, 0)
        }
        // business logic: ~35000
        // tx overhead: ~21000
        // total: ~56000
        // gas refund: 19900 > (56000 / 5 ) ? 56000 / 5 : 19900
        // reported: ~23800 = 35000 - (56000) / 5
    }

    function testNoRefund() public {
        address VM = VM_ADDRESS;
        assembly {
            pop(staticcall(0, VM, 0, 0, 0, 0))
            sstore(i.slot, 5)
            pop(staticcall(0, VM, 0, 0, 0, 0))
        }
        burnGas(10000);
        // reported: ~35000
    }

    function testOverhead() public {}

    function burnGas(int256 amount) internal view {
        unchecked {
            uint256 start = gasleft();
            uint256 x;
            while (gasleft() > start - uint256(amount)) {
                x += 1;
            }
        }
    }
}

foundry.toml:

[profile.default]
src = 'src'
out = 'out'
libs = ['lib']
optimizer = true
optimizer_runs = 9_999_999
# via_ir = true
solc_version = '0.8.22'
bytecode_hash = 'none'
evm_version = 'shanghai'

Vectorized commented 11 months ago

Suggestion:

vm.startTxGasMetering(optionalAccessList);

uint txGasUsed = vm.stopTxGasMetering();

plotchy commented 11 months ago

Im not a big user of gas snapshots so don't have much skin in the game, but I do have some thoughts.

For user actions that become transactions, things like cold slots + refunds + accessLists are important to have correct to understand true cost.

using forge script style tx publishing is accurate, but afaik not possible to incorporate directly from your test files.

For developer iteration cycles, convenience and compartmentalized execution is important.

current gas metering within test functions is useful to iterate on snippets and 'pure' logic and may not make sense as txs. Ideally i wouldn't want any dispatch logic + tx overhead costs to increase the magnitude of my costs when iterating on this.

It seems like using forge script in all cases would be most accurate but applied universally could muddy the testing UX since not all tests make sense to perform as transactions. On the other hand, getting gas metering correct within tests seems extremely difficult when accounting for cold/warm + refunds + access lists.

TBH, it's difficult for me to intuit exactly how forge meters gas when things like cold/warm + setUp() + VM calls + test-fn-dispatch exist. I think the proposition here by @emo-eth has the same intention as the simplifying of dispatch logic + tx overhead costs when acting on pure logic. I like the direction, yet think this would add another layer that brings complexity to maintaining forge and muddies the intuition for what gas number is being output in tests. In reality thats probably a worthy tradeoff for better relative comparison.

Evalir commented 7 months ago

Wonder if we can consider this closed with the addition of the --isolate flag?

zerosnacks commented 4 months ago

Tentatively marking as resolved now that we have the --isolate flag that is enabled by default with --gas-report

Feel free to re-open if there are still actionable items here (cc @Vectorized / @emo-eth)

foundry-rs / foundry