api3dao / airnode

Airnode monorepo
https://docs.api3.org/
MIT License

Investigate gas estimation with large amounts of gas #1997

Closed dcroote closed 3 months ago

dcroote commented 4 months ago

A user on Discord is reporting gas estimation issues when fulfillment uses a large amount of gas:

For this kind of behavior, you must keep the gas of the fulfillment below 1.6M. The problem is that as the estimated gas increases, the estimate comes out a little lower than what the transaction actually uses.

I don't think it's your problem, because I've checked the Airnode code and you are doing the estimation like everybody else does, but you can start by creating a fulfillment that uses around 2M gas.

bbenligiray commented 4 months ago

As a note, the user reporting this seems to be using base-sepolia-testnet

https://sepolia.basescan.org/tx/0xff1400c4eb0fbcede0b01a456060f3702b0ae4504268d29fdcf44a8359671029#eventlog

tarik0 commented 4 months ago

The problem is not limited to Base Sepolia; it also happens on Ethereum Sepolia.

tarik0 commented 4 months ago
// SPDX-License-Identifier: MIT
pragma solidity ^0.8.20;

import "@api3/airnode-protocol/contracts/rrp/interfaces/IAirnodeRrpV0.sol";
import "@openzeppelin/contracts/access/Ownable.sol";

contract MockedRequester is
    Ownable
{
    address internal _airnodeRrp;
    address internal _airnode;
    address internal _sponsorWallet;
    bytes32 internal _endpointIdUint256;

    event Requested(bytes32 requestId);

    constructor() {}

    function setSettings(
        address airnodeRrp,
        address airnode,
        address sponsorWallet,
        bytes32 endpointIdUint256
    ) external onlyOwner {
        _airnodeRrp = airnodeRrp;
        _airnode = airnode;
        _sponsorWallet = sponsorWallet;
        _endpointIdUint256 = endpointIdUint256;

        IAirnodeRrpV0(_airnodeRrp).setSponsorshipStatus(address(this), true);
    }

    struct Request {
        uint256 fromId;
        uint256 toId;
        uint256 timestamp;
        bool hasReferral;
    }

    struct Response {
        uint256 rawSeed;
        uint256 timestamp;
        bool hasReferral;
    }

    function requestUint256(uint256 tokenIdCount) external onlyOwner {
        bytes32 requestId = IAirnodeRrpV0(
            _airnodeRrp
        ).makeFullRequest(
            _airnode,
            _endpointIdUint256,
            address(this),
            _sponsorWallet,
            address(this),
            this.fulfillUint256.selector,
            ""
        );

        _expectedRequests[requestId] = Request(0, tokenIdCount, block.timestamp, false);

        emit Requested(requestId);
    }

    mapping(bytes32 => Request) private _expectedRequests;
    mapping(bytes32 => Response) private _requestToResponse;
    uint256 private _totalProbability;

    function fulfillUint256(bytes32 requestId, bytes calldata data) external {
        // validate request
        Request memory seedRequest = _expectedRequests[requestId];
        if (msg.sender != _airnodeRrp) {
            revert();
        }

        // decode seed
        uint256 rawSeed = abi.decode(data, (uint256));

        // set request's response
        _requestToResponse[requestId] = Response(rawSeed, seedRequest.timestamp, seedRequest.hasReferral);

        // iterate over token ids
        for (uint256 i = seedRequest.fromId; i < seedRequest.toId; i++) {
            _totalProbability += (rawSeed ^ i) % type(uint32).max;
        }
        delete _expectedRequests[requestId];
    }
}

You can use this contract to test the fulfillment as I did. It is a mocked version of my actual fulfillment logic.

dcroote commented 4 months ago

@tarik0 - what sorts of values are representative for your tokenIdCount?

dcroote commented 4 months ago

@bbenligiray / @Siegrift - I spun up a local Airnode and a local hardhat network, deployed the contract (with a few slight variations), and tested increasingly large tokenIdCount values. This value is critical to gas consumption because it sets the loop iteration count in the fulfillment function (seedRequest.toId). Gas estimation was fine for values of 500 and 1000, as shown by the following Airnode DEBUG log snippets, respectively:

INFO Gas limit is set to 554767 (AirnodeRrp: 41995 + Fulfillment Call: 512772)
INFO Gas limit is set to 955425 (AirnodeRrp: 42019 + Fulfillment Call: 913406)
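
As a sanity check on the two log lines above, here is a small TypeScript sketch. It is not Airnode code: the regular expression and the names `parseGasLimitLog` and `extrapolateFulfillmentGas` are hypothetical. It confirms that each reported limit is the sum of its two parts and extrapolates the fulfillment-call estimate to a tokenIdCount of 2000. The linear growth model is an assumption, not a measurement.

```typescript
// Hypothetical helper: parses a "Gas limit is set to ..." log line.
interface GasLimitLog {
  total: number;
  airnodeRrp: number;
  fulfillmentCall: number;
}

function parseGasLimitLog(line: string): GasLimitLog | null {
  const pattern =
    /Gas limit is set to (\d+) \(AirnodeRrp: (\d+) \+ Fulfillment Call: (\d+)\)/;
  const match = pattern.exec(line);
  if (match === null) return null;
  return {
    total: Number(match[1]),
    airnodeRrp: Number(match[2]),
    fulfillmentCall: Number(match[3]),
  };
}

// Assumption: fulfillment gas grows linearly with the loop iteration count.
function extrapolateFulfillmentGas(
  a: { count: number; gas: number },
  b: { count: number; gas: number },
  targetCount: number
): number {
  const gasPerIteration = (b.gas - a.gas) / (b.count - a.count);
  return Math.round(b.gas + gasPerIteration * (targetCount - b.count));
}

const at500 = parseGasLimitLog(
  "INFO Gas limit is set to 554767 (AirnodeRrp: 41995 + Fulfillment Call: 512772)"
);
const at1000 = parseGasLimitLog(
  "INFO Gas limit is set to 955425 (AirnodeRrp: 42019 + Fulfillment Call: 913406)"
);

if (at500 !== null && at1000 !== null) {
  // Each total should equal the sum of its two parts.
  console.log(at500.total === at500.airnodeRrp + at500.fulfillmentCall);
  console.log(at1000.total === at1000.airnodeRrp + at1000.fulfillmentCall);
  // Rough fulfillment-call estimate for tokenIdCount = 2000.
  console.log(
    extrapolateFulfillmentGas(
      { count: 500, gas: at500.fulfillmentCall },
      { count: 1000, gas: at1000.fulfillmentCall },
      2000
    )
  );
}
```

Under the linear assumption, a tokenIdCount of 2000 would need roughly 1.7M gas for the fulfillment call alone, which is in the range where the user reports problems.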

However, when I tried 2000, it failed due to a timeout, and the source of the error was here:

https://github.com/api3dao/airnode/blob/228d91d7fd08f370dd852787c8eae1174d84aefa/packages/airnode-node/src/evm/fulfillments/api-calls.ts#L188-L196

indicating that there is a timeout occurring when trying to estimate the fulfillment call overhead. Full logs here:

2024-05-21 00:10:14 [2024-05-21 07:10:14.791] DEBUG Attempting to estimate required gas to fulfill API call for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84... Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] DEBUG Attempting to estimate AirnodeRrp overhead to fulfill API call for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84... Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] DEBUG Attempting to estimate fulfillment call overhead to fulfill API call for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84... Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] ERROR Fulfillment call overhead estimation failed for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84 with Error: Operation timed out Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] ERROR Error: Operation timed out
2024-05-21 00:10:14     at createGoError (/usr/local/share/.config/yarn/global/node_modules/@api3/promise-utils/build/cjs/index.js:49:30)
2024-05-21 00:10:14     at /usr/local/share/.config/yarn/global/node_modules/@api3/promise-utils/build/cjs/index.js:112:16
2024-05-21 00:10:14     at Generator.throw (<anonymous>)
2024-05-21 00:10:14     at rejected (/usr/local/share/.config/yarn/global/node_modules/@api3/promise-utils/build/cjs/index.js:6:65) Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] DEBUG Attempting to fulfill API call for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84... Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] INFO Submitting API call fail for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84... Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
2024-05-21 00:10:14 [2024-05-21 07:10:14.791] INFO Transaction:0x6a0a56b5188c4b62e7af3ff6b594886e8f38da44cadd3a6f0050588337fc43ae submitted for Request:0xe26cb0ec67ecbbd1fad28023b6e8715e7725c7f7561abfc8fcac94921e1a1f84 Coordinator-ID:161e98b7836ea821, Chain-ID:31337, Provider:exampleProvider, Endpoint-ID:0xfb87102cdabadf905321521ba0b3cbf74ad09c5d400ac2eccdbef8d6143e78c4, Sponsor-Address:0xf39Fd6e51aad88F6F4ce6aB8827279cffFb92266
tarik0 commented 3 months ago

@dcroote The amounts you've entered are almost the same as mine. I also had problems with more than 500 token IDs.

My guess is that the simulation times out as the transaction flow gets bigger, but I don't know why it still suggests a gas limit instead of throwing an error.

bbenligiray commented 3 months ago

I offered @tarik0 a 1 ETH bounty if he can prototype a solution that gets rid of the unexpected gas limit estimation discrepancy with larger fulfillment transactions. He's currently looking into it.

tarik0 commented 3 months ago

Our RRP fulfill gas limit is determined by summing two eth_estimateGas RPC calls:

  1. Dry Run Simulation
  2. Fulfillment Address Fulfill Call

In theory, this sum should slightly overestimate the actual gas usage, ensuring that the fulfillment won't revert due to an out of gas error. This approach works adequately for fulfillments requiring less than approximately 1.5M gas. However, for more extensive fulfillments, the gas estimation proves insufficient.
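
To make the two-part estimate concrete, here is a minimal TypeScript sketch. It is not the actual Airnode implementation: the `GasEstimator` type and all function names are hypothetical, each estimator stands in for one eth_estimateGas call, and the field names simply follow the Airnode log lines quoted earlier in this thread.

```typescript
// Hypothetical stand-in for one eth_estimateGas RPC call.
type GasEstimator = () => Promise<number>;

interface FulfillGasEstimates {
  airnodeRrpOverhead: number;
  fulfillmentCall: number;
}

// The gas limit is the plain sum of the two estimates.
function sumEstimates(parts: FulfillGasEstimates): number {
  return parts.airnodeRrpOverhead + parts.fulfillmentCall;
}

// Runs both estimators concurrently and combines their results.
async function estimateFulfillGasLimit(
  estimateAirnodeRrpOverhead: GasEstimator,
  estimateFulfillmentCall: GasEstimator
): Promise<number> {
  const [airnodeRrpOverhead, fulfillmentCall] = await Promise.all([
    estimateAirnodeRrpOverhead(),
    estimateFulfillmentCall(),
  ]);
  return sumEstimates({ airnodeRrpOverhead, fulfillmentCall });
}

// The transaction runs out of gas when it needs more than the limit.
function isLimitSufficient(gasLimit: number, gasNeeded: number): boolean {
  return gasLimit >= gasNeeded;
}

// Usage with stubbed estimators (values taken from the logs above).
estimateFulfillGasLimit(
  async () => 41995,
  async () => 512772
).then((gasLimit) => console.log(gasLimit));
```

The issue described here corresponds to `isLimitSufficient` returning false for large fulfillments: the summed estimate ends up slightly below what the real transaction consumes.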

Investigation

Initially, we suspected some OPCODE costs were missing from our calculations. Using Tenderly, I analyzed the fulfillment transactions. Upon comparing the dry run transaction with an empty RRP fulfillment transaction, I identified that the calculation missed an SSTORE (2,900 gas) opcode. Nonetheless, this was not the only discrepancy, as the gas difference varied across transactions of different sizes (10K, 20K, 30K).

We considered adding a fixed or dynamic gas buffer but opted for a deeper investigation. To trace the transactions, we employed debug_traceCall and debug_traceTransaction. Due to the absence of an archive Geth node with the debug API enabled, and Alchemy's restriction to the callTracer, we simulated the transactions using a local Hardhat environment that forks the Sepolia testnet.

Findings

After comparing the dry run + fulfill transaction with the AirnodeRrp fulfill transaction, I discovered an additional gas cost of 3,238. Adding this extra gas to the estimation resolved the issue in the local Hardhat environment.

After informing @bbenligiray, I started testing the extra buffer on Base Sepolia, but it failed because the buffer did not cover all of the missing gas (tested with 10K, 20K, and 30K). This discrepancy may be due to differences in transaction flow computations between Hardhat and live networks. The only definitive way to confirm this is to run the same tests against a live Sepolia or Base Sepolia archive Geth node that permits debug namespace API calls.

The simplest solution would be to pass the gas limit as a parameter or use a dynamic gas buffer within certain boundaries to prevent the revert issue.

For detailed traces and comparisons, you can refer to:

tarik0 commented 3 months ago

I've encountered an issue where gas estimation for the fulfill function using the RRP contract consistently results in an out of gas revert, even when trying to estimate gas as follows:

airnodeRrp.estimateGas.fulfill(
  request.id,
  request.airnodeAddress,
  request.fulfillAddress,
  request.fulfillFunctionId,
  request.data.encodedValue,
  request.data.signature,
  {
    ...options.gasTarget,
    nonce: request.nonce!,
  }
);

Despite using gas estimation, the transaction still fails with out of gas (see transaction: Sepolia BaseScan). Since the estimated and the actual transaction should have similar operations and gas costs, the issue might lie in the RPC estimation rather than the opcodes themselves.

The best workaround could be implementing a dynamic gas buffer with set boundaries to avoid these out of gas reverts.
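
A minimal sketch of such a bounded buffer in TypeScript. Every name and number here (`GasBufferOptions`, `applyGasBuffer`, the 10% rate, the bounds) is hypothetical and chosen only for illustration; none of it comes from Airnode.

```typescript
// Hypothetical options for a bounded, proportional gas buffer.
interface GasBufferOptions {
  bufferPercent: number; // e.g. 10 adds 10% of the estimate
  minBuffer: number; // smallest buffer ever added, in gas
  maxBuffer: number; // largest buffer ever added, in gas
  maxGasLimit: number; // hard cap on the final gas limit
}

function applyGasBuffer(estimate: number, opts: GasBufferOptions): number {
  // Buffer proportional to the estimate, rounded up to whole gas units.
  const proportional = Math.ceil((estimate * opts.bufferPercent) / 100);
  // Clamp the buffer into [minBuffer, maxBuffer].
  const buffer = Math.min(Math.max(proportional, opts.minBuffer), opts.maxBuffer);
  // Never exceed the configured cap.
  return Math.min(estimate + buffer, opts.maxGasLimit);
}

// Illustrative settings only; real values would need tuning per chain.
const exampleOptions: GasBufferOptions = {
  bufferPercent: 10,
  minBuffer: 5000,
  maxBuffer: 200000,
  maxGasLimit: 30000000,
};

console.log(applyGasBuffer(955425, exampleOptions));
```

Scaling the buffer with the estimate reflects the observation that the shortfall varied with transaction size, while the bounds keep small fulfillments from being over-padded and large ones from exceeding a sane cap.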

tarik0 commented 3 months ago

@bbenligiray Thanks for the bounty opportunity, but I'm out of ideas now and think I need to give up the bounty. I've traced the transactions with various tools and on various chains. I hope this investigation helps you. I will be watching this issue closely in the future.