FuelLabs / fuel-core

Rust full node implementation of the Fuel v2 protocol.
Other
57.96k stars 2.79k forks source link

Optimize GraphQL "coins to spend" query #2391

Open rafal-ch opened 4 weeks ago

rafal-ch commented 4 weeks ago

The source of this issue is https://github.com/FuelLabs/fuel-core/issues/623.

Initially, the idea was to remove all complex queries from fuel-core and make it part of the indexer's responsibility. But the SDK is deeply integrated with these basic queries, that we need to support them on fuel-core. For that, we need to optimize the following query by aggregating some information in the off-chain worker or adding new indexes to the database:

Extracted from the https://github.com/FuelLabs/fuel-core/issues/1965, which initially covered optimizations for 3 different areas: balances, coins to spend, dry run.

rafal-ch commented 1 week ago

This issue will be delivered in at least 2 PRs:

  1. Add ability to store coins to spend in an sorted way
  2. Leverage the fact that coins are sorted in the selection algorithm

How to achieve pt. 1:

  1. We'll use two DBs/columns:
    1. main_db - the current one
    2. index_db - the new, indexation DB
  2. Use the separate DB/column to build an "reverse lookup index" which will use the RocksDB key sorting capabilities
    1. RocksDB uses lexicographical sort
    2. To maintain the ordering, the integer representing the amount will be converted to big-endian bytes
    3. Each entry will have a key USER|ASSET|AMOUNT, and value is going to be a key from the main_db
      • this will allow querying by USER|ASSET prefix, getting all coins in sorted order
      • TODO [needed, useful?]: Add ability to retrieve only coins "not greater" and/or "not less" than X
  3. To solve the conflicts (for example: user has two coins with same value for the same asset ID), each coin entry will have an unique suffix (utxo_id). This suffix will also be used to quickly exclude coins.
Example DB structure: main_db key: utxo_id amount block_created tx_created_idx asset_id owner
X1 1 3 12 BTC Alice
X2 100 3 10 BTC Alice
X3 10 3 11 BTC Alice
Y1 54321 4 101 LCK Bob
Y2 12345 4 100 LCK Bob
X4 20 3 14 ETH Alice
Corresponding index_db, assuming that the coins were added in the order presented above: key[^1] value
Alice-BTC-0000000000000001-X1 empty
Alice-BTC-000000000000000a-X3 empty
Alice-BTC-0000000000000064-X2 empty
Alice-ETH-0000000000000014-X4 empty
Bob-LCK-0000000000003039-Y2 empty
Bob-LCK-000000000000d431-Y1 empty

[^1]: key is a concatenation of owner, asset_id, big-endian encoded amount, utxo_id (for conflict avoidance and coin exclusion)

How to achieve pt. 2:

  1. We try to fulfill the request using the biggest coins available
  2. If there is not enough value, return empty set
  3. If there is enough value, put the coins into the set
  4. If there are still free slots available (ie. user provided higher limit that the number of coins we found in the point above), fill the remaining slots with dust coins (smallest ones)

Remarks:

  1. Both main coins and dust coins should respect the "excluded" filter
  2. Coins selected as main coins should not be added as dust later in the process

Questions:

  1. Can we specify the same asset multiple times in coins_to_spend query? - Nope - see here.
    • ⚠ The duplicate check is not included in the PoC

The PoC implementation of the above is available here: https://github.com/rafal-ch/coins_to_spend_poc

rafal-ch commented 1 week ago

Current flow:

  1. coinsToSpend() arrives at the GraphQL API
  2. Coins are collected
    1. off_chain DB is queried for "owned coin ids"
      • these are taken from OwnedCoins storage
    2. coin exclusion filter is applied
    3. coins are read from on_chain DB (coins_iter())
      • these are taken from Coins storage
  3. if querying for base_asset_id, the above is repeated for "messages":
    1. off_chain DB is queried for "owned message ids"
      • these are taken from OwnedMessageIds storage
    2. messages are read from on_chain DB (messages_iter())
      • these are taken from Messages storage
  4. random_improve() algo is used to select coins
    • if it cannot satisfy the request, fallback to largest_first() algorithm

New flow:

  1. Introduce AvailableCoinsManager
    • optionally, let it support plugable selection algo
  2. Notify it about every change to Messages and Coins storage
    • AvailableCoinsManager will update the internal state in the off_chain DB indexation cache
    • this should be done on the same DB transaction for consistency
      • TODO: check how Coins and OwnedCoins are kept in sync
  3. coinsToSpend() arrives at the GraphQL API
  4. query AvailableCoinsManager for quick answer based on the internal state