cowprotocol / services

Off-chain services for CoW Protocol
https://cow.fi/

chore: Improve time to create Auction #2831

Closed: fleupold closed this 2 days ago

fleupold commented 1 month ago

Background

Updating solvable orders (i.e. creating a new auction) currently takes >2s, with some pretty heavy outliers (logs)

This makes it hard to bring CoW Protocol's auction rate down to one batch per block, as simply creating up-to-date state would take >15% of the total time we have at hand. We should at least be able to halve this time (if not get it down even more)

Details

The first step should be to get a better understanding of which of the different parts of the auction creation takes the most time.

It's very likely that this is related to us always computing the auction "from scratch". We should see if we can switch to a more incremental approach where only the necessary on-chain state is queried after the block for the next auction has been processed, while everything else is kept ready and warm throughout the 12s (e.g. whatever we do when a new order is placed should happen as soon as we notice the new order).
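As a starting point for the measurement, something along these lines might already be enough (a rough sketch assuming the tracing crate; the phase names and call sites are made up):

```rust
use std::time::Instant;

/// Hypothetical helper to time one phase of the auction creation and log its
/// duration, so the logs show where the >2s actually go.
async fn timed<T, F>(phase: &'static str, fut: F) -> T
where
    F: std::future::Future<Output = T>,
{
    let start = Instant::now();
    let result = fut.await;
    tracing::debug!(
        phase,
        elapsed_ms = start.elapsed().as_millis() as u64,
        "auction creation phase finished"
    );
    result
}
```

Wrapping the existing steps like `timed("fetch open orders", database.open_orders()).await` (names invented here) should quickly show whether the database query, balance fetching, or something else dominates.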

Acceptance criteria

p95 time to create a new auction is <1s

sunce86 commented 1 month ago

I looked into this. Here are some numbers for the production database.

The orders table has ~3.2M orders overall. The OPEN_ORDERS query returns the currently solvable/open orders and takes around 3.5s to execute fully.

The two most costly filters in the OPEN_ORDERS query are on the orders table:

  1. the filter by valid_to
  2. the filter by status, which joins against the trades table to check whether the order is fulfilled.

If we start with a simplified OPEN_ORDERS query that takes all 3.2M orders from orders and filters them only by valid_to, the query takes ~300ms (this reduces the 3.2M orders to ~140k orders).

If we add the filtering by status on top of ☝️, the query takes 2.2s (this reduces the ~140k orders to ~20k orders).

This means we have to eliminate these two conditions and cache their results somehow, so that we can work with the reduced list of ~20k orders. Possible solutions:

  1. Add a new field to the orders table that caches the status, i.e. whether the order IS DEFINITELY CLOSED. This field wouldn't have to be refreshed on each trade but only occasionally (once a day, for example), and together with valid_to (which is already part of the orders table) it would reduce the number of orders from 3.2M to ~20k.

  2. Create a new table closed_orders that caches the status information. This is a less performant option as we still need to match ~140k orders by order id (orders table -> closed_orders table). AFAIS matching 140k rows across the two tables adds 200-400ms, which is significant.

  3. Split the orders table into two tables: a closed_orders table (~3.18M orders) and a maybe_open_orders 🤡 table (~20k orders). Have a periodic maintenance function that moves orders that have closed from maybe_open_orders to closed_orders. The OPEN_ORDERS query would only look at maybe_open_orders, so it would be super fast.

  4. Simply create an additional table open_orders that serves solely to return solvable/open orders. Whenever a new order is posted, it's inserted into both the orders and open_orders tables. The open_orders table is periodically filtered and closed orders are simply deleted. So open_orders holds a duplicated subset of the orders table's data.

  5. Any other idea?

I think I would be in favor of (4) as the simplest and fastest solution.
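Just to illustrate the shape of (4), a minimal sketch of the periodic cleanup assuming sqlx (table and column names are made up here, not the actual schema):

```rust
use sqlx::PgPool;

/// Hypothetical maintenance task for option (4): `open_orders` is a duplicated
/// subset of `orders`, and rows that are no longer solvable get deleted
/// periodically.
async fn prune_expired_open_orders(pool: &PgPool, now: i64) -> sqlx::Result<u64> {
    let result = sqlx::query("DELETE FROM open_orders WHERE valid_to < $1")
        .bind(now)
        .execute(pool)
        .await?;
    // Fully filled or cancelled orders would be removed by similar statements
    // joining against the trades / cancellation data.
    Ok(result.rows_affected())
}
```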

fleupold commented 1 month ago

I wonder if the btree index on the trades table is sufficient, or if - specifically for this join - we should add another order_uid index?
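For illustration, the kind of index in question could be added roughly like this (a sketch assuming sqlx; whether it's needed depends on the existing indices and the actual query plan):

```rust
use sqlx::PgPool;

/// Hypothetical migration adding a dedicated btree index on trades.order_uid
/// for the status join.
async fn add_trades_order_uid_index(pool: &PgPool) -> sqlx::Result<()> {
    sqlx::query("CREATE INDEX IF NOT EXISTS trades_order_uid ON trades (order_uid)")
        .execute(pool)
        .await?;
    Ok(())
}
```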

I wouldn't persist any additional information but simply move to an incremental solvable orders implementation which is managed in memory:

  1. Upon startup we fetch all orders and trades (separately) to build the book of open orders. It's ok if this takes a bit since it only happens once.
  2. We store the last seen timestamp for orders, and last seen block number for trades
  3. When refetching, we only query orders with creation/cancellation timestamp > last seen and trades with block > last seen
  4. From this information we build an updated solvable orders table, and update the last seen timestamps/block numbers

We might need some additional btree indices for efficient range queries, but I believe this should speed things up quite significantly.
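A rough sketch of the in-memory bookkeeping this would imply (types and field names are placeholders, not the real autopilot structs; for simplicity it treats any fill as closing the order, which isn't quite right for partially fillable orders):

```rust
use std::collections::HashMap;

// Placeholders for the real domain types.
type OrderUid = [u8; 56];

struct Order {
    uid: OrderUid,
    valid_to: u64,
    cancelled: bool,
}

/// Minimal sketch of an incrementally maintained set of open orders.
struct SolvableOrdersCache {
    orders: HashMap<OrderUid, Order>,
    /// Order creation/cancellation timestamp up to which `orders` is current.
    last_order_timestamp: u64,
    /// Block number up to which trade events have been applied.
    last_trade_block: u64,
}

impl SolvableOrdersCache {
    /// Apply only the deltas since the last refresh instead of rebuilding the
    /// whole set from scratch.
    fn apply_updates(
        &mut self,
        changed_orders: Vec<Order>,
        filled: Vec<OrderUid>,
        now: u64,
        latest_block: u64,
    ) {
        for order in changed_orders {
            if order.cancelled || order.valid_to < now {
                self.orders.remove(&order.uid);
            } else {
                self.orders.insert(order.uid, order);
            }
        }
        for uid in filled {
            self.orders.remove(&uid);
        }
        self.last_order_timestamp = now;
        self.last_trade_block = latest_block;
    }
}
```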

squadgazzz commented 2 weeks ago

Before submitting a PR, I just wanted to clarify the idea.

Upon startup we fetch all orders and trades (separately)

I assume only solvable orders are implied here, since the orders table is about 4 GB. Also, I don't think trades need to be fetched on startup; the last block number used from the trades table could be added to the original query. With all of that, we have a solvable-orders cache; we then fetch new orders and new trades. Based on the new trade data, we remove no-longer-relevant orders from the cache and add new ones. Would that work?
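Roughly the kind of delta queries I have in mind (a sketch assuming sqlx; column names and types are only illustrative, not the real schema):

```rust
use sqlx::PgPool;

/// Hypothetical delta queries: instead of re-running the full OPEN_ORDERS
/// query, only fetch what changed since the last refresh.
async fn fetch_deltas(
    pool: &PgPool,
    last_order_timestamp: i64,
    last_trade_block: i64,
) -> sqlx::Result<(Vec<(Vec<u8>,)>, Vec<(Vec<u8>,)>)> {
    // Orders created or cancelled since the last refresh.
    let changed_orders = sqlx::query_as(
        "SELECT uid FROM orders \
         WHERE creation_timestamp > $1 OR cancellation_timestamp > $1",
    )
    .bind(last_order_timestamp)
    .fetch_all(pool)
    .await?;

    // Fills observed since the last refresh.
    let new_trades = sqlx::query_as("SELECT order_uid FROM trades WHERE block_number > $1")
        .bind(last_trade_block)
        .fetch_all(pool)
        .await?;

    Ok((changed_orders, new_trades))
}
```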

fleupold commented 2 weeks ago

You could potentially even use the trade events that we index in the autopilot directly to update the cache (instead of first writing to and then reading from disk), but I'd be surprised if this is necessary. Generally I think your idea sounds good.