bmoscon / cryptofeed

Cryptocurrency Exchange Websocket Data Feed Handler

Orderbook processing latency keeps increasing (with weird spikes) over time #884

Closed xingyaoww closed 2 years ago

xingyaoww commented 2 years ago

Describe the bug The latency of processing the order book (e.g., getting the first 20 levels of bids and asks) increases as time goes by (especially as the total number of order book levels increases -- similar to #406). The more problematic issue is that the latency spikes every few iterations once the book grows past a certain size (e.g., latency spikes to ~350 ms at around 12,500 depth).

[Screenshot: processing latency over time, growing with book depth and showing periodic spikes]

To isolate whether this is caused by an OS scheduling issue that swaps the process off the CPU, I used isolcpus and taskset (see here) to make sure this is the only process running on that CPU.
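For anyone reproducing this, the pinning can also be done from inside the script. A minimal sketch using only the standard library (Linux-only; CPU 3 is an arbitrary example of a core reserved via isolcpus):

import os

# Restrict this process to a single CPU, similar in effect to taskset.
os.sched_setaffinity(0, {3})  # pid 0 = the calling process
print(f"running on CPUs: {os.sched_getaffinity(0)}")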

Could it be related to memory allocation in the order book's C implementation (as the order book grows)? Note that I'm running this test on a small AWS EC2 instance with limited RAM (~500 MB), and I'm not sure whether that is sufficient for cryptofeed to process the order book (I can also run it on another machine with more memory to double-check).
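One way to test the memory hypothesis directly would be to log major page faults next to the latency measurements. A minimal sketch using the standard resource module (Unix-only); the helper would be called from inside process_book below:

import resource

# Major page faults hit the disk and are a plausible source of ~350 ms
# spikes; ru_majflt is cumulative, so report the delta between calls.
_last_majflt = 0

def log_major_page_faults():
    global _last_majflt
    majflt = resource.getrusage(resource.RUSAGE_SELF).ru_majflt
    print(f"major page faults since last call: {majflt - _last_majflt}")
    _last_majflt = majflt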

Any help on this would be appreciated! I'd be happy to open a PR when possible.

To Reproduce

Redirect stdout to a log file and run the following script for a few hours.

import numpy as np
import time
from cryptofeed import FeedHandler
from cryptofeed.exchanges import BinanceFutures
from cryptofeed.defines import L2_BOOK
from datetime import datetime
from typing import Tuple

def parse_streamed_orderbook_to_numpy(orderbook, n_levels: int) -> Tuple[np.ndarray, np.ndarray, np.ndarray, np.ndarray]:
    # price
    bp = np.empty(n_levels, dtype=np.float64) # bp1 bp2 ...
    ap = np.empty(n_levels, dtype=np.float64) # ap1 ap2 ...
    # volume
    bv = np.empty(n_levels, dtype=np.float64)
    av = np.empty(n_levels, dtype=np.float64)

    for i in range(n_levels):
        bid_price, bid_volume = orderbook.book.bids.index(i)
        ask_price, ask_volume = orderbook.book.asks.index(i)
        bp[i], bv[i] = float(bid_price), float(bid_volume)
        ap[i], av[i] = float(ask_price), float(ask_volume)

    return bp, bv, ap, av

async def process_book(book, receipt_timestamp):

    ts = time.time()
    _ = parse_streamed_orderbook_to_numpy(book, n_levels=20)
    te = time.time()
    time_ms = (te - ts) * 1000
    now = datetime.utcfromtimestamp(te)
    print(f"parse_streamed_orderbook_to_numpy delay: {time_ms:.3f} ms @ {now}")

    cur_timestamp_ms = datetime.now().timestamp() * 1000
    book_timestamp_ms = book.timestamp * 1000
    receipt_timestamp_ms = receipt_timestamp * 1000
    recv_delay = receipt_timestamp_ms - book_timestamp_ms
    tot_delay = cur_timestamp_ms - book_timestamp_ms
    print(
        f"* update book: {book.symbol} @ book time {book_timestamp_ms:.3f} ms "
        f"(recv-delay {recv_delay:.3f} ms, tot-delay {tot_delay:.3f} ms)"
        f"\nbook length: bid {len(book.book.bids)}, ask {len(book.book.asks)}"
    )

if __name__ == "__main__":
    # Initialize feed
    f = FeedHandler()
    f.add_feed(BinanceFutures(
        symbols=["ETH-USDT-PERP"],
        channels=[L2_BOOK],
        callbacks={L2_BOOK: process_book},
        retries=-1
    ))
    f.run()

Then use this script to parse the log file and plot the results:

import seaborn as sns
import matplotlib.pyplot as plt  # needed for plt.show() below
import numpy as np
import parse

log_file = "PATH TO YOUR LOG FILE"
book_levels = []
delay_in_ms = []
book_ts_in_ms = []
with open(log_file) as f:
    for line in f:
        if line.startswith("parse_streamed_orderbook_to_numpy delay: "):
            parsed = parse.parse("parse_streamed_orderbook_to_numpy delay: {} ms @ {}", line.strip())
            delay = float(parsed[0])
            delay_in_ms.append(delay)

        if line.startswith("* update book: "):
            parsed = parse.parse("* update book: {} @ book time {} ms (recv-delay {} ms, tot-delay {} ms)", line.strip())
            # field {1} is the book timestamp; {0} is the symbol, {2}/{3} are the delays
            book_ts_in_ms.append(float(parsed[1]))
        if line.startswith("book length:"):
            parsed = parse.parse("book length: bid {}, ask {}", line.strip())
            n_tot_levels = int(parsed[0]) + int(parsed[1])
            book_levels.append(n_tot_levels)
book_levels = np.array(book_levels)
delay_in_ms = np.array(delay_in_ms)
book_ts_in_ms = np.array(book_ts_in_ms)
# Normalize time
book_ts_in_ms = book_ts_in_ms - book_ts_in_ms[0]

# s is the scatter marker size, passed through to matplotlib
sns.relplot(x="book_levels", y="delay_in_ms", data=dict(book_levels=book_levels, delay_in_ms=delay_in_ms), s=0.2)
plt.show()

Expected behavior Additional cost is expected as the depth grows (hopefully the latency increases more slowly), but those spikes shouldn't be there.

Screenshots See bug description.

Operating System: Linux (Ubuntu 20.04, linux-image-5.15.0-1015-aws)

Cryptofeed Version cryptofeed 2.2.3, installed from PyPI.

xiandong79 commented 2 years ago

Cool work.

Besides, this teaches me that I should use max_depth=10 when initializing a feed, as sketched below.
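For reference, a minimal sketch of that suggestion (max_depth is a standard cryptofeed feed argument; the symbol and channel mirror the reproduction script above):

from cryptofeed import FeedHandler
from cryptofeed.exchanges import BinanceFutures
from cryptofeed.defines import L2_BOOK

async def print_top(book, receipt_timestamp):
    # With max_depth=10 the locally maintained book is capped at 10 levels,
    # so per-callback work no longer grows with the exchange's full depth.
    print(book.symbol, len(book.book.bids), len(book.book.asks))

f = FeedHandler()
f.add_feed(BinanceFutures(
    symbols=["ETH-USDT-PERP"],
    channels=[L2_BOOK],
    callbacks={L2_BOOK: print_top},
    max_depth=10,
))
f.run()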

xingyaoww commented 2 years ago

I ran the same script on a machine with more RAM, and those huge spikes went away. I also increased the RAM of my AWS instance to 1 GB and tried again -- now the latency stays within a reasonable range (< 70 ms spikes for an order book of depth 16,320 + 11,022 = 27,342).

This result tentatively supports the idea that the issue is caused by memory allocation/page faults (though it needs more experiments to actually make that claim).

Also, I noticed a decent performance improvement (almost 10x) when switching from orderbook.book.bids.index(i) to iterating over the first 20 prices from orderbook.book.bids.keys() and then doing orderbook.book.bids[price] to fetch each volume. I'm not sure why: both appear to boil down to Py*_GetItem operations (.index uses 2 GetItems to fetch the key and value separately), and .keys() even has to return a larger slice of keys -- so why is it faster?
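For reference, a minimal sketch of the faster access pattern described above (assuming, as in the issue script, that a book side exposes .keys() in price order and price-keyed lookup):

import numpy as np

def top_levels_via_keys(side, n_levels: int):
    """Top-of-book extraction for one side (e.g. book.book.bids): iterate
    the first n_levels prices from .keys() and look each volume up by price,
    instead of calling .index(i) once per level.
    Assumes the side holds at least n_levels levels, like the original."""
    prices = np.empty(n_levels, dtype=np.float64)
    volumes = np.empty(n_levels, dtype=np.float64)
    for i, price in enumerate(side.keys()):
        if i >= n_levels:
            break
        prices[i] = float(price)
        volumes[i] = float(side[price])  # one price-keyed lookup per level
    return prices, volumes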

xingyaoww commented 2 years ago

This was resolved by increasing the RAM on the server and using the newly implemented .to_list() method (PR here) on the order book.

I now get around 15~20 ms to process the top 20 levels of both bids and asks for an order book of ~20,000 levels.
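For completeness, a minimal sketch of what the .to_list()-based parsing could look like. The exact signature lives in the linked PR; the assumption here is that a side's .to_list() returns its (price, size) pairs in book order:

import numpy as np

def top_levels_via_to_list(side, n_levels: int):
    # Assumption: side.to_list() returns (price, size) pairs sorted in book
    # order, so one call replaces n_levels separate .index(i) lookups.
    levels = side.to_list()[:n_levels]
    arr = np.asarray(levels, dtype=np.float64)  # converts Decimal prices too
    return arr[:, 0], arr[:, 1]  # prices, volumes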