nkaz001 / hftbacktest

A high-frequency trading and market-making backtesting and trading bot in Python and Rust, which accounts for limit orders, queue positions, and latencies, utilizing full tick data for trades and order books, with real-world crypto market-making examples for Binance Futures
MIT License
2.01k stars 395 forks

Data preparation #138

Closed sungjinoh closed 2 months ago

sungjinoh commented 2 months ago

Hi @nkaz001,

I'm new to this framework and I've gone through the documentation and prepared the data, but something is wrong.

For example, when I run the GLFT Market Making example with this data, mid_price_tick and prev_mid_price_tick are always identical, so mid_price_chg is all zeros.

I downloaded the data from here and converted it following the example code. The data was prepared using the code below, with a buffer size of 200_000_000:

buffer_size = 200_000_000
_ = binancefutures.convert(
    "../data/btcusdc_20240803.gz",
    output_filename="../data/btcusdc_20240803.npz",
    buffer_size=buffer_size,
    combined_stream=True,
)
_ = binancefutures.convert(
    "../data/btcusdc_20240804.gz",
    output_filename="../data/btcusdc_20240804.npz",
    buffer_size=buffer_size,
    combined_stream=True,
)
_ = create_last_snapshot(
    ["../data/btcusdc_20240803.npz"],
    tick_size=0.1,
    lot_size=0.001,
    output_snapshot_filename="../data/btcusdc_20240803_eod.npz",
)
_ = create_last_snapshot(
    ["../data/btcusdc_20240804.npz"],
    tick_size=0.1,
    lot_size=0.001,
    output_snapshot_filename="../data/btcusdc_20240804_last.npz",
    initial_snapshot="../data/btcusdc_20240803_eod.npz",
)

For the feed latency data, I referred to the 4_latency.py code. It looked out of date to me, so I modified it as shown below. Is that correct?

        # Original lines in 4_latency.py:
        # order_latency[i].req_timestamp = req_ts
        # order_latency[i].exch_timestamp = order_exch_ts
        # order_latency[i].resp_timestamp = resp_ts
        # My modification:
        order_latency[i].req_ts = req_ts
        order_latency[i].order_exch_ts = order_exch_ts
        order_latency[i].resp_ts = resp_ts

Thank you for your great work.

nkaz001 commented 2 months ago

I fixed it. Thank you.

        order_latency[i].req_ts = req_ts
        order_latency[i].exch_ts = order_exch_ts
        order_latency[i].resp_ts = resp_ts

Regarding the data, could you first verify that the mid price is correct? You can do this quickly by plotting the mid price.
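For readers who prefer a numeric check over a plot, here is a minimal sketch. It assumes the best bid and ask prices have already been extracted into plain Python sequences. The helper name `mid_price_summary` is hypothetical and not part of hftbacktest.

```python
import math


def mid_price_summary(best_bids, best_asks):
    """Summarize mid prices for paired best bid/ask samples.

    Returns (n_valid, n_changes, min_mid, max_mid). Samples where either
    side is non-finite, or where the book is locked or crossed
    (bid >= ask), are skipped.
    """
    mids = []
    for bid, ask in zip(best_bids, best_asks):
        if not (math.isfinite(bid) and math.isfinite(ask)) or bid >= ask:
            continue
        mids.append((bid + ask) / 2.0)

    if not mids:
        return 0, 0, None, None

    # Count how often the mid price differs from the previous valid sample.
    n_changes = sum(1 for prev, cur in zip(mids, mids[1:]) if cur != prev)
    return len(mids), n_changes, min(mids), max(mids)


if __name__ == "__main__":
    bids = [100.0, 100.0, 100.1]
    asks = [100.1, 100.1, 100.2]
    print(mid_price_summary(bids, asks))
```

If n_changes is zero (or close to it) over a full day of data, or min_mid and max_mid are far from the expected price range, the inputs are probably not real book prices.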

sungjinoh commented 2 months ago

@nkaz001

This is the bid/ask price: [image]

I also printed mid_price_tick and prev_mid_price_tick with print(mid_price_tick, prev_mid_price_tick):

-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5
-0.5 -0.5

I also checked arrival_depth and mid_price_chg. Most arrival_depth values are -inf, and np.where(mid_price_chg != 0) shows that only 2 data points are nonzero.

Can you point out what part I missed?

nkaz001 commented 2 months ago

Since the bid and ask prices in your plot look correct, and mid_price_tick = (best_bid + best_ask) / tick_size / 2.0, I can't identify the problem based solely on the information you've provided.
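As a worked example of the formula quoted above, the sketch below evaluates it for one illustrative quote. The prices 60000.0 and 60000.1 are made up for illustration. Only tick_size=0.1 comes from the thread. The function name `mid_price_in_ticks` is hypothetical.

```python
def mid_price_in_ticks(best_bid, best_ask, tick_size):
    """Mid price expressed in ticks: (best_bid + best_ask) / tick_size / 2.0."""
    return (best_bid + best_ask) / tick_size / 2.0


if __name__ == "__main__":
    # Illustrative BTC-like quote with the tick size used in the thread.
    value = mid_price_in_ticks(60000.0, 60000.1, 0.1)
    print(value)  # a large positive number near 600000.5
```

With positive bid and ask prices the result is always positive, so a printed value of -0.5 cannot come from a valid quote. It indicates that the inputs to the formula were not real prices at that point.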

sungjinoh commented 2 months ago

My issue is that mid_price_tick is the same at T and T+1 (mid_price_tick equals prev_mid_price_tick). Since the bid/ask prices are correct, what should I check? Could this be an issue with hbt.elapse?

nkaz001 commented 2 months ago

Didn't you say mid_price_tick was printing -0.5? How can mid_price_tick always be the same as prev_mid_price_tick if best_bid and best_ask are changing correctly? I can't identify the problem based solely on the information you've provided.
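One possible reading of the constant -0.5, offered as a hypothesis and not confirmed anywhere in this thread: if an empty book side is represented by the 64-bit integer extremes (an assumption about the backtester's sentinel values), then averaging the two sentinels in ticks gives exactly -0.5. The sketch below shows only that arithmetic. The names `INT64_MIN`, `INT64_MAX`, `mid_tick`, and `looks_like_empty_book` are hypothetical.

```python
# Assumed sentinel values for "no bid" and "no ask" (not taken from the thread).
INT64_MIN = -(2**63)
INT64_MAX = 2**63 - 1


def mid_tick(best_bid_tick, best_ask_tick):
    """Mid price in ticks from integer best bid/ask ticks."""
    return (best_bid_tick + best_ask_tick) / 2.0


def looks_like_empty_book(best_bid_tick, best_ask_tick):
    """True if either side equals its assumed 'empty' sentinel."""
    return best_bid_tick == INT64_MIN or best_ask_tick == INT64_MAX


if __name__ == "__main__":
    print(mid_tick(INT64_MIN, INT64_MAX))               # -0.5
    print(looks_like_empty_book(INT64_MIN, INT64_MAX))  # True
    print(looks_like_empty_book(600000, 600001))        # False
```

If this assumption holds, a mid_price_tick that stays at -0.5 would mean the backtest never sees any depth data, so the place to look would be how the converted files are loaded, not the strategy loop.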