nkaz001 / hftbacktest

A high-frequency trading and market-making backtesting tool in Python and Rust that accounts for limit orders, queue positions, and latencies, utilizing full tick data for trades and order books, with real-world crypto market-making examples for Binance Futures.
MIT License
1.78k stars · 357 forks

Demo of arbitrage of bookTicker data across exchanges #126

Closed · ffssea closed this 3 weeks ago

ffssea commented 3 weeks ago

Thank you so much for your amazing work. We are very interested in arbitraging bookTicker data across exchanges using the Rust version of the code. Could you give us some runnable examples?

nkaz001 commented 3 weeks ago

Currently, HftBacktest requires both L2 market depth and trade feeds for backtesting, even though some strategies might achieve sufficiently realistic results using only L1 market depth and trade feeds. There is an ongoing development branch that employs data fusion techniques to integrate multiple depth data sources, including L1, to provide the most up-to-date and finest-grained information. However, this feature is still a work in progress and remains unstable. For now, it is recommended to conduct backtesting using L2 market depth and trade feeds. Alternatively, you can generate L2 market depth data artificially from L1, as sketched below. For multi-asset backtesting, please see #116.
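
To illustrate the "generate L2 artificially from L1" route, here is a minimal, hedged sketch in Python. It assumes bookTicker rows of the form (exch_ts, local_ts, bid_px, bid_qty, ask_px, ask_qty) and reuses the eight-field event layout from the conversion script later in this thread; the two event-flag constants are placeholders, and the real bit flags should be taken from the hftbacktest documentation.

import numpy as np

# Placeholder event flags (assumption): substitute the actual depth/buy/sell
# and exchange/local flag combinations from the hftbacktest docs.
EV_BID_DEPTH = 0
EV_ASK_DEPTH = 1

# Eight-field event layout, matching the conversion script below.
event_dtype = np.dtype([
    ('ev', 'u8'), ('exch_ts', 'i8'), ('local_ts', 'i8'),
    ('px', 'f8'), ('qty', 'f8'), ('order_id', 'u8'),
    ('ival', 'i8'), ('fval', 'f8'),
])

def l1_to_depth_events(rows):
    """Turn bookTicker rows (exch_ts, local_ts, bid_px, bid_qty, ask_px, ask_qty)
    into two artificial depth-update events per row: one bid, one ask."""
    out = np.zeros(len(rows) * 2, dtype=event_dtype)
    for i, (exch_ts, local_ts, bid_px, bid_qty, ask_px, ask_qty) in enumerate(rows):
        out[2 * i] = (EV_BID_DEPTH, exch_ts, local_ts, bid_px, bid_qty, 0, 0, 0.0)
        out[2 * i + 1] = (EV_ASK_DEPTH, exch_ts, local_ts, ask_px, ask_qty, 0, 0, 0.0)
    return out

# Each L1 update becomes a pair of best-bid/best-ask depth updates: the
# coarsest possible book, but enough for strategies that only consult the top.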

ffssea commented 3 weeks ago

Why does this error occur when I use depth data for grid trading backtesting? (Error screenshots attached.)

A screenshot of the data is attached.

We compressed the following CSV file into an npz: ref_BTC_USDT-20240821.csv

nkaz001 commented 3 weeks ago

The issue you raised is that when elapse is invoked, it immediately jumps to the end of the data (the i64::MAX timestamp), correct? I cannot reproduce this problem on my end. Which version are you using, and how are you inputting the feed data?

Please see the attached file, which I saved as an npz file. You will need to decompress it once. BTCUSDT_20240821.npz.gz
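
For anyone following along, a one-time decompression step (standard library only, assuming the file sits in the working directory) could look like this:

import gzip
import shutil

# Decompress BTCUSDT_20240821.npz.gz to BTCUSDT_20240821.npz once;
# the backtester is then pointed at the plain .npz file.
with gzip.open('BTCUSDT_20240821.npz.gz', 'rb') as f_in, \
        open('BTCUSDT_20240821.npz', 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)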

ffssea commented 3 weeks ago

Thanks for your reply. When using the BTCUSDT_20240821.npz.gz you provided, the problem still exists. Is there an error in my data collection phase? Or is there a bug in the data conversion phase? Here is the code I used, modified from your example. Looking forward to your reply. hft_demo1.zip

nkaz001 commented 3 weeks ago

Which version are you using? I have no problem with the attached npz file.

ffssea commented 3 weeks ago

Sorry, that was an oversight on my part. Here is the version information I used:

[dependencies]
hftbacktest = "0.2.1"
tracing-subscriber = "0.3.18"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"

nkaz001 commented 3 weeks ago

Could you try again using the latest version 0.3.2? There have been many changes, including updates to the data format.
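
Assuming the rest of the dependency list from the earlier comment stays the same, the upgrade would just be a version bump in Cargo.toml:

[dependencies]
hftbacktest = "0.3.2"
tracing-subscriber = "0.3.18"
serde = { version = "1.0", features = ["derive"] }
serde_json = "1.0"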

ffssea commented 3 weeks ago

Now the example runs correctly. I would be very grateful if you could share your data processing code.

nkaz001 commented 3 weeks ago

Sure.

import polars as pl
import numpy as np

# Read the headerless CSV and map its columns to the event fields:
# ev (event flag), exch_ts (exchange timestamp), local_ts (local timestamp),
# px (price), qty (quantity), order_id, and the auxiliary ival/fval fields
# (both taken from column_7 here).
df = (
    pl.read_csv('ref_BTC_USDT-20240821.csv', has_header=False)
    .with_columns(
        pl.col('column_1').cast(pl.UInt64).alias('ev'),
        pl.col('column_2').alias('exch_ts'),
        pl.col('column_3').alias('local_ts'),
        pl.col('column_4').alias('px'),
        pl.col('column_5').alias('qty'),
        pl.col('column_6').cast(pl.UInt64).alias('order_id'),
        pl.col('column_7').alias('ival'),
        pl.col('column_7').cast(pl.Float64).alias('fval')
    )
    .select(['ev', 'exch_ts', 'local_ts', 'px', 'qty', 'order_id', 'ival', 'fval'])
)

# Save the frame as a structured NumPy array under the key 'data'.
np.savez_compressed('BTCUSDT_20240821.npz', data=df.to_numpy(structured=True))
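
As an optional sanity check (not part of the script above), the output can be reloaded to confirm the field names and a few rows; the 'data' key matches the savez_compressed call:

npz = np.load('BTCUSDT_20240821.npz')
events = npz['data']
# Expect ('ev', 'exch_ts', 'local_ts', 'px', 'qty', 'order_id', 'ival', 'fval')
print(events.dtype.names)
print(events[:3])
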
ffssea commented 3 weeks ago

Thanks for your help, this issue has been resolved. Looking forward to more communication.