Thank you for your PR! Were you able to identify the issue with the Binance Futures data on Tardis.dev? AFAIK, the snapshot of Binance Futures only has a depth of 1000. https://binance-docs.github.io/apidocs/futures/en/#order-book
Could you please verify how many levels are present in the snapshot data?
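For reference, a quick sketch to confirm the REST snapshot cap against the linked docs (the endpoint and its limit parameter are as documented; 1000 is the maximum depth):

import requests

# Binance USD-M Futures order book snapshot; limit=1000 is the documented maximum
resp = requests.get(
    "https://fapi.binance.com/fapi/v1/depth",
    params={"symbol": "BTCUSDT", "limit": 1000},
)
book = resp.json()
print(len(book["bids"]), len(book["asks"]))  # each at most 1000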
It seems that the data from Tardis.dev is not what we expected.
I created the DataFrame from the npz file produced by the modified tardis.convert:
import numpy as np
import pandas as pd
# assumed import path for hftbacktest's snapshot utility
from hftbacktest.data.utils.snapshot import create_last_snapshot

btcusdt = np.load("notebook/btcusdt_20230601.npz")["data"]
ss = create_last_snapshot(btcusdt, tick_size=0.1, lot_size=0.001)
df = pd.DataFrame(ss, columns=["event", "exch_timestamp", "local_timestamp", "side", "price", "qty"])
bid_df = df[df["side"] == 1]
bid_df.nunique()
The result is:
event 1
exch_timestamp 1
local_timestamp 1
side 1
price 25382
qty 2927
dtype: int64
The snapshot contains 25382 distinct bid price levels.
Also, bid_df looks like this:
event exch_timestamp local_timestamp side price qty
0 4.0 1.685664e+15 -1.0 1.0 26805.2 24.294
1 4.0 1.685664e+15 -1.0 1.0 26805.1 2.142
2 4.0 1.685664e+15 -1.0 1.0 26805.0 0.102
3 4.0 1.685664e+15 -1.0 1.0 26804.9 5.121
4 4.0 1.685664e+15 -1.0 1.0 26804.8 1.726
... ... ... ... ... ... ...
25377 4.0 1.685664e+15 -1.0 1.0 600.0 1.730
25378 4.0 1.685664e+15 -1.0 1.0 560.0 63.310
25379 4.0 1.685664e+15 -1.0 1.0 557.0 7.017
25380 4.0 1.685664e+15 -1.0 1.0 556.9 0.211
25381 4.0 1.685664e+15 -1.0 1.0 556.8 0.575
This contains implausibly small prices. I don't fully understand the implementation, but it seems the incremental L2 data is not capped like the REST snapshot and carries much deeper book data than the REST API; a direct cross-check against the raw CSV is sketched below.
Also, Tardis.dev carries data from exchanges other than Binance, so hard-coding the snapshot depth may cause nasty problems anyway.
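A sketch of that cross-check, assuming Tardis.dev's documented incremental_book_L2 CSV layout with an is_snapshot column (the file name is illustrative):

import pandas as pd

# Tardis.dev incremental_book_L2 files flag the initial full-book rows
# with an is_snapshot column; count distinct price levels per side.
df = pd.read_csv("binance-futures_incremental_book_L2_2023-06-01_BTCUSDT.csv.gz")
snap = df[df["is_snapshot"].astype(str).str.lower() == "true"]
print(snap.groupby("side")["price"].nunique())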
Thank you for looking into this. It appears that they are restoring the complete order book snapshot from another source, possibly a redundant server.
Problem
The buffer size for the order book snapshot in tardis.py was hard-coded and too small to handle recent data. The issue was discovered while processing data from 2023-06-01. Here's a code snippet that reproduces the issue:
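A minimal reproduction sketch, assuming hftbacktest's tardis module (the file names and the output_filename argument are illustrative):

from hftbacktest.data.utils import tardis

# Illustrative Tardis.dev daily files for 2023-06-01; any day whose embedded
# book snapshot exceeds the old hard-coded buffer triggers the failure.
data = tardis.convert(
    [
        "binance-futures_trades_2023-06-01_BTCUSDT.csv.gz",
        "binance-futures_incremental_book_L2_2023-06-01_BTCUSDT.csv.gz",
    ],
    output_filename="notebook/btcusdt_20230601.npz",
)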
Running this code fails because the snapshot rows overflow the hard-coded buffer.
Solution
This PR introduces an additional parameter, ss_buffer_size, to the tardis.convert function so that users can set a custom snapshot buffer size, allowing the function to handle larger data sets. The modified code snippet is:
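A sketch of the corrected call (the buffer value and file names are illustrative):

from hftbacktest.data.utils import tardis

# ss_buffer_size sets the snapshot buffer row count; 1_000_000 is an
# illustrative value well above the ~25k levels observed above.
data = tardis.convert(
    [
        "binance-futures_trades_2023-06-01_BTCUSDT.csv.gz",
        "binance-futures_incremental_book_L2_2023-06-01_BTCUSDT.csv.gz",
    ],
    output_filename="notebook/btcusdt_20230601.npz",
    ss_buffer_size=1_000_000,
)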
Changes
Added a new parameter ss_buffer_size to the function tardis.convert.
Testing
The code has been tested with recent larger data sets and confirmed to work as expected.