phil8192 / ob-analytics

R package intended for visualisation, analysis and reconstruction of limit order book data

Bitstamp's Live Orders API is not reliable for identification of trades #28

Open petr-fedorov opened 6 years ago

petr-fedorov commented 6 years ago

Bitstamp provides APIs for Live Orders and Live Ticker (i.e. trades), as described here. Sometimes the Live Orders API outputs incorrect data, as shown below.

Here is the trade record received from the Live Ticker API: [screenshot: 2018-10-14 12-52-22] The record contains the price and amount of the trade as well as the buy and sell order IDs.

Here are the records from the Live Orders API for the orders that participated in the above trade: [screenshot: 2018-10-14 12-53-03]

(both screenshots are taken from a PostgreSQL database where the data received from the APIs are saved)

Row 6 is correct: it corresponds to sell order 2269748432, its 'fill' column equals the trade amount, the timing is ok, etc.

Row 5, which corresponds to the buy order, is incorrect. It reports a cancellation of the buy order, because the 'amount' column did not change compared to the previous event for that order. That is equivalent to a zero 'fill', i.e. no trade.

Thus the processData() function should probably handle the data from both APIs together to produce reliable results.
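To make the cross-check concrete, here is a rough sketch in Python (it is not obAnalytics code, which is R). It derives each order's total fill from the change in 'amount' between consecutive Live Orders events and compares it with the amount the Live Ticker reports for that order. The field names (`order_id`, `amount`, `buy_order_id`, `sell_order_id`), the buy order id 111 and the amounts are assumptions made up for illustration; only the sell order id comes from the example above.

```python
from collections import defaultdict
from decimal import Decimal


def fills_by_order(order_events):
    """Sum the fills per order, where a fill is the drop in 'amount'
    between two consecutive events of the same order.
    order_events must be sorted by time (assumed layout, not the real schema)."""
    last_amount = {}
    fills = defaultdict(Decimal)  # Decimal() == Decimal('0')
    for ev in order_events:
        oid = ev["order_id"]
        amount = Decimal(ev["amount"])
        if oid in last_amount:
            fills[oid] += last_amount[oid] - amount
        else:
            fills[oid] += Decimal(0)  # first event of an order: nothing filled yet
        last_amount[oid] = amount
    return fills


def traded_by_order(trades):
    """Sum the traded amount per order id as reported by the trade stream."""
    traded = defaultdict(Decimal)
    for t in trades:
        for key in ("buy_order_id", "sell_order_id"):
            traded[t[key]] += Decimal(t["amount"])
    return traded


def under_reported_orders(order_events, trades):
    """Order ids whose fills seen in the order stream are smaller than
    the amount the trade stream says they traded."""
    fills = fills_by_order(order_events)
    traded = traded_by_order(trades)
    return sorted(
        oid for oid, amount in traded.items()
        if fills.get(oid, Decimal(0)) < amount
    )


# Made-up data shaped like the case above: the sell order shows a fill,
# the buy order is deleted with an unchanged amount (zero fill).
example_events = [
    {"order_id": 111, "event": "order_created", "amount": "0.5"},
    {"order_id": 2269748432, "event": "order_created", "amount": "0.5"},
    {"order_id": 111, "event": "order_deleted", "amount": "0.5"},
    {"order_id": 2269748432, "event": "order_deleted", "amount": "0.0"},
]
example_trades = [
    {"buy_order_id": 111, "sell_order_id": 2269748432, "amount": "0.5"},
]

print(under_reported_orders(example_events, example_trades))
```

Orders flagged this way are candidates for having their fills taken from the trade record instead of the order events.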

phil8192 commented 6 years ago

hi Petr, thanks for pointing this out.

I had actually asked Bitstamp to add maker/taker IDs to the Live Ticker API, which they added later on. There are inconsistencies between the two APIs (sometimes trades do not show on the ticker and sometimes there are missing order events). I had to make lots of inferences to derive the trades from the order book data alone. I'd like to refactor the code to take these different options into account, perhaps using a combination of the two.

petr-fedorov commented 6 years ago

Yes, Phil, you are right. Here is an example where the 'live_orders' channel reports three trades which never appear in the 'live_trades' interface:

[screenshot: 2018-10-17 19-59-11]

Nevertheless, I believe we should aim for perfection here and try to reconcile all the data available.

Do you have a typical 'use case' for obAnalytics in mind? Have you received any feedback from users of the package?

phil8192 commented 6 years ago

I totally agree about perfection. I will integrate both data sources.

The main use case is offline LOB data exploration and "market surveillance/forensics". obAnalytics, as it is now, evolved from some visualisation code I wrote to better understand some experimental indicators I was working on at the time. Ideally, I'd like to:

can start a wiki or add issues labelled as enhancement to develop ideas. thanks Petr.

roderickObrist commented 5 years ago

Hi Guys, I know I'm a bit late to the party, but I've done a fair amount of work regarding inference of Bitstamp events based on the two APIs (live_orders and trades).

The misordering of data and skipping of events was due to their use of Pusher to distribute data.

Now they have their own servers, which has made working with the data much easier.

However, there are still "unexplained drops", which look like trades, but no trade gets broadcast in the trade stream.

These unexplained drops always have a matching order on the other side that would make a perfect trade. I just don't know why Bitstamp does not broadcast the trade.

My only guess would be self-trade prevention (STP); however, that would mean something like 2/3 of the trade matches are STP.

Have you guys got any insights on this behaviour?
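For anyone who wants to look for these "unexplained drops" in their own data, a minimal Python sketch follows. It assumes time-sorted order events with hypothetical fields `order_id`, `side`, `price`, `amount` and `timestamp` (seconds), and trades carrying `buy_order_id` / `sell_order_id`. The pairing rule (equal fill, crossing prices, within one second) is an assumption for illustration, not a description of how Bitstamp matches orders.

```python
from decimal import Decimal


def find_unexplained_drops(order_events, trades):
    """Amount decreases on orders that never appear in the trade stream."""
    traded_ids = {t[k] for t in trades for k in ("buy_order_id", "sell_order_id")}
    last_amount = {}
    drops = []
    for ev in order_events:  # must be sorted by time
        oid = ev["order_id"]
        amount = Decimal(ev["amount"])
        prev = last_amount.get(oid)
        last_amount[oid] = amount
        if prev is not None and amount < prev and oid not in traded_ids:
            drops.append({
                "order_id": oid,
                "side": ev["side"],
                "price": Decimal(ev["price"]),
                "fill": prev - amount,
                "timestamp": ev["timestamp"],
            })
    return drops


def pair_drops(drops, max_gap=1.0):
    """Greedily pair buy and sell drops that would make a 'perfect trade':
    same fill, buy price >= sell price, at most max_gap seconds apart."""
    buys = [d for d in drops if d["side"] == "buy"]
    sells = [d for d in drops if d["side"] == "sell"]
    used = set()
    pairs = []
    for b in buys:
        for i, s in enumerate(sells):
            if i in used:
                continue
            if (b["fill"] == s["fill"]
                    and b["price"] >= s["price"]
                    and abs(b["timestamp"] - s["timestamp"]) <= max_gap):
                pairs.append((b["order_id"], s["order_id"]))
                used.add(i)
                break
    return pairs


# Made-up events: orders 1 and 2 drop by 0.6 with no broadcast trade,
# order 3 drops by 0.5 and does appear in the trade stream.
demo_events = [
    {"order_id": 3, "side": "sell", "price": "101", "amount": "2.0", "timestamp": 5.0},
    {"order_id": 1, "side": "buy", "price": "100", "amount": "1.0", "timestamp": 10.0},
    {"order_id": 2, "side": "sell", "price": "99", "amount": "0.6", "timestamp": 10.5},
    {"order_id": 1, "side": "buy", "price": "100", "amount": "0.4", "timestamp": 20.0},
    {"order_id": 3, "side": "sell", "price": "101", "amount": "1.5", "timestamp": 20.0},
    {"order_id": 2, "side": "sell", "price": "99", "amount": "0.0", "timestamp": 20.1},
]
demo_trades = [{"buy_order_id": 4, "sell_order_id": 3, "amount": "0.5"}]

demo_drops = find_unexplained_drops(demo_events, demo_trades)
print(pair_drops(demo_drops))
```

Counting how many drops end up paired versus unpaired gives a quick feel for how many of them look like suppressed trades.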

phil8192 commented 5 years ago

Hi Roderick, I've not yet checked out the new Websocket API. The Pusher version is a real pain as you've noticed.

Have you seen any reported trades not in the live_orders stream? It used to be the case that both had missing data.

I'll take a look at the new API and find some examples of this happening.

petr-fedorov commented 5 years ago

@roderickObrist @phil8192

I have the following statistics covering the period from 2019/01/27 to 2019/04/23 (with some minor interruptions):

| Pair | # of trades inferred | live_trades & live_orders | live_orders only | live_trades only |
|--------|-----------|--------|--------|-------|
| BTCUSD | 1,874,617 | 55.17% | 44.47% | 0.35% |
| XRPUSD | 640,698 | 87.43% | 11.94% | 0.63% |
| ETHUSD | 622,859 | 71.39% | 28.26% | 0.35% |
| BTCEUR | 428,198 | 99.02% | 0.50% | 0.48% |
| LTCUSD | 194,340 | 91.58% | 7.92% | 0.50% |
| BCHUSD | 170,218 | 95.38% | 4.29% | 0.33% |

Since my matching algorithm is rather complicated, I would say that the few 'live_trades only' trades are due to deficiencies in the algorithm, i.e. there are actually no trades in live_trades which were not reported in live_orders.
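For reference, the three shares in the table can be computed along the following lines. This is a deliberately naive Python sketch, not the matching algorithm mentioned above: it assumes each trade, whether inferred from live_orders or received from live_trades, has already been reduced to a comparable key, here a hypothetical `(buy_order_id, sell_order_id, amount)` tuple.

```python
from collections import Counter


def source_shares(orders_trades, ticker_trades):
    """Percentage of trades seen in both streams, only in live_orders,
    and only in live_trades. Inputs are lists of hashable trade keys."""
    from_orders = Counter(orders_trades)
    from_ticker = Counter(ticker_trades)
    both = sum((from_orders & from_ticker).values())         # multiset intersection
    orders_only = sum((from_orders - from_ticker).values())  # left over in live_orders
    ticker_only = sum((from_ticker - from_orders).values())  # left over in live_trades
    total = both + orders_only + ticker_only
    if total == 0:
        return {"both": 0.0, "live_orders only": 0.0, "live_trades only": 0.0}
    return {
        "both": round(100 * both / total, 2),
        "live_orders only": round(100 * orders_only / total, 2),
        "live_trades only": round(100 * ticker_only / total, 2),
    }


# Made-up keys: (buy_order_id, sell_order_id, amount)
inferred = [(1, 2, "0.5"), (3, 4, "1.0"), (5, 6, "0.2")]
reported = [(1, 2, "0.5"), (7, 8, "0.3")]
shares = source_shares(inferred, reported)
print(shares)
```

Using a `Counter` rather than a `set` keeps repeated identical keys (e.g. several partial fills of the same size between the same two orders) from collapsing into one.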

While this presentation argues that "the vast majority of reported bitcoin trading volume is either fake volume or represents non-economic wash trading", Bitstamp's currently reported 24h trade volume for BTCUSD on CoinMarketCap, for example, is very close to the 'live_trades & live_orders' figure. This fact supports the STP hypothesis, in my opinion.

roderickObrist commented 5 years ago

Hi Guys,

So I contacted Bitstamp for some clarification, and they were surprisingly responsive.

https://www.reddit.com/r/Bitstamp/comments/bbvut2/bitstamp_api_behaviour/

They also say that it's STP.

My only issue is: GDAX broadcasts STP events as "change". You see 1-5 change events per day, and GDAX has about 5 times the traffic that Bitstamp does.

On Bitstamp, about 20% of the change events I get are STP (so, depending on the day, up to 20,000 STP events).

I'll take Bitstamp's word for it; I think they are a very trustworthy exchange.

But there's something I'm not seeing. Perhaps there is a bot that is intentionally trading with itself very often on Bitstamp.