danpaquin / coinbasepro-python

The unofficial Python client for the Coinbase Pro API
MIT License

Exchange Time Update Websocket Inconsistency #295

Closed guanm326 closed 5 years ago

guanm326 commented 6 years ago

I've been collecting data for 3 streaming pairs and noticed that some of the exchange timestamps sent for certain pairs are minutes in the past.

2018-05-21 17:26:40.429000,ETH-BTC,5,2.96533299,0.0829,0.08291,53.2628,19,,,
2018-05-21 17:26:41.478000,ETH-BTC,5,2.96533299,0.0829,0.08291,53.1928,18,,,
2018-05-21 17:26:41.558000,ETH-BTC,5,2.96533299,0.0829,0.08291,53.21280000,19,,,
2018-05-21 17:26:41.673000,ETH-BTC,5,2.96533299,0.0829,0.08291,53.22280000,20,,,
2018-05-21 17:26:41.685000,ETH-BTC,5,2.96533299,0.0829,0.08291,54.29280000,21,,,
2018-05-21 17:26:43.538000,ETH-BTC,5,2.96533299,0.0829,0.08291,53.22280000,20,,,
2018-05-21 17:26:43.712000,ETH-BTC,5,2.96533299,0.0829,0.08291,54.29280000,21,,,
2018-05-21 17:26:32.753000,BTC-USD,4,3.926,8395.26,8395.27,1.884027,4,,,

Similar to the examples, I have a separate thread that fills a queue with new book/trade updates. My main thread then pulls from the other end and writes to storage. The ordering of the data in the queue should be consistent on my end, meaning the data should be chronological at least from the time I receive it until I process it. My question is: why do the last 2 updates in the above example differ in time from the exchange's point of view? Is anyone else seeing this?
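To make the setup above concrete, here is a minimal sketch of the producer/consumer pattern with a local receive time stamped on each message, so the lag between the exchange timestamp and the time of receipt can be logged per pair. It is not the code from this issue: the `producer` and `consume` helpers and the sample messages are hypothetical stand-ins for the websocket thread, and the `product_id` field name is an assumption about the feed's message layout. The two sample timestamps are taken from the last two rows of the data above.

```python
import queue
import threading
from datetime import datetime, timezone

# Format of the feed's "time" field, as parsed elsewhere in this thread.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def parse_exchange_time(message):
    """Return the message's exchange timestamp as an aware UTC datetime."""
    parsed = datetime.strptime(message["time"], TIME_FORMAT)
    return parsed.replace(tzinfo=timezone.utc)


def producer(messages, out_queue):
    """Hypothetical stand-in for the websocket thread.

    Stamps each message with the local receive time before queueing it.
    """
    for message in messages:
        out_queue.put((datetime.now(timezone.utc), message))
    out_queue.put(None)  # sentinel: no more messages


def consume(in_queue):
    """Drain the queue and return (product_id, lag_seconds) rows."""
    rows = []
    while True:
        item = in_queue.get()
        if item is None:
            break
        received_at, message = item
        lag = (received_at - parse_exchange_time(message)).total_seconds()
        rows.append((message["product_id"], lag))
    return rows


# Sample messages built from the last two rows of the data above.
sample = [
    {"product_id": "ETH-BTC", "time": "2018-05-21T17:26:43.712000Z"},
    {"product_id": "BTC-USD", "time": "2018-05-21T17:26:32.753000Z"},
]

q = queue.Queue()
worker = threading.Thread(target=producer, args=(sample, q))
worker.start()
rows = consume(q)
worker.join()

for product_id, lag in rows:
    print(product_id, round(lag, 3))
```

Logging the lag next to each row makes it easier to tell whether a pair's exchange timestamps are falling behind or whether the local queue is the bottleneck. In this sample the BTC-USD lag comes out about 11 seconds larger than the ETH-BTC lag, even though both messages were queued at nearly the same moment.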

SorenJ89 commented 6 years ago

I have edited my order_book.py example to track this. I save the epoch time of each message when I receive it, and if a message is older than the previous one, I print the difference.

So far I have been running the script for 24+ hours and have not caught a single message out of order.

if __name__ == '__main__':
    import sys
    import time
    import datetime as dt
    from datetime import datetime

    class OrderBookConsole(OrderBook):
        ''' Prints the time gap whenever a message is older than the previous one '''

        def __init__(self, product_id=None):
            super(OrderBookConsole, self).__init__(product_id=product_id)

            # latest values of bid-ask spread
            self._bid = None
            self._ask = None
            self._bid_depth = None
            self._ask_depth = None
            # epoch time (in seconds) of the most recent message
            self._msg_time = 0

        def on_message(self, message):
            super(OrderBookConsole, self).on_message(message)
            # Calculate newest bid-ask spread
            bid = self.get_bid()
            bids = self.get_bids(bid)
            bid_depth = sum([b['size'] for b in bids])
            ask = self.get_ask()
            asks = self.get_asks(ask)
            ask_depth = sum([a['size'] for a in asks])

            if 'time' in message:
                # Convert the exchange timestamp to seconds since the epoch
                epoch_time = (datetime.strptime(message['time'], "%Y-%m-%dT%H:%M:%S.%fZ") - datetime(1970, 1, 1)).total_seconds()
                # Print the gap in seconds if this message is older than the previous one
                if self._msg_time - epoch_time > 0:
                    print(self._msg_time - epoch_time)
                # Remember this timestamp for the next comparison
                self._msg_time = epoch_time
            else:
                # Messages without a timestamp are printed as-is
                print(message)

            #if self._bid == bid and self._ask == ask and self._bid_depth == bid_depth and self._ask_depth == ask_depth:
                # If there are no changes to the bid-ask spread since the last update, no need to print
                #pass
            #else:
                # If there are differences, update the cache
                #self._bid = bid
                #self._ask = ask
                #self._bid_depth = bid_depth
                #self._ask_depth = ask_depth
                #print('{} {} bid: {:.3f} @ {:.2f}\task: {:.3f} @ {:.2f}'.format(
                #    dt.datetime.now(), self.product_id, bid_depth, bid, ask_depth, ask))

    order_book = OrderBookConsole()
    order_book.start()
    try:
        while True:
            time.sleep(10)
    except KeyboardInterrupt:
        order_book.close()

    if order_book.error:
        sys.exit(1)
    else:
        sys.exit(0)

SorenJ89 commented 6 years ago

Ahh... looking at your data again, I see that the individual books are in order, but BTC-USD is delayed compared to ETH-BTC. Is that the real problem?

guanm326 commented 6 years ago

I have since switched to a different channel (level2) and am capturing data for a single symbol.

If you need to analyze multiple pairs, it would be nice to know that nothing is out of sync. There are times when BTC-USD is ahead or behind by minutes, but I do see updates on the GDAX website.

On a related note, have you ever had situations where a burst of messages arrives and the updates lag behind?
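For analysis of recorded data, one way to work around pairs arriving out of step is to merge the per-pair streams by exchange timestamp afterwards. The sketch below assumes each pair's stream is already in chronological order on its own (as observed earlier in this thread) and that the data has been fully collected; `merge_by_exchange_time` is a hypothetical helper, and the `product_id` field name is an assumption about the message layout. The timestamps are taken from the sample rows in the first comment.

```python
import heapq
from datetime import datetime

# Format of the feed's "time" field, as parsed elsewhere in this thread.
TIME_FORMAT = "%Y-%m-%dT%H:%M:%S.%fZ"


def exchange_time(message):
    """Parse the exchange timestamp of a message into a datetime."""
    return datetime.strptime(message["time"], TIME_FORMAT)


def merge_by_exchange_time(*streams):
    """Lazily merge per-pair streams into one exchange-time-ordered stream.

    Each input stream must already be sorted by exchange time.
    """
    return heapq.merge(*streams, key=exchange_time)


eth_btc = [
    {"product_id": "ETH-BTC", "time": "2018-05-21T17:26:40.429000Z"},
    {"product_id": "ETH-BTC", "time": "2018-05-21T17:26:41.478000Z"},
    {"product_id": "ETH-BTC", "time": "2018-05-21T17:26:43.712000Z"},
]
btc_usd = [
    {"product_id": "BTC-USD", "time": "2018-05-21T17:26:32.753000Z"},
]

merged = list(merge_by_exchange_time(eth_btc, btc_usd))
for message in merged:
    print(message["time"], message["product_id"])
```

This only reorders data after the fact. It does not help a live consumer, which cannot know whether a lagging pair still has older messages on the way.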

SorenJ89 commented 6 years ago

So I changed my code and added a method:

def get_last_msg_time(self):
    return self._msg_time_gdax

Now I run a separate thread for each pair and print the time of the last message for each. Here is the result:

BCH-BTC: 1527447296 LTC-BTC: 1527447267 ETH-BTC: 1527447296 BTC-EUR: 1527447255
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447296 BTC-EUR: 1527447255
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447255
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447297 LTC-BTC: 1527447267 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447297 LTC-BTC: 1527447268 ETH-BTC: 1527447297 BTC-EUR: 1527447256
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447297 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447268 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447298 LTC-BTC: 1527447269 ETH-BTC: 1527447298 BTC-EUR: 1527447257
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447257
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447257
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447299 LTC-BTC: 1527447269 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447299 LTC-BTC: 1527447270 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447300 LTC-BTC: 1527447270 ETH-BTC: 1527447299 BTC-EUR: 1527447258
BCH-BTC: 1527447300 LTC-BTC: 1527447270 ETH-BTC: 1527447300 BTC-EUR: 1527447258
BCH-BTC: 1527447300 LTC-BTC: 1527447270 ETH-BTC: 1527447300 BTC-EUR: 1527447258
BCH-BTC: 1527447300 LTC-BTC: 1527447270 ETH-BTC: 1527447300 BTC-EUR: 1527447258
BCH-BTC: 1527447300 LTC-BTC: 1527447270 ETH-BTC: 1527447300 BTC-EUR: 1527447259
BCH-BTC: 1527447300 LTC-BTC: 1527447271 ETH-BTC: 1527447300 BTC-EUR: 1527447259

So I do indeed also see that some pairs are behind others in their timestamps: here LTC-BTC and BTC-EUR trail BCH-BTC and ETH-BTC by roughly 30 and 40 seconds. I wonder why? Are the different pairs maintained by different machines/servers whose clocks are not synced?
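The gap can be quantified directly from the printed values. This is a small sketch, with `lag_behind_newest` as a hypothetical helper name, that takes one snapshot of last-message epoch times (the final row of the output above) and reports how many seconds each pair trails the most recent one.

```python
def lag_behind_newest(last_times):
    """Return, per pair, the seconds by which it trails the newest timestamp."""
    newest = max(last_times.values())
    return {pair: newest - ts for pair, ts in last_times.items()}


# Final row of the output above (epoch seconds of each pair's last message).
snapshot = {
    "BCH-BTC": 1527447300,
    "LTC-BTC": 1527447271,
    "ETH-BTC": 1527447300,
    "BTC-EUR": 1527447259,
}

lags = lag_behind_newest(snapshot)
print(lags)
# {'BCH-BTC': 0, 'LTC-BTC': 29, 'ETH-BTC': 0, 'BTC-EUR': 41}
```

Applying the same calculation to the first row of the output gives the same 29 and 41 second gaps, so the offsets stayed constant over this short sample.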

SorenJ89 commented 6 years ago

Found this thread, which seems to confirm it: https://www.reddit.com/r/GDAX/comments/8lq0c2/websocket_timestamp_issues/