man-group / arctic

High performance datastore for time series and tick data
https://arctic.readthedocs.io/en/latest/
GNU Lesser General Public License v2.1

check updates for TickStore for real-time minute bar #840

Open darkknight9394 opened 4 years ago

darkknight9394 commented 4 years ago

Arctic Version

1.79.3

Arctic Store

TickStore

Platform and version

Windows 10 PyCharm

Description of problem and/or code sample that reproduces the issue

I'm working on streaming real-time minute bar data into an Arctic database for 500 symbols using TickStore. (I chose TickStore because I read about its real-time capabilities.) My question is how to check the database for updates continuously so that the trading algorithm can act on the latest bar.

Currently, I am using a very ad-hoc way of looking for updates, shown below (query_single_ticker.py). The intention is to keep reading the database and only print when the last row's datetime index differs from the one previously recorded. While this works fine for a single ticker, running it for 20+ symbols (one process per symbol, launched from a bash script) seems to kill the performance. I am seeking your suggestions on how to read real-time data whenever there is an update.

    import sys
    from datetime import datetime as dt, timedelta

    from arctic import Arctic
    from arctic.date import DateRange, mktz

    # Assumption: `store` is not defined in the posted snippet; the host is a placeholder.
    store = Arctic('localhost')

    if __name__ == "__main__":
        ticker = sys.argv[1]
        previous_last_idx = None
        while True:
            # Get a library
            tickstore_lib = store['tick_store_example']
            current_time_est = dt.now().astimezone(mktz('US/Eastern'))

            # Read only the last two minutes of data
            window_start = current_time_est - timedelta(minutes=2)
            data = tickstore_lib.read(ticker, date_range=DateRange(window_start, current_time_est), columns=None)

            # ideally this should only be called if there's update
            if data.index[-1] != previous_last_idx:
                print("UPDATE:: TICKER:" + ticker
                      + " CURRENT EST TIME = " + str(current_time_est)
                      + " LAST ROW STOCK TIME = " + str(data.index[-1].tz_convert(mktz('US/Eastern')))
                      + " N UNITS = " + str(len(data.index)))

            previous_last_idx = data.index[-1]
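
The loop above polls as fast as it can, with one process per ticker. As a rough sketch only (not something Arctic provides), the same "has the last index changed?" check could be run for many tickers inside one process, with a pause between passes. The names `poll_once`, `poll_forever`, `read_last_index`, and `interval_seconds` are hypothetical; `read_last_index` stands in for whatever call returns the newest row's index for a ticker, and is assumed to return `None` when there is no data.

```python
import time
from typing import Callable, Dict, Hashable, Iterable, List, Optional, Tuple


def poll_once(
    tickers: Iterable[str],
    read_last_index: Callable[[str], Optional[Hashable]],
    last_seen: Dict[str, Hashable],
) -> List[Tuple[str, Hashable]]:
    """Return (ticker, index) pairs whose newest bar changed since the last pass."""
    updates = []
    for ticker in tickers:
        # Hypothetical reader: returns the newest index, or None if no data yet.
        latest = read_last_index(ticker)
        if latest is not None and latest != last_seen.get(ticker):
            last_seen[ticker] = latest
            updates.append((ticker, latest))
    return updates


def poll_forever(tickers, read_last_index, on_update, interval_seconds=1.0):
    """Check every ticker in one loop, sleeping between passes instead of spinning."""
    last_seen = {}
    while True:
        for ticker, latest in poll_once(tickers, read_last_index, last_seen):
            on_update(ticker, latest)
        time.sleep(interval_seconds)
```

Here `read_last_index` could wrap the `tickstore_lib.read(...)` call from the script above and return `data.index[-1]`. Whether this actually helps with 20+ symbols depends on how expensive each read is, so it would need to be measured.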

Here's a code snippet showing how the data gets stored in Arctic. Note that the write to Arctic is called once per minute, whenever a minute bar is ready.

        for listener in self._listeners:
            #listener.process_live_bar(interval_data)

            (tzaware_date_time, ret_dict, ticker) = self.iqfeed_msg_to_artic_line(fields)
            data = pd.DataFrame(ret_dict, index=[tzaware_date_time])

            #store the data into arctic here
            self.tickstore_lib.write(ticker, data)

codesutras commented 3 years ago

This is a problem with VersionStore as well. The script behaves erratically. For me, it doesn't happen every time, but I'm able to see the behaviour about 3 times out of 5.

My initial finding was that there could be an issue with MongoDB. I've tried different cluster setups as well as a standalone DB, and recently upgraded MongoDB to the latest version, but the problem is still the same.

When multiple tickers are updated from a live stream, the process hangs, which results in no data at runtime.

@bmoscon @jamesblackburn @shashank88 Any idea about this problem? Or are we missing something needed to make it work with multiple tickers storing real-time data?