backtrader2 / backtrader

Python Backtesting library for trading strategies
https://www.backtrader.com
GNU General Public License v3.0

High CPU load during pre-market live trading and "adaptive qcheck" logic #9

Open vladisld opened 4 years ago

vladisld commented 4 years ago

For some time now I've been noticing high CPU load (100%) during pre-market trading while running my strategies with the IB broker.

Once the regular trading session begins, the load drops to its normal 2-3% - and everything works like a dream.

After a short investigation, it appears to be related to the "adaptive qcheck" logic in the cerebro._runnext method - logic whose rationale I don't fully understand and would like to get more information about.

TLDR Section:

The cerebro._runnext method is the main workhorse of the cerebro engine: it runs the main engine loop, orchestrates all the datas, dispatches all the notifications to the strategies and does a lot of other work.

Here is the part of the _runnext method relevant to the discussion that follows:

def _runnext(self, runstrats):

        ###  code removed for clarity ###

        clonecount = sum(d._clone for d in datas)
        ldatas = len(datas)
        ldatas_noclones = ldatas - clonecount

        while d0ret or d0ret is None:
            # if any has live data in the buffer, no data will wait anything
            newqcheck = not any(d.haslivedata() for d in datas)
            if not newqcheck:
                # If no data has reached the live status or all, wait for
                # the next incoming data
                livecount = sum(d._laststatus == d.LIVE for d in datas)
                newqcheck = not livecount or livecount == ldatas_noclones
                print("newqcheck {}, livecount {}, ldatas_noclones {}".format(newqcheck, livecount, ldatas_noclones))

            ###  code removed for clarity ###

            # record starting time and tell feeds to discount the elapsed time
            # from the qcheck value
            drets = []
            qstart = datetime.datetime.utcnow()
            for d in datas:
                qlapse = datetime.datetime.utcnow() - qstart
                d.do_qcheck(newqcheck, qlapse.total_seconds())
                drets.append(d.next(ticks=False))

            ### the rest of the code removed for clarity ###

For each of the datas, the _runnext method calls the data's next method. In live mode this method should load the next bar from the data's incoming queue, potentially waiting for such a bar to arrive if it's not there yet.

And here is the relevant part of the _load method (eventually called from d.next above), taking IBData as an example:

def _load(self):
        if self.contract is None or self._state == self._ST_OVER:
            return False  # nothing can be done

        while True:
            if self._state == self._ST_LIVE:
                try:
                    msg = (self._storedmsg.pop(None, None) or
                           self.qlive.get(timeout=self._qcheck))
                except queue.Empty:
                    if True:
                        return None

The high CPU load results when the timeout parameter of self.qlive.get is 0. In that case the queue.get method immediately raises the queue.Empty exception and _load returns if there is no data in the queue (which is usually the case during pre-market). The while loop in the _runnext method then continues to the next iteration and the whole story repeats itself.
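To make the busy-spin concrete, here is a small standalone demonstration (plain Python stdlib, not backtrader code) of the difference between a zero timeout and a positive timeout on queue.Queue.get - the exact call used in _load above:

```python
# Demonstration only: queue.Queue.get(timeout=0.0) raises queue.Empty
# immediately, while a positive timeout blocks for that long waiting for
# data. A loop built around the zero-timeout variant spins at full speed.
import queue
import time

q = queue.Queue()  # stays empty, like qlive during pre-market

# timeout=0.0: raises Empty essentially instantly
start = time.monotonic()
try:
    q.get(timeout=0.0)
except queue.Empty:
    pass
elapsed_zero = time.monotonic() - start

# timeout=0.25: blocks ~0.25s waiting for a bar, freeing the CPU
start = time.monotonic()
try:
    q.get(timeout=0.25)
except queue.Empty:
    pass
elapsed_wait = time.monotonic() - start

print(f"timeout=0.0 -> {elapsed_zero:.4f}s, timeout=0.25 -> {elapsed_wait:.4f}s")
```

With timeout=0.0 the surrounding engine loop becomes a pure busy-wait; with the configured qcheck timeout the same loop wakes up only a few times per second.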

As you can see, the timeout parameter gets its value from the data's self._qcheck, which in turn is updated in the data's do_qcheck method (see the code below), called from the _runnext method above.

    def do_qcheck(self, onoff, qlapse):
        # if onoff is True the data will wait p.qcheck for incoming live data
        # on its queue.
        qwait = self.p.qcheck if onoff else 0.0
        qwait = max(0.0, qwait - qlapse)
        self._qcheck = qwait

As you can see, self._qcheck either gets the value of the qcheck data parameter (passed to the data at creation time) or is zeroed out.
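A standalone replica of the do_qcheck arithmetic makes the two outcomes concrete (illustration only; qcheck_param stands in for self.p.qcheck):

```python
# Replica of the do_qcheck arithmetic shown above (illustration only).
def do_qcheck(qcheck_param, onoff, qlapse):
    # If onoff (newqcheck) is True, wait up to the configured qcheck,
    # minus whatever time already elapsed while servicing earlier datas.
    qwait = qcheck_param if onoff else 0.0
    return max(0.0, qwait - qlapse)

print(do_qcheck(0.5, True, 0.25))  # 0.25 -> waits the remaining 0.25s
print(do_qcheck(0.5, True, 0.75))  # 0.0  -> elapsed time already exceeded qcheck
print(do_qcheck(0.5, False, 0.0))  # 0.0  -> onoff False: non-blocking queue get
```

The third case is the problematic one: whenever newqcheck comes out False, every data's queue fetch degenerates into a zero-timeout poll.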

Following the logic in all the above methods, self._qcheck is zeroed if some of the datas are not yet LIVE - meaning that no bar has been received yet for them - while other datas are already LIVE.

    newqcheck = not livecount or livecount == ldatas_noclones

This is exactly what happened in my case: of the several datas the strategy works with, some do not receive any bars until right before the trading session starts. All this time each data's self._qcheck is set to zero and cerebro's main engine loop spins like crazy, crunching my CPU cycles at full speed like there is no tomorrow.
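The condition can be reproduced with a toy model (Data here is a stand-in class for this illustration, not a backtrader class; LIVE/DELAYED mimic _laststatus values):

```python
# Toy reproduction of the newqcheck condition from _runnext above.
# Data is a stand-in class, not part of backtrader.
LIVE, DELAYED = "LIVE", "DELAYED"

class Data:
    def __init__(self, status, buffered, clone=False):
        self._laststatus = status
        self._buffered = buffered  # live data already sitting in the queue
        self._clone = clone

    def haslivedata(self):
        return self._buffered

def compute_newqcheck(datas):
    ldatas_noclones = len(datas) - sum(d._clone for d in datas)
    # if any data has live data in the buffer, recheck the live statuses
    newqcheck = not any(d.haslivedata() for d in datas)
    if not newqcheck:
        livecount = sum(d._laststatus == LIVE for d in datas)
        newqcheck = not livecount or livecount == ldatas_noclones
    return newqcheck

# Pre-market: one feed is already LIVE with buffered data, another is not.
mixed = [Data(LIVE, buffered=True), Data(DELAYED, buffered=False)]
print(compute_newqcheck(mixed))      # False -> qcheck zeroed, loop spins

# Regular session: all feeds LIVE -> the configured qcheck is honoured again.
all_live = [Data(LIVE, buffered=True), Data(LIVE, buffered=False)]
print(compute_newqcheck(all_live))   # True -> waits up to qcheck
```

This matches the observed behaviour: the spin stops exactly when the session opens and the remaining feeds reach LIVE status.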

atulhm commented 4 years ago

I’m not familiar with the code but it sounds like a loop-gone-wild issue... if your code can survive it, maybe throw in a 5s or 60s sleep in there and see how it behaves.

I’m new to backtrader and one of the items I am looking into is exactly how BT waits for new data and raises events for it.

vladisld commented 4 years ago

This is a pretty involved piece of code - and it took me a while to fully understand its peculiarities.

It is not a question of the code surviving or not - 100% CPU load is simply not justifiable in this case. In live mode the only mechanism used for timing the incoming bars is the qcheck parameter. You can't just add extra sleeps here, given the potentially large number of data feeds to process, especially since the same code is used for backtesting as well. It is a little bit trickier than that.

The solution I'm using in my fork is to disable some of the logic for calculating newqcheck and to use just the native qcheck values as the queue-fetch timeout. I will provide a PR once proper tests are ready (WIP).
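For illustration, the idea can be sketched roughly as follows - this is a hedged sketch of the approach described, not the actual code from the fork; dispatch_datas is a hypothetical stand-in for the relevant part of _runnext:

```python
# Hedged sketch (not the actual fork code): skip the adaptive newqcheck
# recomputation and always let each data wait its natively configured
# qcheck on its queue.
import datetime

def dispatch_datas(datas):
    # record starting time and tell feeds to discount the elapsed time
    # from the qcheck value, exactly as _runnext does
    qstart = datetime.datetime.utcnow()
    drets = []
    for d in datas:
        qlapse = datetime.datetime.utcnow() - qstart
        # onoff is hard-wired to True: the native p.qcheck is always
        # honoured, so qlive.get() never degenerates into a zero-timeout
        # busy poll while some feeds are still pre-LIVE.
        d.do_qcheck(True, qlapse.total_seconds())
        drets.append(d.next(ticks=False))
    return drets
```

The trade-off is that a data with buffered live bars may wait up to its own qcheck before being serviced, instead of being drained immediately.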

atulhm commented 4 years ago

Yeah. It’s a lot to wrap one's head around. I’m just trying to see if I can help pinpoint the lines that rocket the CPU. Sounds like you’ve found a way. Please bear with me as I get up to speed.
