TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).
http://ta-lib.github.io/ta-lib-python
9.46k stars 1.74k forks

Compute multiple indicators in a single pass #531

Closed popperwin closed 2 years ago

popperwin commented 2 years ago

Hi, thanks for your library. I'm trying to use TA-Lib to compute multiple indicators as fast as possible. I was wondering whether it is possible to compute multiple indicators in a single call: since the library uses Cython, that should be faster than calling each indicator iteratively from Python. I'm already using TA-Lib inside a process from multiprocessing, so I can't use multiprocessing again to parallelize this. Thanks in advance for your reply.

trufanov-nok commented 2 years ago

Hi, it may be useful to read this discussion first: https://github.com/mrjbq7/ta-lib/issues/316

popperwin commented 2 years ago

@trufanov-nok thanks for your reply. I've read the discussion, but I'm not sure TA-Lib RT would help. My goal isn't really real-time processing (at some point it will be, but not now). Right now I'm backtesting a strategy over a fixed period. At the beginning of the backtest I compute a lot of indicators for a lot of dataframes, each containing a lot of rows (5-minute timeframe). So my need is mostly to make the multiple-indicator computation as fast as possible; since I'm doing it iteratively in Python right now, every backtest costs me more than 30 minutes of waiting.
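The iterative pattern being described might look roughly like this. It is only a sketch: the symbols, periods, and data are hypothetical, and a NumPy moving average stands in for the actual TA-Lib calls (e.g. `talib.SMA`), since the real indicator list isn't shown in the thread.

```python
import numpy as np

def sma(close, period):
    # NumPy stand-in for a TA-Lib indicator call such as talib.SMA(close, period).
    # Leading (period - 1) slots stay NaN, matching TA-Lib's lookback behavior.
    out = np.full(close.shape, np.nan)
    c = np.cumsum(np.insert(close, 0, 0.0))
    out[period - 1:] = (c[period:] - c[:-period]) / period
    return out

# Hypothetical workload: one price series per symbol, several periods each,
# computed one call at a time from Python -- the loop being discussed.
prices = {f"sym{i}": np.random.randn(2000).cumsum() + 100.0 for i in range(4)}
results = {sym: {p: sma(close, p) for p in (10, 20, 50)}
           for sym, close in prices.items()}
```

Each `sma` call here is one Python-to-C round trip in the real code, which is where the per-call overhead under discussion accumulates.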

trufanov-nok commented 2 years ago

In this case I would not expect any performance gain unless you are testing something that makes thousands of calls from Python to C code. And the only way to improve that would be to write the backtest in C/C++ to cut out the Python-to-C call overhead.

popperwin commented 2 years ago

Ok I get it. Thanks for your reply, I'll try to find another way to make the backtest faster

mrjbq7 commented 2 years ago

Two other options:

1) Use multithreading / multiprocessing to use all the cores in your computer. That would likely be a relatively big win.

2) Build a custom "calculate multiple indicators" function in Cython that calls all the ta-lib C functions directly, aggregates the results, and returns them to Python. That would remove all the Python-to-C-and-back overhead, but it would still be single-threaded. Not sure about the likely win; it could be relatively small, since most of the time is spent doing the actual work rather than crossing the Python/C boundary.
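Option 1 above could be sketched like this. The worker function, symbols, and periods are hypothetical, and a NumPy moving average stands in for the real TA-Lib calls so the sketch is self-contained; a thread pool is shown here, though for CPU-bound pure-Python work a `ProcessPoolExecutor` with the same `map` pattern would be the usual choice.

```python
import numpy as np
from concurrent.futures import ThreadPoolExecutor

def sma(close, period):
    # NumPy stand-in for a TA-Lib indicator call such as talib.SMA(close, period)
    out = np.full(close.shape, np.nan)
    c = np.cumsum(np.insert(close, 0, 0.0))
    out[period - 1:] = (c[period:] - c[:-period]) / period
    return out

def compute_all(close):
    # Each task computes every indicator for one symbol's price series;
    # in the real code these would be talib.EMA / talib.SMA / ... calls.
    return {p: sma(close, p) for p in (10, 20, 50)}

# Hypothetical workload: eight independent price series, one task per series.
series = [np.random.randn(2000).cumsum() + 100.0 for _ in range(8)]
with ThreadPoolExecutor(max_workers=4) as pool:
    results = list(pool.map(compute_all, series))
```

Splitting the work per symbol (rather than per indicator) keeps each task large enough that scheduling overhead stays small relative to the computation.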


popperwin commented 2 years ago

@mrjbq7 Thanks for your reply! Unfortunately I can't use option one since I'm already inside a spawned subprocess. I was mainly thinking about option 2 when I opened the issue. Could you tell me how and where to start building this multiple-indicators function? I have never written Cython before.

mrjbq7 commented 2 years ago

One example would be combining cython functions together, something like this:

@wraparound(False)  # turn off relative indexing from end of lists
@boundscheck(False) # turn off bounds-checking for entire function
def EMA( np.ndarray real not None , int timeperiod=-2**31 ):
    """ EMA(real[, timeperiod=?])

    Exponential Moving Average (Overlap Studies)

    Inputs:
        real: (any ndarray)
    Parameters:
        timeperiod: 30
    Outputs:
        real
    """
    cdef:
        np.npy_intp length
        int begidx, endidx, lookback
        TA_RetCode retCode
        int outbegidx
        int outnbelement
        np.ndarray outreal
    real = check_array(real)
    length = real.shape[0]
    begidx = check_begidx1(length, <double*>(real.data))
    endidx = <int>length - begidx - 1
    lookback = begidx + lib.TA_EMA_Lookback( timeperiod )
    outreal = make_double_array(length, lookback)
    retCode = lib.TA_EMA( 0 , endidx , <double *>(real.data)+begidx , timeperiod , &outbegidx , &outnbelement , <double *>(outreal.data)+lookback )
    _ta_check_success("TA_EMA", retCode)
    return outreal 

@wraparound(False)  # turn off relative indexing from end of lists
@boundscheck(False) # turn off bounds-checking for entire function
def SMA( np.ndarray real not None , int timeperiod=-2**31 ):
    """ SMA(real[, timeperiod=?])

    Simple Moving Average (Overlap Studies)

    Inputs:
        real: (any ndarray)
    Parameters:
        timeperiod: 30
    Outputs:
        real
    """
    cdef:
        np.npy_intp length
        int begidx, endidx, lookback
        TA_RetCode retCode
        int outbegidx
        int outnbelement
        np.ndarray outreal
    real = check_array(real)
    length = real.shape[0]
    begidx = check_begidx1(length, <double*>(real.data))
    endidx = <int>length - begidx - 1
    lookback = begidx + lib.TA_SMA_Lookback( timeperiod )
    outreal = make_double_array(length, lookback)
    retCode = lib.TA_SMA( 0 , endidx , <double *>(real.data)+begidx , timeperiod , &outbegidx , &outnbelement , <double *>(outreal.data)+lookback )
    _ta_check_success("TA_SMA", retCode)
    return outreal 

To something like this, which fuses the calls together in Cython. I suspect it's not a huge win, plus it would be more complicated to maintain:

@wraparound(False)  # turn off relative indexing from end of lists
@boundscheck(False) # turn off bounds-checking for entire function
def EMA_and_SMA( np.ndarray real not None , int timeperiod=-2**31 ):
    """Calculates the EMA and SMA."""
    cdef:
        np.npy_intp length
        int begidx, endidx, emalookback, smalookback
        TA_RetCode retCode
        int outbegidx
        int outnbelement
        np.ndarray outema, outsma
    real = check_array(real)
    length = real.shape[0]
    begidx = check_begidx1(length, <double*>(real.data))
    endidx = <int>length - begidx - 1
    emalookback = begidx + lib.TA_EMA_Lookback( timeperiod )
    outema = make_double_array(length, emalookback)
    smalookback = begidx + lib.TA_SMA_Lookback( timeperiod )
    outsma = make_double_array(length, smalookback)
    retCode = lib.TA_EMA( 0 , endidx , <double *>(real.data)+begidx , timeperiod , &outbegidx , &outnbelement , <double *>(outema.data)+emalookback )
    _ta_check_success("TA_EMA", retCode)
    retCode = lib.TA_SMA( 0 , endidx , <double *>(real.data)+begidx , timeperiod , &outbegidx , &outnbelement , <double *>(outsma.data)+smalookback )
    _ta_check_success("TA_SMA", retCode)
    return outema, outsma

In quick testing it shows about a 10% speedup, which is maybe not worth it:

In [1]: import talib as ta

In [2]: def two(c):
   ...:     ema = ta.EMA(c)
   ...:     sma = ta.SMA(c)
   ...:     return ema, sma
   ...: 

In [3]: import numpy as np

In [4]: c = np.random.randn(1000)

In [8]: %time sum(len(two(c)) for _ in range(100_000))
CPU times: user 702 ms, sys: 1.94 ms, total: 704 ms
Wall time: 703 ms
Out[8]: 200000

In [9]: %time sum(len(ta.EMA_and_SMA(c)) for _ in range(100_000))
CPU times: user 609 ms, sys: 2.77 ms, total: 612 ms
Wall time: 610 ms
Out[9]: 200000

mrjbq7 commented 2 years ago

I would guess using all CPU cores on the machine would be a more effective strategy...