Proof of concept of TA-Lib RT python wrapper

TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).

http://ta-lib.github.io/ta-lib-python

Proof of concept of TA-Lib RT python wrapper #316

Open trufanov-nok opened 4 years ago

trufanov-nok commented 4 years ago

Hi, I'm the maintainer of a TA-Lib fork that I made almost 5 years ago. Recently I renamed it to TA-Lib RT, which is supposed to mean Real Time. The main idea was to add functionality that allows you to "pause" an indicator computation and continue when new data arrives. So it can be used for actual data that arrives in real time from a trading system, and it updates indicators efficiently. The description can be found here.

This wrapper has an experimental Streaming API that is supposed to do the same, but I believe it's incorrect, as almost all functions have a memory effect when they process input arrays.
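To illustrate the memory effect, here is a minimal pure-Python sketch (it is not TA-Lib code). It assumes a textbook EMA seeded with the first input value, and the names `ema_full` and `EmaState` are hypothetical. An EMA recomputed from only the most recent values differs from one that carries its state forward:

```python
def ema_full(values, period):
    """Textbook EMA over a whole list, seeded with the first value (an assumption)."""
    alpha = 2.0 / (period + 1)
    ema = values[0]
    out = [ema]
    for v in values[1:]:
        ema = alpha * v + (1 - alpha) * ema
        out.append(ema)
    return out


class EmaState:
    """Hypothetical incremental EMA that keeps its state between updates."""

    def __init__(self, period):
        self.alpha = 2.0 / (period + 1)
        self.ema = None

    def update(self, value):
        if self.ema is None:
            self.ema = value  # seed with the first observation
        else:
            self.ema = self.alpha * value + (1 - self.alpha) * self.ema
        return self.ema


history = [float(x) for x in range(1, 51)]  # 1.0 .. 50.0
period = 10

# Reference: EMA over the full history.
reference = ema_full(history, period)[-1]

# Stateful: feed values one at a time, "pausing" between them.
state = EmaState(period)
for v in history:
    stateful = state.update(v)

# Windowed: recompute from only the last `period` values, as a stateless call would.
windowed = ema_full(history[-period:], period)[-1]

print(stateful == reference)       # the carried state reproduces the full result
print(abs(windowed - reference))   # the windowed result deviates
```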

Some time ago I decided to try to adapt this project to wrap my fork. I've forked it: https://github.com/trufanov-nok/ta-lib-py-wrapper and spent a few weekends on this. The problem is that I don't know Python :). But I've managed to make it work as a proof of concept. So now the wrapper has access to the state functions, and this (code from perf_talib.py):

    import talib

    talib.MA(data)
    talib.BBANDS(data)
    talib.KAMA(data)
    talib.CDLMORNINGDOJISTAR(data, data, data, data)

could be rewritten as:

import talibrt

res, ma_state = talibrt.MA_StateInit()
res, bbands_state = talibrt.BBANDS_StateInit()
res, kama_state = talibrt.KAMA_StateInit()
res, star_state = talibrt.CDLMORNINGDOJISTAR_StateInit()

for _ in range(LOOPS):
    for d in data:
        res, val = talibrt.MA_State(ma_state, d)
        res, val1, val2, val3 = talibrt.BBANDS_State(bbands_state, d)
        res, val = talibrt.KAMA_State(kama_state, d)
        res, val = talibrt.CDLMORNINGDOJISTAR_State(star_state, d, d, d, d)

res = talibrt.MA_StateFree(ma_state)
res = talibrt.BBANDS_StateFree(bbands_state)
res = talibrt.KAMA_StateFree(kama_state)
res = talibrt.CDLMORNINGDOJISTAR_StateFree(star_state)

And then I found out that calling C functions in a tight loop has a huge performance cost. It was ~60 times slower. So I decided to try to move these loops to the C side by providing _BatchState() functions, which accept arrays like the regular talib functions plus the _state, and simply iterate over the data, passing each value to the _State() function. They can be considered a cached way to process new data. I've generated such functions and added them to TA-Lib RT. So now the performance test may look like:

res, ma_state = talibrt.MA_StateInit()
res, bbands_state = talibrt.BBANDS_StateInit()
res, kama_state = talibrt.KAMA_StateInit()
res, star_state = talibrt.CDLMORNINGDOJISTAR_StateInit()

for _ in range(LOOPS):
    res, val = talibrt.MA_BatchState(ma_state, data)
    res, val1, val2, val3 = talibrt.BBANDS_BatchState(bbands_state, data)
    res, val = talibrt.KAMA_BatchState(kama_state, data)
    res, val = talibrt.CDLMORNINGDOJISTAR_BatchState(star_state, data, data, data, data)

res = talibrt.MA_StateFree(ma_state)
res = talibrt.BBANDS_StateFree(bbands_state)
res = talibrt.KAMA_StateFree(kama_state)
res = talibrt.CDLMORNINGDOJISTAR_StateFree(star_state)
t1 = time.time()

And it's only ~4 times slower than the original code.

I've stopped further experiments at this "proof of concept" stage and published the code to find out whether it can be useful to anyone at all with such a performance hit, because that will depend on how big your "cache" is and how often you need to update your indicator. It's a tradeoff that I can't evaluate, as all my code is in C/C++/Qt. Also, as I don't know Python, working on it turns out to be quite unproductive.

So, I didn't make an abstract interface that could perhaps hide the StateInit and StateFree functions from the user and call them automatically. I didn't hide the pointers to the memory allocated for state data in something like Python's PyCapsule, so they are just converted to integers and back. I didn't wrap the functions that save and load states to files, as I don't know how to open a POSIX FILE* in Python. And I tested the wrapper only under Python 3 on a Linux machine.

If anyone finds this interesting, I'm ready to discuss it, contribute the code, or adjust the C code on the library side if needed. But I would rather not torture myself with Python anymore.

mrjbq7 commented 4 years ago

Hi @trufanov-nok ! That's pretty cool!

It's a little unclear to me what kind of future the upstream TA-Lib has, or who really maintains it. Maybe it needs a fork and some real bug fixes and feature improvements like you're making here.

trufanov-nok commented 4 years ago

Official sources are maintained at https://sourceforge.net/p/ta-lib/ by TA-Lib's author, Mario Fortier, but the last commit was on 2013-04-03 and Mario has had no activity on SourceForge since 2016-02-07. I think he just de-prioritized the project. I don't know if he has any plans.

I can't say that the original code has any serious bugs. I've been answering TA-Lib related questions on StackOverflow for the last few years, and most problems result from misusing or misunderstanding the math behind an indicator, misunderstanding the math behind the Google or Yahoo Finance indicators that are often compared to TA-Lib's, misunderstanding how the historical period affects an indicator, misunderstanding TA-Lib's Lookback API, or problems installing the Python wrapper.

There is only one issue that I consider an unfixed bug in the original code: "TA_CDL3OUTSIDE, CDL3BLACKCROWS, TA_CDLENGULFING seems to ignore first candle in input data" (issue 99). There are also a few non-critical issues.

So I didn't make any changes to the old logic in my fork, except in a few cases. I just added new functions alongside the old ones. Plus, I've added the following indicators: ACCBANDS, AVGDEV, IMI, NVI, PVI, PVT. I would even contribute this code to the original lib if Mario found it acceptable.

But apart from that the original lib is quite solid and doesn't require any serious changes.

pnmartinez commented 4 years ago

Hello everyone!

I am trying to replicate @trufanov-nok 's results (I am benchmarking different TA-lib implementations), but I have trouble installing ta-lib-py-wrapper.

A bit of context: I am running a conda environment which has the original, mrjbq7/ta-lib installed and working.

When I run python3 setup.py install I obtain:

setup.py:79: UserWarning: Cannot find ta-lib-rt library, installation may fail.
  warnings.warn('Cannot find ta-lib-rt library, installation may fail.')
running install
running bdist_egg
running egg_info
writing TA_Lib_RT.egg-info/PKG-INFO
writing dependency_links to TA_Lib_RT.egg-info/dependency_links.txt
writing requirements to TA_Lib_RT.egg-info/requires.txt
writing top-level names to TA_Lib_RT.egg-info/top_level.txt
reading manifest file 'TA_Lib_RT.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'talib/*.c'
warning: no files found matching 'talib/*.pyx'
writing manifest file 'TA_Lib_RT.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
skipping 'talibrt/_ta_lib.c' Cython extension (up-to-date)
building 'talibrt._ta_lib' extension
gcc -pthread -B /root/anaconda3/envs/talib/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/anaconda3/envs/talib/lib/python3.7/site-packages/numpy/core/include -I/usr/include -I/usr/local/include -I/opt/include -I/opt/local/include -I/root/anaconda3/envs/talib/include/python3.7m -c talibrt/_ta_lib.c -o build/temp.linux-x86_64-3.7/talibrt/_ta_lib.o
talibrt/_ta_lib.c:766:10: fatal error: ta-lib-rt/ta_defs.h: No such file or directory
 #include "ta-lib-rt/ta_defs.h"

I have already checked that gcc, libdev, and other dependencies like those are installed, but it seems that setup.py is not able to find the original TA-Lib anywhere, though I clearly installed it.

trufanov-nok commented 4 years ago

I guess I should have titled it ta-lib-rt-py-wrapper instead of ta-lib-py-wrapper, because it's a wrapper around the TA-Lib RT (forked) library, not the original TA-Lib library. So you're getting fatal error: ta-lib-rt/ta_defs.h: No such file or directory because the TA-Lib RT binary and headers aren't installed on your PC. This does not depend on the original TA-Lib binary; both may be installed in parallel.

To install TA-Lib RT on a Linux machine, just do:

git clone https://github.com/trufanov-nok/ta-lib-rt.git ta-lib-rt
cd ta-lib-rt/ta-lib
mkdir build && cd build
cmake ..
make
sudo make install

It may be a bit harder on Windows. upd: added the sudo make install step

pnmartinez commented 4 years ago

Hello @trufanov-nok , thanks for the clarification.

I've gone through your steps and now the logs seem to locate the library (look at the first line, it has changed: now it does find it), but the bottom part of the error is still there.

running install
running bdist_egg
running egg_info
writing TA_Lib_RT.egg-info/PKG-INFO
writing dependency_links to TA_Lib_RT.egg-info/dependency_links.txt
writing requirements to TA_Lib_RT.egg-info/requires.txt
writing top-level names to TA_Lib_RT.egg-info/top_level.txt
reading manifest file 'TA_Lib_RT.egg-info/SOURCES.txt'
reading manifest template 'MANIFEST.in'
warning: no files found matching 'talib/*.c'
warning: no files found matching 'talib/*.pyx'
writing manifest file 'TA_Lib_RT.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
running build_ext
skipping 'talibrt/_ta_lib.c' Cython extension (up-to-date)
building 'talibrt._ta_lib' extension
gcc -pthread -B /root/anaconda3/envs/talib/compiler_compat -Wl,--sysroot=/ -Wsign-compare -DNDEBUG -g -fwrapv -O3 -Wall -Wstrict-prototypes -fPIC -I/root/anaconda3/envs/talib/lib/python3.7/site-packages/numpy/core/include -I/usr/include -I/usr/local/include -I/opt/include -I/opt/local/include -I/workspace/talib_tests/ta-lib-rt/ta-lib-rt/build/lib -I/root/anaconda3/envs/talib/include/python3.7m -c talibrt/_ta_lib.c -o build/temp.linux-x86_64-3.7/talibrt/_ta_lib.o
talibrt/_ta_lib.c:766:10: fatal error: ta-lib-rt/ta_defs.h: No such file or directory
 #include "ta-lib-rt/ta_defs.h"
          ^~~~~~~~~~~~~~~~~~~~~
compilation terminated.
error: command 'gcc' failed with exit status 1

Do you have any theories on what might be happening now?

trufanov-nok commented 4 years ago

Sorry, I copy-pasted the build steps without some adjustments. After the make step, please execute sudo make install (in the ta-lib-rt/ta-lib folder). This will finalize the installation by copying the library binary and its headers to the proper system folders, where the Python installer can find them later. sudo is required for access to the /usr/local/ subfolders. You should be able to find a ta-lib-rt subfolder in /usr/local/include/ or /usr/include/.
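A quick way to confirm that the headers landed where the build expects them is to look for the file named in the compiler error. This is a small sketch using only the standard library; `find_header` is a hypothetical helper, and the two directories are the ones mentioned above.

```python
from pathlib import Path


def find_header(relative_header, include_dirs):
    """Return every include directory that contains relative_header."""
    return [d for d in include_dirs if (Path(d) / relative_header).is_file()]


# Directories named in the comment above; adjust if you installed elsewhere.
include_dirs = ["/usr/local/include", "/usr/include"]
found = find_header("ta-lib-rt/ta_defs.h", include_dirs)

if found:
    print("ta_defs.h found in:", ", ".join(found))
else:
    print("ta-lib-rt headers not found; run 'sudo make install' first")
```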

nardew commented 4 years ago

Hi @pnmartinez, are you benchmarking strictly ta-lib wrappers, or also other Python (realtime) technical analysis libraries? I spent some time researching Python implementations of realtime/incremental technical analysis libraries, and the conclusion was that none fits my needs. That motivated the creation of the talipp package, and I would be quite interested in how it performs compared to realtime/incremental forks/wrappers such as @trufanov-nok's ta-lib-py-wrapper or others that are part of your analysis. If your benchmark also covers libraries other than pure ta-lib wrappers and you would be willing to add another contestant to the comparison, that would be really interesting.

pnmartinez commented 4 years ago

@nardew I will try this! Thank you.

nardew commented 4 years ago

@pnmartinez I added a comparison between talipp and ta-lib for batch and incremental operations. As expected, ta-lib outperforms talipp for vector calculation whereas talipp beats ta-lib for incremental input. The graphs are on the talipp's main page.

trufanov-nok commented 4 years ago

@nardew the Batch functions are supposed to be a compromise between ta-lib's incremental and vector calls from the Python wrapper, as calls from Python to C aren't fast enough. Could you estimate the batch size needed for ta-lib to match talipp's performance?

mdukaczewski commented 4 years ago

@trufanov-nok In _func_obj_mapping (https://github.com/trufanov-nok/ta-lib-py-wrapper/blob/master/talibrt/abstract.py#L9) I see only _State functions, no _BatchState functions. How does it work?

trufanov-nok commented 4 years ago

@mdukaczewski I suppose I was too lazy to add them to abstract, or (most probably) couldn't. But they are available directly via globals: https://github.com/trufanov-nok/ta-lib-py-wrapper/blob/master/talibrt/__init__.py#L81 The example is in: https://github.com/trufanov-nok/ta-lib-py-wrapper/blob/master/tools/perf_talib.py They are implemented on the C side of the lib as auto-generated code for each indicator, in the form of:

TA_LIB_API TA_RetCode TA_[INDICATOR]_BatchState( struct TA_[INDICATOR]_State* _state, int startIdx, int endIdx, const double inData[], int *outBegIdx, int *outNBElement, double outData[] )
....
for (int i = startIdx; i <= endIdx; ++i, outIdx++) {
    retValue = TA_[INDICATOR]_State( _state, inData[i], &outVal );
    if ( retValue == TA_SUCCESS ) {
        outData[outIdx] = outVal;
    } else if ( retValue == TA_NEED_MORE_DATA ) {
        outData[outIdx] = NAN;
    } else {
        break;
    }
}
...

For example ACCBANDS.
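For readers more at home in Python, the same loop can be sketched in pure Python. This is only an illustration of the control flow described above, not the library's code: `SmaState`, `sma_state`, `sma_batch_state` and the two return-code constants are hypothetical stand-ins, with a simple moving average playing the role of the indicator.

```python
import math

# Hypothetical stand-ins for the TA_SUCCESS / TA_NEED_MORE_DATA return codes.
SUCCESS, NEED_MORE_DATA = 0, 1


class SmaState:
    """Stand-in for a C-side indicator state (simple moving average)."""

    def __init__(self, period):
        self.period = period
        self.window = []


def sma_state(state, value):
    """One incremental step: returns (ret_code, output)."""
    state.window.append(value)
    if len(state.window) < state.period:
        return NEED_MORE_DATA, None
    state.window = state.window[-state.period:]
    return SUCCESS, sum(state.window) / state.period


def sma_batch_state(state, in_data):
    """Mirror of the C loop: call the per-value step, emit NaN while warming up."""
    out = []
    for value in in_data:
        ret, out_val = sma_state(state, value)
        if ret == SUCCESS:
            out.append(out_val)
        elif ret == NEED_MORE_DATA:
            out.append(math.nan)
        else:
            break  # stop on any other return code
    return out


state = SmaState(period=3)
first = sma_batch_state(state, [1.0, 2.0, 3.0])  # warm-up happens here
second = sma_batch_state(state, [4.0, 5.0])      # state carries over between batches
print(first, second)
```

The second call produces values immediately because the warm-up already happened in the first batch, which is the point of keeping the state outside the batch function.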

nardew commented 4 years ago

@trufanov-nok I checked and here are the numbers where talipp starts consistently leading (for incremental output):

SMA: ~1.3k values
TEMA: ~3k values
StochRSI: ~1.7k values

Of course, for indicators with only a local context (such as SMA) you can safely prune the input data to the minimum required size. For indicators with a global context (such as EMA) this is no longer possible. A potential optimization is to accept some deviation in the calculation and compute indicators with global context from recent values only. It comes down to the use case, which dictates which aspect has priority (speed vs. precision).
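To put a rough number on that deviation for an EMA: a textbook EMA with alpha = 2 / (period + 1) still gives its starting value a weight of (1 - alpha)^n after n updates, so the influence of discarded history shrinks geometrically. The sketch below assumes that textbook definition; `min_window_for_tolerance` is a hypothetical helper, not part of any library mentioned here.

```python
import math


def min_window_for_tolerance(period, tol):
    """Smallest n with (1 - alpha)**n <= tol, where alpha = 2 / (period + 1).

    (1 - alpha)**n is the weight a textbook EMA still gives to its seed
    after n updates, so it scales the influence of the discarded history.
    Requires period >= 2 and 0 < tol < 1.
    """
    alpha = 2.0 / (period + 1)
    return math.ceil(math.log(tol) / math.log(1.0 - alpha))


for period in (10, 50, 200):
    n = min_window_for_tolerance(period, tol=1e-3)
    print(f"EMA({period}): keep at least {n} recent values for a 0.1% seed weight")
```

The required window grows roughly linearly with the period, which is why pruning stays cheap for short EMAs and gets expensive for long ones.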

trufanov-nok commented 4 years ago

@nardew that's not exactly what I would like to find out. ta-lib has no incremental API at all. It's ta-lib-rt, a fork of ta-lib, that introduces such an API. And as it's a C library that is accessed from Python/Cython via a wrapper, the ta-lib-rt incremental API gets an even bigger additional overhead. My estimate is that accessing the ta-lib-rt incremental API from Python is ~60 times slower than using the ta-lib-rt C incremental API directly, as every incremental call goes from Python to C and back via the wrapper. I wonder why it's not linear on your graph, as the overheads are constant.

I think by Batch API you meant the regular API of ta-lib and ta-lib-rt. This API is the same in ta-lib and its fork ta-lib-rt. It consists of the regular functions TA_[Indicator](). Let's keep referring to it as the Batch API. And the Incremental API is the TA_[Indicator]_State() functions. When faced with the Python/Cython overheads, I introduced a third type of API that I refer to as Batch because its functions are TA_[Indicator]_BatchState(). Let's rename it to the Batch-Incremental API, because it's actually a combination of both. TA_[Indicator]_BatchState() accepts input and output data vectors as well as a _State value. It doesn't initialize or free the State value; that must be done with the functions of the regular Incremental API. And what TA_[Indicator]_BatchState() actually does is call TA_[Indicator]_State() for each data value, passing _State and storing the results in the output array. So it's a tight loop, but the loop is implemented on the C side, not in Python. So with this API I'm bypassing the Python-to-C call overhead.
(You probably didn't notice that third API because, as mdukaczewski pointed out above, I didn't add access to it via the abstract layer, because that's too difficult (I don't know Python). It may be accessed directly via the module globals. I posted a link to an example above.)

So if we consider 10000 data values, the Batch API will be almost equivalent to a call to the Batch-Incremental API with a single batch of 10000 values. And the Incremental API will be almost equivalent to calling the Batch-Incremental API 10000 times with a batch size of 1 value. Judging by your graphs, the ta-lib/ta-lib-rt Batch API outperforms talipp's Incremental API, which is no surprise because it's C code and only one call from Python to C. And ta-lib-rt's Incremental API (from Python) is much slower, which is no surprise either, because the Python-to-C call overhead is huge; all the performance is wasted on that glue. Now if we consider the Batch-Incremental API: with a huge batch it'll outperform talipp's, and with a small one it will lose. The question is: how small can the batch be? Where does it cross talipp's line? This would allow us to estimate how big the incoming data flow must be to leave the wrapped ta-lib-rt Incremental API (in Python via its wrapper) no chance of being as fast as a pure Python implementation of the indicator.
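The relation between batch size and the number of Python-to-C crossings can be sketched as follows. It only counts calls and measures no real overhead; `iter_batches` and `count_crossings` are hypothetical helpers.

```python
def iter_batches(data, batch_size):
    """Yield consecutive slices of data with at most batch_size items each."""
    for start in range(0, len(data), batch_size):
        yield data[start:start + batch_size]


def count_crossings(n_values, batch_size):
    """Number of batch calls (one Python-to-C crossing each) to feed n_values."""
    return sum(1 for _ in iter_batches(range(n_values), batch_size))


for batch_size in (1, 10, 100, 10_000):
    print(batch_size, count_crossings(10_000, batch_size))
```

A batch size of 1 corresponds to the Incremental API (10000 crossings) and a batch size of 10000 to the Batch API (one crossing); the break-even point against a pure Python implementation lies somewhere in between and has to be measured.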

mdukaczewski commented 3 years ago

@trufanov-nok many of the ta-lib indicators have additional parameters (like timeperiod), but when I try to use them with ta-lib-rt, there is an issue:

res, val1, va2, val3 = talibrt.BBANDS_State(bbands_state, d, 10)
File "talibrt/_func.pxi", line 1774, in talibrt._ta_lib.BBANDS_State
TypeError: BBANDS_State() takes exactly 2 positional arguments (3 given)

Your _State functions don't support optional arguments, right?

trufanov-nok commented 3 years ago

@mdukaczewski optional arguments are set when you initialize the state, so they should be passed to the _StateInit() function. They are stored inside the state object.