TA-Lib / ta-lib-python

Python wrapper for TA-Lib (http://ta-lib.org/).
http://ta-lib.github.io/ta-lib-python

Binary packages, cleanups. #500

Open TkTech opened 2 years ago

TkTech commented 2 years ago

Didn't mean to make this PR against upstream just quite yet, ah well. WIP.

Changes:

mrjbq7 commented 2 years ago

Cool! Let me know when you want me to review this.

TkTech commented 2 years ago

Almost ready; just some fixes for M1 Macs to go, and I need to re-add the upload to PyPI that I removed while testing.

This builds binary packages for py3.7-3.10 on arm64-linux, arm64-m1, Power-linux, and x86_64 win/mac/linux.

mrjbq7 commented 2 years ago

That's pretty cool. What controls when it uploads? Is it tagged releases only?

TkTech commented 2 years ago

I'm using my same workflow, modified from pysimdjson. This PR is currently missing half the workflow because I don't want it to actually upload while I'm testing :)

When you push a tag that starts with v followed by a number, it'll build a release, push it to PyPI, and create a release on GitHub with all the build artifacts attached. Then you can just add a description.

mrjbq7 commented 2 years ago

So far my convention has been to tag releases as, for example, TA_Lib-0.4.24; maybe we can have it test for that.

mrjbq7 commented 2 years ago

Btw, thank you for the contributions!

TkTech commented 2 years ago

It's just a regex, easy change.
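For example, the check could be something along these lines (a sketch only; the actual pattern in the workflow may differ):

    import re

    # hypothetical pattern matching tags like "TA_Lib-0.4.24" instead of "v0.4.24"
    TAG_RE = re.compile(r"^TA_Lib-\d+\.\d+\.\d+$")

    assert TAG_RE.match("TA_Lib-0.4.24")
    assert not TAG_RE.match("v0.4.24")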

TkTech commented 2 years ago

@mrjbq7 the tests look like they're finding some precision issues on i686:

 ============================= test session starts ==============================
  platform linux -- Python 3.7.12, pytest-7.0.1, pluggy-1.0.0
  rootdir: /project
  collected 40 items / 1 skipped / 39 selected

  ../project/tests/test_abstract.py ................                       [ 40%]
  ../project/tests/test_func.py ..........F...F..                          [ 82%]
  ../project/tests/test_pandas.py ..                                       [ 87%]
  ../project/tests/test_stream.py .....                                    [100%]

  =================================== FAILURES ===================================
  _________________________________ test_BBANDS __________________________________

  series = array([ 91.5 ,  94.81,  94.38,  95.09,  93.78,  94.62,  92.53,  92.75,
          90.31,  92.47,  96.12,  97.25,  98.5 , ... 109.25,
         107.  , 109.19, 110.  , 109.2 , 110.12, 108.  , 108.62, 109.75,
         109.81, 109.  , 108.75, 107.87])

      def test_BBANDS(series):
          upper, middle, lower = func.BBANDS(
              series,
              timeperiod=20,
              nbdevup=2.0,
              nbdevdn=2.0,
              matype=talib.MA_Type.EMA
          )
          i = np.where(~np.isnan(upper))[0][0]
          assert len(upper) == len(middle) == len(lower) == len(series)
          # assert abs(upper[i + 0] - 98.0734) < 1e-3
  >       assert abs(middle[i + 0] - 92.8910) < 1e-3
  E       assert 0.0010000000000047748 < 0.001
  E        +  where 0.0010000000000047748 = abs((92.89 - 92.891))

  /project/tests/test_func.py:116: AssertionError
  ___________________________________ test_RSI ___________________________________

      def test_RSI():
          a = np.array([0.00000024, 0.00000024, 0.00000024,
            0.00000024, 0.00000024, 0.00000023,
            0.00000024, 0.00000024, 0.00000024,
            0.00000024, 0.00000023, 0.00000024,
            0.00000023, 0.00000024, 0.00000023,
            0.00000024, 0.00000024, 0.00000023,
            0.00000023, 0.00000023], dtype='float64')
          result = func.RSI(a, 10)
          assert_array_equal(result, [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,0,0,0,0,0,0,0,0,0,0])
          result = func.RSI(a * 100000, 10)
  >       assert_array_equal(result, [np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,np.nan,33.333333333333329,51.351351351351347,39.491916859122398,51.84807024709005,42.25953803191981,52.101824405061215,52.101824405061215,43.043664867691085,43.043664867691085,43.043664867691085])
  E       AssertionError: 
  E       Arrays are not equal
  E       
  E       Mismatched elements: 6 / 20 (30%)
  E       Max absolute difference: 7.10542736e-15
  E       Max relative difference: 2.13162821e-16
  E        x: array([      nan,       nan,       nan,       nan,       nan,       nan,
  E                    nan,       nan,       nan,       nan, 33.333333, 51.351351,
  E              39.491917, 51.84807 , 42.259538, 52.101824, 52.101824, 43.043665,
  E              43.043665, 43.043665])
  E        y: array([      nan,       nan,       nan,       nan,       nan,       nan,
  E                    nan,       nan,       nan,       nan, 33.333333, 51.351351,
  E              39.491917, 51.84807 , 42.259538, 52.101824, 52.101824, 43.043665,
  E              43.043665, 43.043665])

  /project/tests/test_func.py:162: AssertionError
  =========================== short test summary info ============================
  FAILED ../project/tests/test_func.py::test_BBANDS - assert 0.0010000000000047...
  FAILED ../project/tests/test_func.py::test_RSI - AssertionError: 
  =================== 2 failed, 38 passed, 1 skipped in 0.56s ====================

mrjbq7 commented 2 years ago

Floats are fun. We can probably fix up the tests to ignore these small differences.

Where are we at with merging this? Is that the only issue?

TkTech commented 2 years ago

Pretty much. I can finish this on the weekend if the 'fix' to this is to increase the +/- on the equality comparison.
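For the BBANDS failure above, that would just mean loosening the bound, e.g. (a sketch; the exact tolerance is a judgment call):

    # widen the tolerance so 32-bit (i686) rounding still passes
    assert abs(middle[i + 0] - 92.8910) < 1e-2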

mrjbq7 commented 2 years ago

You could probably use assert_array_almost_equal
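For the RSI case, that could look something like this (a sketch; `expected` stands in for the literal list already in the test):

    from numpy.testing import assert_array_almost_equal

    # element-wise comparison to 6 decimal places by default;
    # NaNs in matching positions are treated as equal
    assert_array_almost_equal(result, expected, decimal=6)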

TkTech commented 2 years ago

@mrjbq7 This is 90% of the way there, and just needs a few more fixes.

This PR now bundles ta-lib (actually a git subtree from https://github.com/TkTech/ta-lib-src), which is unmodified except for updates to configure's database so it can compile on Pinebook Pros and other newer architectures. This means that all the installation steps in the readme are just poof, gone. Just run python setup.py install anywhere with a C compiler and ta-lib will work without any configuration.

The release action for building binary wheels from 3.8 to 3.10 is done and working: https://github.com/TkTech/ta-lib/actions/runs/2357646627. I've commented out support for ppc64le and aarch64 in release.yml because, while it works, it takes 4+ hours to build due to needing to build numpy, pandas, and other native dependencies under emulation. See the numpy issue linked in release.yml for when numpy will provide binary wheels for these architectures, at which point this can be uncommented. That said, installing ta-lib with pip works on these platforms, and when running on a real machine it usually only takes a couple of minutes, so I'm not super concerned with excluding them.

I've removed py3.7 from the workflows as both numpy and pandas have dropped support for it. There's no version of pandas that supports both 3.7 and 3.10, so you have to choose. That said, someone else can totally add support for more versions; they just need to add another build step that installs older versions of numpy+pandas for those versions of Python.

What's left for this is:

  1. Now that we're done testing, re-add the actual upload step to PyPI to release.yml.
  2. The regular test.yml is getting a weird error where it can't import the C extension. This has been difficult to debug since it only occurs on GitHub Actions.

mrjbq7 commented 2 years ago

That seems great, I wonder if we can apply those fixes to https://github.com/ta-lib/ta-lib and make that our official source.

It's possible the current directory is wrong for test.yml. I haven't looked into it specifically, but if you try to import ta-lib from inside the ta-lib source directory without having done an in-place build, it will pick up the current-directory version of ta-lib rather than the one that was built and installed, due to shadowing.

TkTech commented 2 years ago

That seems great, I wonder if we can apply those fixes to https://github.com/ta-lib/ta-lib and make that our official source.

I'd like if there was one spot, but that repo definitely seemed to be abandoned. PRs and issues have been ignored for years.

It's possible the current directory is wrong for test.yml. I haven't looked into it specifically, but if you try to import ta-lib from inside the ta-lib source directory without having done an in-place build, it will pick up the current-directory version of ta-lib rather than the one that was built and installed, due to shadowing.

AHHHH, you know what, I bet you this is exactly what's happening! pip just changed the behavior for build-in-place too and we update pip as the first step.

mrjbq7 commented 2 years ago

I'd like if there was one spot, but that repo definitely seemed to be abandoned. PRs and issues have been ignored for years.

We made this as a clone of the SourceForge repo; there was some idea of taking over maintenance, but that seemed overly ambitious.

TkTech commented 2 years ago

I'd like if there was one spot, but that repo definitely seemed to be abandoned. PRs and issues have been ignored for years.

We made this as a clone of the SourceForge repo; there was some idea of taking over maintenance, but that seemed overly ambitious.

Ah, I didn't see you on the contributor list when I took a peek. If you have commit access to it, that's great; we can definitely use it instead.

xmatthias commented 2 years ago

Any update on this? It'd be great to have ta-lib release binaries (mainly for Windows), especially since the prior source (https://www.lfd.uci.edu/~gohlke/pythonlibs/#ta-lib) seems to have been archived and hasn't been updated since June.

This makes every ta-lib update a pain for users, since installing on Windows is currently really not fun.

xmatthias commented 2 years ago

I've taken the liberty of playing with this a bit, updating it to the current master and testing the wheels.

The CI runs on current master are available here: tests: https://github.com/xmatthias/ta-lib/actions/runs/3027067796, build: https://github.com/xmatthias/ta-lib/actions/runs/3027067795.

As mentioned above already, the "release" step has to be updated to only run on tags, and an "upload" step that publishes the wheels to PyPI has to be added (if that's something you're interested in in the first place; this could also be done manually initially).

I've manually tested the wheels on Windows and Linux, and can confirm they work fine.

I've added a "test wheels" step to CI too, which uses a clean environment and only installs the wheels, to ensure they work on Windows, Mac and Linux. This will ensure that wheels about to be uploaded are actually built properly (I don't fully trust wheel building, so I prefer to test them individually); it's basically the approach used in the pytables CI as well.

The "manual test" step here involves calling python -m talib.abstract. It'll fail if ta-lib binaries are not properly installed / available. If there's a better test command we can and should obviously update to that - i'd however not run pytest - as test-files are not included in the release.

@mrjbq7 I'd make a PR out of that if you're interested. It'll include the work of this PR, but with the intent to have it merged ASAP, simplifying people's update and installation steps.

mrjbq7 commented 2 years ago

I love this idea. I'm not sure about using this for releases until I understand how to have it release only on tags, and also how to download and test the intermediate artifacts myself.

For a test case, you could do something simple like making a 6-element numpy array, running SMA on it, and asserting the result, probably with python -c.
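A minimal sketch of that idea (the values are illustrative only, and it collapses easily into a single python -c line):

    import numpy as np
    import talib

    # SMA over a window of 3 on a known ramp; the first two outputs are NaN
    out = talib.SMA(np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0]), timeperiod=3)
    assert np.isnan(out[:2]).all()
    assert np.allclose(out[2:], [2.0, 3.0, 4.0, 5.0])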


xmatthias commented 2 years ago

I love this idea. I'm not sure about using this for releases until I understand how to have it release only on tags, and also how to download and test the intermediate artifacts myself.

You can download the artifacts from the build all the way at the bottom (see screenshot).

[screenshot: the build artifacts section at the bottom of the GitHub Actions run page]

It's a zip file, which contains the source tarball (what you release currently) as well as all wheels for the different platforms. The best way to install these is with pip: pip install --no-index --find-links <unzippeddir> TA-Lib. (Please note that the command assumes your env/virtualenv already has the dependencies (mainly numpy) installed, as it specifies --no-index to avoid accidental installation from the package index.)

I'm not sure about using this for releases until I understand how to have it release only on tags

That's no problem, really. You'll always have the option to download the wheels first, inspect/test them, and then push them to PyPI manually (basically with twine upload <wheeldir>). The benefit comes from having all wheels available (also for platforms you don't directly own).

I think users will benefit from this by having the installation simplified to pip install ta-lib (no additional steps prior to that), which will no longer depend on having the C library installed upfront, as well as from a reduced support load, as I noticed many issues opened here are around "failed to install for whatever reason" ...

TkTech commented 2 years ago

I've added a "test wheels" step to CI too, which uses a clean environment and only installs the wheels, to ensure they work on Windows, Mac and Linux.

If you're basing your changes off of mine, that's already what cibuildwheel is doing. It'll build the wheels, copy them into a new env, and run the tests before allowing it to pass.

xmatthias commented 2 years ago

If you're basing your changes off of mine, that's already what cibuildwheel is doing. It'll build the wheels, copy them into a new env, and run the tests before allowing it to pass.

I am aware of that, but I don't fully trust it for libraries like ta-lib (or pytables, for that matter).

The wheels would also work if ta-lib is installed system-wide, reusing the library from some place like /usr/lib or similar. So if the build method doesn't install the C library in the environment but somewhere in the system, it'd be available for the tests, while a "clean" installation on a fresh system would fail.

mrjbq7 commented 2 years ago

Do we want to bundle upstream or download from upstream directly, e.g., ta-lib.org?

Can we get a version of this that does everything but doesn't upload releases, and get that merged?

TkTech commented 2 years ago

You need to bundle upstream - the reason I include a git subtree of my fork of ta-lib is for platform compatibility, and ta-lib.org will never see updates. The fork will build on a Pinebook Pro for example, whereas ta-lib.org will not.

mrjbq7 commented 2 years ago

Want to collaborate in the ta-lib/ta-lib GitHub org? We can push those patches there, and maybe move this to ta-lib/ta-lib-python?

TkTech commented 2 years ago

Want to collaborate in the ta-lib/ta-lib GitHub org? We can push those patches there, and maybe move this to ta-lib/ta-lib-python?

Sure, I have no issue with this. I think we discussed this before and the reason I forked was that the repo seemed inactive.

The wheels would also work if ta-lib is installed system-wide, reusing the library from some place like /usr/lib or similar. So if the build method doesn't install the C library in the environment but somewhere in the system, it'd be available for the tests, while a "clean" installation on a fresh system would fail.

ta-lib should never be used from the system now; we explicitly bundle it. If there's a place where we're accidentally allowing a system ta-lib to get linked in, we've got a bug. There are fixes for 32-bit/64-bit precision and platform support that the archaic system package manager versions will not have.

xmatthias commented 2 years ago

@TkTech What's the actual source for the "inlined" upstream ta-lib repository?

Comparing it to https://github.com/TA-Lib/ta-lib (basically I did a clone and copied the contents of upstream over the clone), I do see quite severe differences, starting with the copyright being from 2007 instead of 2008, which suggests it's missing at least commit 5a833ad57bbedb5fdd3c6fa62c7c25b36e9cf94e (a commit made mid-2008) and potentially everything that went in on top of that.

I've now tried to use the upstream one (branch master); the diff this caused can be found here: https://github.com/xmatthias/ta-lib/commit/8353ef7016cc6ac83f16b91023ffdd31c369e432 (it's currently not building, as I reverted all changes). It seems like the upstream version has more indicators, for example IMI, which is a file that was absent from the inlined source in this PR.

Edit: it seems there's actually been a 0.5.0 release of TA-Lib, and a 0.6.0-dev (which represents the master branch). Which version should be bundled is a decision point ...

TkTech commented 1 year ago

@xmatthias sorry, I missed your reply. The upstream version I used is the official ta-lib release from https://ta-lib.org/. The version on GitHub is an "unofficial" (but de facto official at this point) version by @mrjbq7 and friends (I believe).

trufanov-nok commented 1 year ago

The official https://ta-lib.org/ contains a release of 0.4.0, and the official ta-lib.org project at SourceForge contains an unreleased 0.6.0-dev. It's not about GitHub or unofficial forks in any case.

I didn't read the whole thread; I'm just leaving a note here from my old findings, from my old article about compatibility:

Deviations of TA-Lib 0.6.0-dev from TA-Lib 0.4.0: 0.6.0-dev is more sensitive to small values. TA-Lib uses the TA_IS_ZERO() macro to compare floating-point values to zero. In 0.4.0 it treats every value whose absolute value is smaller than 1e-8 as zero; since this commit, the threshold was decreased to 1e-14. Dozens of technical indicators in TA-Lib make decisions based on the TA_IS_ZERO() result, so you may run into discrepancies in results between TA-Lib 0.4.0 and TA-Lib 0.6.0 (or TA-Lib RT) for such indicators, for example if your input data contains long enough periods where the values almost do not change.

That's the most important (maybe the only one, I don't remember) difference, in my opinion.
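A rough Python illustration of the effect (the real macro lives in the C sources; the names and numbers here are just for illustration):

    # 0.4.0 treats anything with |v| < 1e-8 as zero; 0.6.0-dev uses 1e-14
    def is_zero(v, eps):
        return abs(v) < eps

    delta = 1e-10  # a tiny price change
    print(is_zero(delta, eps=1e-8))   # True  -> 0.4.0 sees no change
    print(is_zero(delta, eps=1e-14))  # False -> 0.6.0-dev still sees a change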

xmatthias commented 1 year ago

I've (successfully) built wheels for all versions (0.5.0, 0.6.0-dev, and 0.4.0, all from GitHub, plus 0.4.0 from the original source).

I also agree with @trufanov-nok that the floating-point fix should be included. The project I maintain has an "install script" for ta-lib which applies this fix for Linux installations to avoid problems. It's been included in the unofficial Windows wheels (linked in the readme), too, which can be tested with the new test case (linked below).

A rough overview of this change can be found in this commit: https://github.com/xmatthias/ta-lib/commit/6452708550cdb30c1c28b30319dce2e244d89cbe (which also adds a test case ensuring the fix is included).

I think which version should be used initially is a decision @mrjbq7 will have to make (please let us know your thoughts).

Personally, I'd recommend going with a 0.4.0 version initially, with the float fix applied (I don't care whether it's the GitHub one or the ta-lib.org one, really). That way it'll be aligned with the current behavior and install instructions as much as possible.

Updates to 0.5.0 or 0.6.0 can then follow as future steps.