twopirllc / pandas-ta

Technical Analysis Indicators - Pandas TA is an easy to use Python 3 Pandas Extension with 150+ Indicators
https://twopirllc.github.io/pandas-ta/
MIT License

GPU integration with cuDF #375

Closed krazykoder closed 3 years ago

krazykoder commented 3 years ago

Hello @twopirllc, I have been an ardent user of the pandas-ta library for over two years, and I really appreciate your work and that of all the open source contributors to this project.

Recently I have been trying to write some computation code on the GPU for ranking and sorting operations using CUDA DataFrames (cuDF). https://github.com/rapidsai/cudf/

It seems like a lot of the computation can be accelerated using the GPU. However, NVIDIA's cuDF library does not have all of the low-level math functions that pandas implements; moving averages, for example.
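For context, here is a minimal sketch of what a moving average looks like on the GPU with cuDF, assuming a release where the pandas-style rolling API is available (how much of pandas is covered varies by version, and some operations such as ewm have historically lagged behind):

```python
# Minimal sketch: simple moving average on the GPU with cuDF.
# Assumes a cuDF release that exposes the pandas-style rolling API;
# coverage of pandas operations varies by version.
import cudf

gdf = cudf.DataFrame({"close": [10.0, 11.0, 12.0, 11.5, 12.5, 13.0]})  # toy data
gdf["sma_3"] = gdf["close"].rolling(window=3).mean()  # runs on the GPU
print(gdf)
```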

NVIDIA has a gQuant plugin library with some CUDA finance indicators that I was trying to use. https://github.com/NVIDIA/fsi-samples/tree/e3a5e46a029789e340daa552ffe2adf55d60b547/gQuant/plugins/gquant_plugin/greenflow_gquant_plugin/cuindicator

I tried to port pandas-ta to the GPU using this, but it seems I don't have enough skill to rewrite the library. Is there any future plan to implement a GPU version of pandas-ta where applicable? I understand that not all functions or mathematical operations can be accelerated.

Appreciate your comments. -towshif

twopirllc commented 3 years ago

Hello @krazykoder,

> I have been an ardent user of the pandas-ta library for over two years, and I really appreciate your work and that of all the open source contributors to this project.

Great! Thanks!

> I tried to port pandas-ta to the GPU using this, but it seems I don't have enough skill to rewrite the library.

Yes, it is harder than it appears, especially with all the variations of an indicator depending on the platform or library they use. Users demand speed and consistency, and that has been, and will continue to be, a frequent software management issue for a Technical Analysis library, as I am experiencing now.

> Is there any future plan to implement a GPU version of pandas-ta where applicable?

Not likely in the near future. 😞

As discussed with xmatthias from freqtrade in https://github.com/twopirllc/pandas-ta/issues/367#issuecomment-901361995, there could be a host of issues, and those can be rather difficult for a maintainer to resolve. However, one can make an independent library that uses Pandas TA as a fallback, as Matthias recommends (see the sketch below). But for me to dedicate serious time to starting another independent TA library would require some financial support, given that I barely have the free time to build and maintain Pandas TA as it is.
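To illustrate the fallback idea, a minimal sketch, assuming cuDF is optionally installed; the sma wrapper here is a hypothetical illustration, not an API of Pandas TA or cuDF:

```python
# Sketch of the fallback pattern: compute on the GPU when cuDF is
# available, otherwise fall back to Pandas TA on the CPU.
import pandas as pd
import pandas_ta as ta

try:
    import cudf
    HAS_CUDF = True
except ImportError:
    HAS_CUDF = False

def sma(close: pd.Series, length: int = 10) -> pd.Series:
    """Hypothetical wrapper: simple moving average, GPU-first."""
    if HAS_CUDF:
        gpu_close = cudf.from_pandas(close)                # host -> device copy
        return gpu_close.rolling(window=length).mean().to_pandas()
    return ta.sma(close, length=length)                    # CPU fallback
```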

Apologies if this is not what you hoped to hear.

Kind Regards, KJ

rahimkhoja commented 4 months ago

I started on this. It's not done, and I won't have much time to work on it. I got GroqCloud to do much of the conversion. Many of the files have been successfully converted, but others need some work. You need to look in the 'enhanced-test' branch. The bigger files are the problem. It probably won't be too hard to run GroqCloud on those files individually to update them.

https://github.com/ImprobabilityLabs/pandas-ta-cudf/tree/enhanced-test (I'm thinking it should be called 'CuDF-TA'.)

This is the tool I made to do the conversion. (You need to have git set up with SSH access on the machine you use it on.) https://github.com/ImprobabilityLabs/repo-enhance

GroqCloud (Llama 3 70B) is free right now, FYI.

This is only going to be useful for datasets with millions of rows. To test it, set up a Linux box with a 1080 Ti or newer; that should support CUDA well enough to use cuDF. (A quick smoke test is sketched below.)
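For a quick smoke test of such a box, something like this should exercise cuDF on a large series, assuming a working CUDA driver and cuDF install:

```python
# Smoke test: build a multi-million-row series on the GPU and run a
# rolling mean over it. Assumes cuDF and a CUDA-capable GPU are present.
import numpy as np
import cudf

n = 5_000_000                               # large enough for the GPU to matter
gs = cudf.Series(np.random.rand(n))         # host data copied to the device
print(gs.rolling(window=50).mean().tail())  # computed on the GPU
```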

@twopirllc I would be happy to move this into your repo. Perhaps you all could look it over when I get farther along.

krazykoder commented 2 months ago

@rahimkhoja did you see a performance boost in general? I did some investigation on this back in 2022 after I posted my request. My understanding was that a GPU workload could make the calculation faster overall for a huge series (>1M data points); however, the overhead of copying data to the GPU won't be worth it for smaller datasets, which are typically around 5,000 to 10,000 time series points. My guess is that a CPU-only C/C++ or Rust implementation, ported to Python through this library only for the slow algorithms in the benchmark, would be worth trying. But I did read some material suggesting that if the data needs to be grouped, segmented, or reorganized, a GPU DataFrame would be lightning fast for real-time analysis use cases. (A rough timing sketch of the copy-overhead point follows.)
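A rough way to see the transfer-overhead tradeoff, assuming cuDF is installed; this is wall-clock timing only, so treat the numbers as indicative (kernel launches can be asynchronous):

```python
# Rough benchmark: pandas rolling mean on the CPU vs. cuDF on the GPU,
# with the host -> device copy counted against the GPU. For ~10k rows
# the copy tends to dominate; for millions of rows the GPU tends to win.
import time
import numpy as np
import pandas as pd
import cudf

def bench(n: int, window: int = 50) -> None:
    s = pd.Series(np.random.rand(n))

    t0 = time.perf_counter()
    s.rolling(window=window).mean()
    cpu = time.perf_counter() - t0

    t0 = time.perf_counter()
    cudf.from_pandas(s).rolling(window=window).mean()  # includes the copy
    gpu = time.perf_counter() - t0

    print(f"n={n:>10,}  pandas={cpu:.4f}s  cudf(incl. copy)={gpu:.4f}s")

for n in (10_000, 1_000_000, 10_000_000):
    bench(n)
```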