TDAmeritrade / stumpy

STUMPY is a powerful and scalable Python library for modern time series analysis
https://stumpy.readthedocs.io/en/latest/

Add GPU-MSTUMP: Multi-dimensional STUMP on GPUs #90

Open siskos32 opened 4 years ago

siskos32 commented 4 years ago

Is there a way to analyze multi-dimensional time series data on a GPU instead of using the Dask distributed MSTUMPED?

seanlaw commented 4 years ago

Currently, that feature is not available as there are significant differences between how the code is implemented for CPUs vs GPUs. Our goal was to get one-dimensional GPU-STUMP working first and then to learn from that experience. Next, we'd explore implementing parallel GPU-STUMP (i.e., multiple GPUs on the same server but for one-dimensional data). And then, finally, we'd consider implementing GPU-MSTUMP(ED). Unfortunately, moving things over from CPUs to GPUs is non-trivial and we are trying to maintain the readability/maintainability of the code, even at the cost of reduced performance, whenever possible. We certainly welcome any PRs (with help and guidance)!

Out of curiosity:

  1. What is your use case?
  2. What is your data size (i.e., how many dimensions and how many data points are there for each time series)?
  3. Have you already tried MSTUMP or MSTUMPED?

I will certainly think about this, as it is certainly within the scope of the project.
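
For reference, here is roughly how the existing CPU and Dask-distributed entry points are invoked today (the array shape and window size below are purely illustrative):

```python
import numpy as np
import stumpy

# Illustrative multi-dimensional time series: d dimensions x n data points
T = np.random.rand(3, 10_000)  # e.g., 3 dimensions with 10,000 points each
m = 50                         # subsequence window size

# CPU-based multi-dimensional matrix profile
P, I = stumpy.mstump(T, m)

# Dask-distributed version (requires a running Dask client)
# from dask.distributed import Client
# with Client() as dask_client:
#     P, I = stumpy.mstumped(dask_client, T, m)
```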

seanlaw commented 4 years ago

There is an mSTOMP-GPU implementation that we might be able to learn from.

seanlaw commented 2 years ago

After comparing the MSTUMP code with GPU_STUMP, I think there is a path forward. Specifically, in _mstump we have:

https://github.com/TDAmeritrade/stumpy/blob/955ab965c62d93c98cef4136f1a684529b747f01/stumpy/mstump.py#L809-L826

So, for GPU_MSTUMP, we'll need to replace all of the functions used in this section with their equivalent GPU-based versions so that all computations happen on the GPU. However, we will still need to synchronize via the CPU, and this is handled by the outermost for-loop.
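
To make that structure concrete, here is a very rough, hypothetical sketch using Numba's CUDA support. It is not the actual implementation: the names `_gpu_multi_distance_profile` and `_gpu_mstump_sketch` are made up, it uses plain (non-z-normalized) Euclidean distances for brevity, and it omits the exclusion zone, `include`/`discords` handling, and the other details that the real `_mstump` takes care of. The point is only to show which work lands on the GPU and where the CPU-side synchronization happens:

```python
import math

import numpy as np
from numba import cuda


@cuda.jit
def _gpu_multi_distance_profile(query_idx, T, m, D):
    # Hypothetical kernel: one thread per (dimension, subsequence) pair.
    # For brevity this computes plain Euclidean distances; the real code
    # would use z-normalized distances with precomputed rolling means/stddevs.
    dim, j = cuda.grid(2)
    if dim < D.shape[0] and j < D.shape[1]:
        dist_sq = 0.0
        for k in range(m):
            diff = T[dim, query_idx + k] - T[dim, j + k]
            dist_sq += diff * diff
        D[dim, j] = math.sqrt(dist_sq)


def _gpu_mstump_sketch(T, m):
    d, n = T.shape
    l = n - m + 1
    P = np.full((d, l), np.inf)
    I = np.full((d, l), -1, dtype=np.int64)

    device_T = cuda.to_device(np.ascontiguousarray(T))
    device_D = cuda.device_array((d, l))

    threads_per_block = (4, 128)
    blocks_per_grid = (
        (d + threads_per_block[0] - 1) // threads_per_block[0],
        (l + threads_per_block[1] - 1) // threads_per_block[1],
    )

    # The outermost for-loop stays on the CPU and acts as the synchronization
    # point; the per-query distance profiles are computed on the GPU.
    for idx in range(l):
        _gpu_multi_distance_profile[blocks_per_grid, threads_per_block](
            idx, device_T, m, device_D
        )
        D = device_D.copy_to_host()      # implicit stream synchronization
        D.sort(axis=0)                   # sort distances across dimensions
        D_prime = np.cumsum(D, axis=0)   # running sums of the k+1 smallest
        for k in range(d):
            min_idx = np.argmin(D_prime[k])
            P[k, idx] = D_prime[k, min_idx] / (k + 1)
            I[k, idx] = min_idx

    return P, I
```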

seanlaw commented 2 years ago

We may be able to learn some things from this recent paper, which explores MSTOMP on GPUs:

Exploiting_Reduced_Precision_for_GPU-based_Time_Series_Mining-2.pdf