zhaokg / Rbeast

Bayesian Change-Point Detection and Time Series Decomposition

Segmentation fault (core dumped) error for sample sizes over 1000k #6

Open SH20025133 opened 1 year ago

SH20025133 commented 1 year ago

The Python version of the package (and the R version) works fine with smaller sample sizes up to 10k, and even for some series of 100k values. But as I keep increasing the sample size, the error above becomes more and more frequent, to the point where the code doesn't execute at all.

zhaokg commented 1 year ago

Hi there, you definitely caught me. The segmentation fault is expected: when I wrote the code, I intended it for time series that are not too long. Typical sample sizes in the applications I am familiar with are less than a few thousand. The rationale is that running a Bayesian algorithm on a very long time series should be very time-consuming (I guess it may take the BEAST algorithm hours to segment and decompose a time series of length 100k). That said, in some parts of my code I used small integer types (e.g., 8-bit or 16-bit integers such as int8_t or int16_t). I expect these small integers to overflow and cause random segmentation faults when the time series is long.

If you see some potential value in the algorithm and are able to share some sample data, I am happy to test-run it for you and, if needed, change and recompile the code to accommodate large sample sizes. Hopefully this clarifies things.
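
To illustrate the kind of overflow described above, here is a minimal sketch of what happens when a series length is stored in a 16-bit signed integer. It is not taken from the Rbeast source: the helper `store_as_int16` and the sample lengths are hypothetical, and Python's `ctypes.c_int16` stands in for a C `int16_t`.

```python
import ctypes


def store_as_int16(value: int) -> int:
    """Return what a C int16_t would hold after `value` is stored in it.

    Hypothetical helper for illustration only; not part of Rbeast.
    """
    # ctypes does no overflow checking, so the value wraps modulo 2**16
    # and is then reinterpreted as a signed 16-bit number.
    return ctypes.c_int16(value).value


INT16_MAX = 2**15 - 1  # 32767, the largest value an int16_t can hold

# Assumed example lengths, chosen to straddle INT16_MAX.
for n in (1_000, 30_000, 40_000, 100_000, 1_000_000):
    stored = store_as_int16(n)
    status = "ok" if stored == n else "wrapped"
    print(f"length {n:>9,} -> int16 holds {stored:>7,} ({status})")
```

Any length above 32767 comes back as a different, sometimes negative, number. If such a value were then used as an array index or loop bound in C, the resulting out-of-bounds memory access could plausibly surface as a segmentation fault.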

SH20025133 commented 1 year ago

Thank you for the update. To my surprise, BEAST runs quite fast even with large sample sizes like 1000k (in fact, the time it takes is comparable to non-Bayesian approaches like PELT). I have a system trace that records timestamps for requests, on which I'd like to run changepoint analysis (it has a size of roughly 3600k).

I can send you the file over email if required for updating the code.

zhaokg commented 1 year ago

Yes, can you please send me the file over email at zhao.1423@osu.edu? Thanks a lot.

SH20025133 commented 1 year ago

I have reached out to you over email with the sample dataset.