zhaokg / Rbeast

Bayesian Change-Point Detection and Time Series Decomposition

Convergence of the results. #19

Open aleslimacastro opened 9 months ago

aleslimacastro commented 9 months ago

It is well known that results from a system based on Monte Carlo sampling depend on the initial random seed, and to my understanding the Rbeast package shares this characteristic. However, my results vary significantly from run to run, even after increasing the number of samples to 32,000. I would greatly appreciate your help in finding a way to adjust the parameters so that the results converge. I am running the program on a computer with an i7 processor and 8 GB of memory under Ubuntu Linux 22.04, using the Python or Matlab environments. I am applying Rbeast to remove the seasonality from a weekly time series. The series has 978 observations, and I used the following call: beast(series, start=2005.01644, deltat=1/52, period=1, mcmc_samples=32000). Thank you for your attention.
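One way to put a number on "varying significantly" is to rerun the fit under several seeds and measure how far the outputs disagree at each time step. The sketch below is only an illustration: `seed_spread` and `toy_run` are hypothetical helpers, and `toy_run` is a stand-in sampler rather than Rbeast itself. With Rbeast, `run` could wrap a call that passes the seed through (the Python `beast` function is believed to accept an `mcmc_seed` argument, but that should be checked against the installed version).

```python
import random
import statistics


def seed_spread(run, seeds):
    """Run `run(seed)` once per seed and summarize how much the outputs disagree.

    `run` must return a fixed-length sequence of floats (for example a fitted
    seasonal component). Returns the largest per-time-step standard deviation
    across seeds.
    """
    results = [list(run(seed)) for seed in seeds]
    per_step = zip(*results)  # regroup the values by time step
    return max(statistics.stdev(values) for values in per_step)


def toy_run(seed, n_samples=2000, n_steps=5):
    """Stand-in for a sampler: a noisy Monte Carlo mean at each time step."""
    rng = random.Random(seed)
    return [
        statistics.fmean(rng.gauss(step, 1.0) for _ in range(n_samples))
        for step in range(n_steps)
    ]


spread = seed_spread(toy_run, seeds=range(1, 6))
print(f"largest across-seed standard deviation: {spread:.4f}")
```

Comparing this spread before and after a parameter change gives a concrete measure of whether the change actually made the results more stable.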

dirt commented 9 months ago

Thanks a lot for sharing. For the same amount of computation, it is better to use more chains than longer chains. Can you play around with the mcmc_chains parameter to see if the result improves? A lack of convergence is not necessarily a bad thing; it says something about the inconsistency between the data and the model. I am writing from my cell phone and will explain more later. Meanwhile, if you share your data with me, I can take a look and see where we can improve the fitting. Thanks again.
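The suggestion to trade chain length for chain count at a fixed cost could be tried along these lines. This is a minimal sketch under stated assumptions: `equal_budget_configs` and `run_beast` are hypothetical helpers, `mcmc_samples` is assumed to count samples per chain, the 96,000 total assumes the original run used three chains of 32,000, and `mcmc_seed` is assumed to be accepted by the Python `beast` function. None of these points is confirmed in this thread.

```python
def equal_budget_configs(total_samples, chain_counts):
    """Split a fixed sampling budget across different numbers of chains."""
    configs = []
    for chains in chain_counts:
        per_chain = total_samples // chains  # assumed: mcmc_samples is per chain
        configs.append({"mcmc_chains": chains, "mcmc_samples": per_chain})
    return configs


def run_beast(series, config, seed=1):
    """Fit with the settings quoted in the issue plus one budget config.

    Rbeast is imported lazily so that the budget helper above can be used
    without the package installed. `mcmc_seed` is an assumed argument name.
    """
    import Rbeast as rb

    return rb.beast(series, start=2005.01644, deltat=1 / 52, period=1,
                    mcmc_seed=seed, **config)


configs = equal_budget_configs(total_samples=96000, chain_counts=[3, 6, 12])
for cfg in configs:
    print(cfg)
```

Running `run_beast` once per config and comparing the fitted components shows whether more, shorter chains give more stable results than fewer, longer ones on this series.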

aleslimacastro commented 9 months ago

Thank you for the prompt response. Here is the link to the data: https://drive.google.com/file/d/12PGyJwWV6zNSzb_nWOmVEOC0Jy6aR_0N/view?usp=sharing

This dataset contains weekly time series observations of the marginal operation cost for one of the subsystems in the Brazilian electricity market. In the CSV file, the first column holds the dates in "%Y-%m-%d" format, and the second column holds the values. Because the dataset contains null values, I added a constant to the data and then applied a logarithmic transformation. I haven't explored the other parameters of the Markov chain yet, but I intend to run some tests.
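The preprocessing described above might look roughly like the following sketch. The helper names, the offset of 1.0, and the two sample rows are all made up for illustration; the thread states neither the constant that was added nor the first date of the series. The date 2005-01-07 was chosen only because, under a days-since-January-1 convention, it maps to a decimal year close to the `start` value quoted earlier in the issue.

```python
import csv
import io
import math
from datetime import date


def fractional_year(d):
    """Convert a date to a decimal year, counting days from January 1."""
    start = date(d.year, 1, 1)
    days_in_year = (date(d.year + 1, 1, 1) - start).days
    return d.year + (d - start).days / days_in_year


def load_log_series(handle, offset=1.0):
    """Read (date, value) rows and return the dates and log(value + offset)."""
    dates, values = [], []
    for row in csv.reader(handle):
        try:
            d = date.fromisoformat(row[0].strip())
            v = float(row[1])
        except (ValueError, IndexError):
            continue  # skip a header line or a malformed row
        dates.append(d)
        values.append(math.log(v + offset))  # offset keeps true zeros finite
    return dates, values


# Made-up sample rows in the two-column layout described above.
sample = io.StringIO("date,value\n2005-01-07,0.0\n2005-01-14,18.33\n")
dates, y = load_log_series(sample)
print(fractional_year(dates[0]), y)
```

The choice of offset changes the shape of the transformed series near zero, so it is worth checking whether the decomposition is sensitive to it.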

dirt commented 9 months ago

Thanks a lot. May I ask for a further clarification? Are the zeros true values (i.e., zero marginal operation cost) or missing values (e.g., NaN)?

Kaiguang

aleslimacastro commented 9 months ago

Hi Kaiguang,

The zeros within the time series are true values; there are no missing values in this series.

Alessandro