janezd / baycomp

MIT License
69 stars, 15 forks

Correct comparison method - need advice #15

Closed Arturus closed 1 year ago

Arturus commented 1 year ago

Hello, could you please recommend the right comparison method for my problem? I have N time series and predict the K (usually K=4) last observations of each series during cross-validation (one predicted observation per fold). Specifics: (a) this is time-series walk-forward validation, closer in spirit to leave-one-out; (b) this is a regression problem. In the end I have K*N scores. Each time series has a different magnitude of forecasting errors/scores due to a different amount of noise in the data.

Which comparison method should I use? What comes to mind:

  1. Treat all timeseries as a single dataset and use two_on_single() with vectors of K*N length and runs=1 (or runs=K?)
  2. Use two_on_multiple() with vectors of length N, each item in vector is average of K folds
  3. Use two_on_multiple() in hierarchical mode and pass matrices of (N,K) size and runs=1

#1 seems to be a bad choice because of the different magnitude of scores across series (the resulting distribution of scores is heavy-tailed); #3 seems optimal but slow; #2 is a much faster but less precise alternative to #3. Are my conclusions correct?
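For concreteness, the three options above shape the same scores differently before handing them to baycomp. A minimal sketch of the shaping step (N, K and the random scores are made-up toy values, not from the actual experiment):

```python
# How the K*N scores would be arranged for each of the three options.
import numpy as np

rng = np.random.default_rng(0)
N, K = 30, 4                        # N time series, K folds each (toy values)
scores = rng.normal(1.0, 0.3, size=(N, K))

# Option 1: flatten everything into a single K*N vector, as if it
# were one dataset (for two_on_single).
flat = scores.ravel()               # shape (N*K,)

# Option 2: average over the K folds -> one score per time series
# (for the non-hierarchical two_on_multiple / signed-rank test).
per_series = scores.mean(axis=1)    # shape (N,)

# Option 3: keep the full (N, K) matrix of per-fold scores
# (for the hierarchical test).
print(flat.shape, per_series.shape, scores.shape)  # (120,) (30,) (30, 4)
```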

gcorani commented 1 year ago

Hi Arthur, I would compute the average on each time series and then compare the two (K x 1) vectors via the Bayesian signed-rank test; this is your option #2.

best, Giorgio


Arturus commented 1 year ago

Gcorani, thank you. Did you mean two (N x 1) vectors (N is the number of time series)?

gcorani commented 1 year ago

exactly


Arturus commented 1 year ago

Ok, thank you again!