vcerqueira / performance_estimation

Performance estimation for time series forecasting tasks

Missing data #1

Open gu-stat opened 3 years ago

gu-stat commented 3 years ago

Hello,

I could not find some of the datasets that you load in your code.

In particular, I'm looking at your synthetic datasets (I didn't check if there were issues with the RWTS).

In result-analysis-synthetic.r, you have:

load("final_results_synthetic_ts1_lasso.rdata")
load("final_results_synthetic_ts2_lasso.rdata")
load("final_results_synthetic_ts3_lasso.rdata")
load("final_results_synthetic_ts1_rbr.rdata")
load("final_results_synthetic_ts2_rbr.rdata")
load("final_results_synthetic_ts3_rbr.rdata")
load("final_results_synthetic_ts1_rf.rdata")
load("final_results_synthetic_ts2_rf.rdata")
load("final_results_synthetic_ts3_rf.rdata")

Of those, I could only find the code that creates the last one ("final_results_synthetic_ts3_rf.rdata"). That code is in perfestimation-synthetic.r.

Using this file, if I want to obtain the remaining datasets (for instance, those related to TS1), should I just change embedded_time_series <- synthetic$TS3 to embedded_time_series <- synthetic$TS1, and set the predictive_algorithm argument inside the workflow function to "rf", "rbr", or "lasso" as needed?

Thanks.

vcerqueira commented 3 years ago

Hi gu-stat,

Thanks for bringing this to my attention. I've added a folder named synthetic_data_generation, which contains the scripts necessary to create all three synthetic scenarios. Beware that the process is random: unfortunately, I did not set a seed, so regenerated series will not match the ones I used exactly.
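For anyone regenerating the data, a minimal sketch of how fixing a seed makes a random simulation repeatable. The function simulate_series is a hypothetical stand-in (the real generators are in simulate-ts.r), and the seed value 123 is arbitrary; it does not reproduce the original results.

```r
# Hypothetical stand-in for the simulation step; the real code lives in simulate-ts.r.
simulate_series <- function(n) {
  # AR(1) process drawn with base R's arima.sim (stats package)
  as.numeric(arima.sim(model = list(ar = 0.7), n = n))
}

set.seed(123)  # arbitrary seed chosen for this sketch
first_run <- simulate_series(100)

set.seed(123)  # resetting the same seed replays the same random draws
second_run <- simulate_series(100)

# Both runs contain exactly the same values
identical(first_run, second_run)
```

Calling set.seed once at the top of creating-synthetic-ds.r would have the same effect for the whole script, as long as the script is run from start to finish in one session.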

Does this solve your problem?

gu-stat commented 3 years ago

Hello,

No problem. Thank you for your quick reply!

However, the two files that you added, creating-synthetic-ds.r and simulate-ts.r, are pretty much the same as the ones already inside the original 'synthetic' folder (I say pretty much because the 'new' creating-synthetic-ds.r does not have the code that creates plots, which the 'old' one has).

The "problem" seems to be with the file perfestimation-synthetic.r. I think that this file might be incomplete: it does not generate "final_results_synthetic_ts1_rf.rdata" and "final_results_synthetic_ts2_rf.rdata", but only "final_results_synthetic_ts3_rf.rdata".

More specifically, in perfestimation-synthetic.r you use load("data/synthetic.rdata") (which has TS1, TS2, and TS3, and was generated from creating-synthetic-ds.r) to create save(final_results, file = "final_results_synthetic_ts3_rf.rdata"), but the same is not done for the first two synthetic cases (TS1, TS2); i.e., there's no save(final_results, file = "final_results_synthetic_ts1_rf.rdata").

The only call to synthetic$TS1 that I could find was in plot_df(synthetic$TS1[[180]]$target), inside the 'old' creating-synthetic-ds.r. Thus, I could not find the code that creates "final_results_synthetic_ts1_rf.rdata".

My guess is that to get this rdata, I'd only need to change embedded_time_series <- synthetic$TS3 inside perfestimation-synthetic.r to embedded_time_series <- synthetic$TS1, then create the final_results object, and run save(final_results, file = "final_results_synthetic_ts1_rf.rdata").

If I want to get the rdata "final_results_synthetic_ts3_rbr.rdata", I'd have to change the argument predictive_algorithm = "rf" inside the workflow function (still using the same file, perfestimation-synthetic.r) to predictive_algorithm = "rbr". Then the object final_results would have the errors using TS3 and RBR_loss.

Is that correct?

Thank you very much.

vcerqueira commented 3 years ago

Hi gu-stat,

You are absolutely right. I should have done that programmatically with a loop, but I got lazy.
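A rough sketch of what such a loop could look like, so that all nine result files come from one run. This is not the code in the repository: run_estimation is a hypothetical placeholder for the estimation step in perfestimation-synthetic.r, the toy synthetic list only imitates the assumed shape of data/synthetic.rdata (a list with TS1, TS2, and TS3), and files are written to a temporary directory.

```r
# Toy stand-in for the object stored in data/synthetic.rdata
# (assumed shape: a list with elements TS1, TS2, TS3).
synthetic <- list(TS1 = list(1:10), TS2 = list(11:20), TS3 = list(21:30))

# Hypothetical placeholder for the estimation step in perfestimation-synthetic.r;
# the real script calls workflow(...) with a predictive_algorithm argument.
run_estimation <- function(embedded_time_series, predictive_algorithm) {
  list(n_series = length(embedded_time_series), algorithm = predictive_algorithm)
}

scenarios  <- c("TS1", "TS2", "TS3")
algorithms <- c("lasso", "rbr", "rf")
out_dir    <- tempdir()  # scratch directory used only in this sketch

saved_files <- character(0)
for (scenario in scenarios) {
  for (algorithm in algorithms) {
    # Same steps as the manual edit: pick the scenario, pick the learner, save.
    final_results <- run_estimation(synthetic[[scenario]], algorithm)
    out_file <- file.path(
      out_dir,
      sprintf("final_results_synthetic_%s_%s.rdata", tolower(scenario), algorithm)
    )
    save(final_results, file = out_file)
    saved_files <- c(saved_files, out_file)
  }
}

# File names follow the pattern expected by result-analysis-synthetic.r
basename(saved_files)
```

Building the file name from the scenario and algorithm names keeps the outputs aligned with the load() calls in result-analysis-synthetic.r and avoids editing the script by hand for each combination.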

Thank you for noticing this. I will leave a comment regarding this and update the code when I can.

Cheers