[Closed] calvinmccarter closed this issue 1 week ago
Hello @calvinmccarter, thank you for your question.
Yes, you are exactly right! For all experiments, we factorized the powers of ten out of the metrics so that rows are easier to compare (0.644 --> 6.44). We have edited the paper tables to make that clearer.
Let us know if we can help with anything else!
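To spell out what "factorizing the power of ten out" means here: the table header carries a factor like (x 10^-1), and each cell shows the raw metric divided by that factor. A minimal sketch (the helper name is ours, not from the paper's code):

```python
def rescale(value, exponent):
    """Return the table-cell number after factoring 10**exponent out of value."""
    return value / 10**exponent

# A raw CRPS of 0.644, displayed in a column whose header carries a 10^-1 factor:
cell = rescale(0.644, -1)
print(f"{cell:.2f}")  # 6.44
```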
Thanks so much for the help @ANazaret! I'm now trying to replicate the "Treeffuser (no tuning)" result (see my new gist). I'm getting an even better CRPS ({'crps_100': 0.6524468580538034}) and the same MAE ({'mae': 0.9883351454782684}), but a worse RMSE ({'rmse': 2.1792834439725737}). In case this helps with the discrepancy: in cell 25 of that gist, I'm using the following hyperparameters, based on Appendix C:
{"n_estimators": 3000, "learning_rate": 0.1, "num_leaves": 32, "early_stopping_rounds": 50, "n_repeats": 10}
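For context on the metric being compared: `crps_100` presumably denotes a sample-based CRPS estimated from 100 predictive samples per test point. A minimal sketch of that estimator (our own illustrative helper, not the testbed's implementation), using the standard identity CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|:

```python
def crps_from_samples(samples, y):
    """Monte Carlo CRPS estimate from predictive samples for one observation y.

    Uses CRPS(F, y) = E|X - y| - 0.5 * E|X - X'|, with both expectations
    replaced by averages over the provided samples.
    """
    n = len(samples)
    term1 = sum(abs(x - y) for x in samples) / n
    term2 = sum(abs(a - b) for a in samples for b in samples) / (n * n)
    return term1 - 0.5 * term2

# A perfectly sharp, correct forecast has CRPS 0:
print(crps_from_samples([2.0, 2.0, 2.0], 2.0))  # 0.0
```

The reported number would then be this quantity averaged over all test points, with 100 samples each.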
Thank you for your question, @calvinmccarter!
It looks like the hyperparameters that we used for Treeffuser (no tuning) weren't updated in the paper. You can find the correct hyperparameters under testbed/src/testbed/models/treeffuser.py:
{"n_estimators": 3000, "learning_rate": 0.1, "num_leaves": 31, "early_stopping_rounds": 50, "n_repeats": 30}
We've updated the paper to reflect this—thanks for helping us catch the typo!
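For readers skimming the thread, the two settings differ in only two keys; a quick sketch making the diff explicit:

```python
# Hyperparameters as printed in Appendix C (the typo) vs. the ones actually
# used in testbed/src/testbed/models/treeffuser.py (per this thread).
appendix_c = {"n_estimators": 3000, "learning_rate": 0.1, "num_leaves": 32,
              "early_stopping_rounds": 50, "n_repeats": 10}
corrected = {"n_estimators": 3000, "learning_rate": 0.1, "num_leaves": 31,
             "early_stopping_rounds": 50, "n_repeats": 30}

diff = {k: (appendix_c[k], corrected[k])
        for k in corrected if appendix_c[k] != corrected[k]}
print(diff)  # {'num_leaves': (32, 31), 'n_repeats': (10, 30)}
```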
Thanks for looking into this, @aagrande -- mystery resolved!
Thanks for releasing usable software and the accompanying notebooks. I'm trying to replicate your M5 (Walmart) experiments, and am having difficulty. I do see the following (see gist):
Does this correspond to the reported Treeffuser CRPS of 6.44 in Table 2?