Closed by tyler-tomita 8 years ago
Revised here
Dear Reviewers,
We sincerely thank you for your feedback and criticism. We have carefully considered your comments and would like to respond to some of the major concerns.
In our view, our contribution is two-fold. First, it is a re-analysis and re-interpretation of oblique decision forests, including, for example, Breiman's Forest-RC (FRC). Second, by virtue of this improved perspective, we provide a number of novel advancements. We give more detail on each of these two points below.
When Breiman introduced FRC in his seminal paper, he concluded: "Overall, it compares more favorably to Adaboost than Forest-RI (FRI)." And yet it is FRI, the axis-aligned counterpart, that has been lauded. In particular, two recent studies (Delgado 2014; Caruana 2008) found FRI to be the best-performing classification method overall among a variety of methods on a variety of benchmark datasets, but neither included FRC in its comparisons. We conjecture that one of the main reasons people focus on FRI rather than FRC is that FRC has an additional hyperparameter to tune, which makes FRC computationally several-fold less tractable than FRI. We therefore formulated a variant with performance similar to FRC but with only one parameter to tune, as in FRI, thereby achieving the best of both worlds.
We have since conducted extensive experiments demonstrating that our RerF and FRC indeed have similar performance properties, though RerF is several-fold faster to tune. We will include both the accuracy and the timing results in the revision.
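For readers less familiar with FRC, the extra tuning burden can be illustrated with a minimal sketch. It assumes Breiman's description of Forest-RC, in which each candidate feature at a node is a sum of `L` randomly chosen inputs with coefficients drawn uniformly from [-1, 1], and `F` such candidates are searched. The function names `sample_projections` and `apply_projections` are hypothetical and are not taken from any implementation of FRC or RerF.

```python
import random

def sample_projections(p, L, F, rng):
    """Draw F candidate oblique features over p inputs.

    Each candidate combines L distinct inputs with coefficients
    drawn uniformly from [-1, 1] (illustrative, not library code).
    """
    projections = []
    for _ in range(F):
        idx = rng.sample(range(p), L)                  # which inputs to combine
        coefs = [rng.uniform(-1.0, 1.0) for _ in idx]  # random weights
        projections.append((idx, coefs))
    return projections

def apply_projections(x, projections):
    """Evaluate every candidate feature on a single sample x."""
    return [sum(c * x[i] for i, c in zip(idx, coefs))
            for idx, coefs in projections]

rng = random.Random(0)
projs = sample_projections(p=5, L=2, F=3, rng=rng)
features = apply_projections([0.5, 1.0, -2.0, 3.0, 0.0], projs)
```

In this sketch both `L` and `F` have to be chosen, whereas an axis-aligned forest only needs the number of candidate features per node.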
Although FRC and RerF outperform FRI under certain assumptions, it is clear that any axis-oblique method will lose one of the most appealing properties of FRI: unit and scale invariance. While Breiman did not propose an approach for mitigating this problem, we have proposed converting the data to ranks as a pre-processing step.
In the revision, we have modified and extended the experiments that transform the data, in particular with regard to scale, to show that our RerF(rank) is significantly more robust to these transformations than RerF and FRC.
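To illustrate why a rank transform restores invariance to units and scale, here is a minimal sketch. It assumes ranks are computed separately for each feature (column) and that ties are broken by original row order; both choices, and the helper names `rank_column` and `rank_transform`, are assumptions for illustration rather than a description of the RerF(rank) implementation.

```python
def rank_column(values):
    """Return 1-based ranks of values; ties are broken by position."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0] * len(values)
    for rank, i in enumerate(order, start=1):
        ranks[i] = rank
    return ranks

def rank_transform(rows):
    """Replace every feature (column) of a row-major dataset by its ranks."""
    columns = list(zip(*rows))
    ranked_columns = [rank_column(list(col)) for col in columns]
    return [list(row) for row in zip(*ranked_columns)]

data = [[1.0, 200.0],
        [3.0, 100.0],
        [2.0, 300.0]]

# Change the units of the second feature only (e.g. metres to millimetres).
rescaled = [[a, b * 1000.0] for a, b in data]

ranks_original = rank_transform(data)
ranks_rescaled = rank_transform(rescaled)
```

Because multiplying a feature by a positive constant preserves the ordering of its values, `ranks_original` and `ranks_rescaled` are identical, so any forest trained on the ranks sees the same input regardless of the units chosen.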
mine is a bit too long, so maybe you can shorten? i like mine better, but i'm not convinced it is. one point: you make comments that will be of general interest in response to 1 of the reviewers, but they should be made to everyone perhaps?