Fig 3 shows the results of a simulation analyzing the bias-variance tradeoff for CART with and without HS. Here, data is generated from a linear model, with Gaussian noise added during training (see Appendix S3 for experimental details and additional simulations). While predictive performance curves are often U-shaped because of the bias-variance tradeoff, those for HS are monotonic, since HS effectively reduces variance. The optimal regularization parameter λ decreases with the total number of leaves; this is corroborated by our calculations in Sec 3.
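
To make this kind of experiment concrete, the following is a minimal sketch of a simulation in this spirit, not the setup of Appendix S3. It assumes a single feature, a linear signal y = 2x, Gaussian training noise with standard deviation `sigma`, and test error measured against the noiseless signal; all of these choices are illustrative assumptions. The helper names `fit_tree`, `predict`, and `simulate` are hypothetical. The tree is a simplified greedy CART-style regressor, and HS is applied at prediction time by shrinking each parent-to-child change in the node mean by the factor 1 + λ/N(parent), so that λ = 0 recovers the unshrunk tree.

```python
import random


def fit_tree(xs, ys, depth):
    """Greedy CART-style regression tree on one feature (illustrative only)."""
    n = len(ys)
    node = {"n": n, "mean": sum(ys) / n}
    if depth == 0 or n < 4:  # assumed stopping rule, not from the paper
        return node
    order = sorted(range(n), key=lambda i: xs[i])
    xs_s = [xs[i] for i in order]
    ys_s = [ys[i] for i in order]
    total, left_sum, best = sum(ys_s), 0.0, None
    for k in range(1, n):
        left_sum += ys_s[k - 1]
        if xs_s[k] == xs_s[k - 1]:
            continue  # cannot split between identical feature values
        # Maximizing this score is equivalent to minimizing the squared error.
        score = left_sum ** 2 / k + (total - left_sum) ** 2 / (n - k)
        if best is None or score > best[0]:
            best = (score, k)
    if best is None:
        return node
    k = best[1]
    node["thr"] = (xs_s[k - 1] + xs_s[k]) / 2
    node["left"] = fit_tree(xs_s[:k], ys_s[:k], depth - 1)
    node["right"] = fit_tree(xs_s[k:], ys_s[k:], depth - 1)
    return node


def predict(node, x, lam=0.0):
    """Root-to-leaf prediction; lam > 0 applies hierarchical shrinkage."""
    pred = node["mean"]
    while "thr" in node:
        child = node["left"] if x <= node["thr"] else node["right"]
        # Shrink the change in mean by 1 + lam / N(parent).
        pred += (child["mean"] - node["mean"]) / (1 + lam / node["n"])
        node = child
    return pred


def simulate(seed=0, n_train=200, n_test=500, sigma=1.0, depth=6,
             lams=(0.0, 10.0, 100.0)):
    """Return {lam: test MSE against the noiseless linear signal}."""
    rng = random.Random(seed)
    x_tr = [rng.uniform(-1, 1) for _ in range(n_train)]
    y_tr = [2 * x + rng.gauss(0, sigma) for x in x_tr]  # noise in training only
    x_te = [rng.uniform(-1, 1) for _ in range(n_test)]
    tree = fit_tree(x_tr, y_tr, depth)
    return {
        lam: sum((predict(tree, x, lam) - 2 * x) ** 2 for x in x_te) / n_test
        for lam in lams
    }


if __name__ == "__main__":
    for lam, mse in simulate().items():
        print(f"lambda={lam:g}  test MSE={mse:.4f}")
```

Calling `simulate()` with a range of `depth` values (a proxy for the number of leaves) and a grid of `lams` yields the raw numbers for error curves of the kind discussed above. Averaging over several seeds would be needed before reading anything into the shape of those curves.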