Closed svbrodersen closed 6 months ago
The squared error criterion seems to be performing quite a bit slower than sklearn's. After looking into it a bit, it turns out sklearn never recomputes the actual squared error once the running sums for a split are available. Instead, it evaluates a cheaper proxy:
" The MSE proxy is derived from

    sum_{i left}(y_i - y_pred_L)^2 + sum_{i right}(y_i - y_pred_R)^2
    = sum(y_i^2) - n_L * mean_{i left}(y_i)^2 - n_R * mean_{i right}(y_i)^2

Neglecting constant terms, this gives:

    - 1/n_L * (sum_{i left} y_i)^2 - 1/n_R * (sum_{i right} y_i)^2
"
The split that maximizes this proxy also maximizes the impurity improvement of the squared error (equivalently, minimizes the total squared error), since sum(y_i^2) is constant over all candidate splits.
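A minimal sketch of the idea (not sklearn's actual Cython implementation, and the function names here are made up for illustration): because sum(y_i^2) is the same for every candidate split, minimizing the full squared error is equivalent to maximizing (sum_L y_i)^2 / n_L + (sum_R y_i)^2 / n_R, which only needs the running sums of y:

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.normal(size=50)


def sse(v):
    # Full sum of squared errors around the mean.
    return float(np.sum((v - v.mean()) ** 2))


def full_score(y, i):
    # Total squared error of splitting at position i (lower is better).
    return sse(y[:i]) + sse(y[i:])


def proxy_score(y, i):
    # Proxy: (sum_L)^2 / n_L + (sum_R)^2 / n_R (higher is better).
    # Equals sum(y^2) - full_score(y, i), so the ranking is identical.
    left, right = y[:i], y[i:]
    return left.sum() ** 2 / len(left) + right.sum() ** 2 / len(right)


splits = range(1, len(y))
best_full = min(splits, key=lambda i: full_score(y, i))
best_proxy = max(splits, key=lambda i: proxy_score(y, i))
assert best_full == best_proxy
```

The speedup comes from the proxy only needing the left/right sums of y (and counts), which can be updated in O(1) as the split point slides, instead of recomputing means and squared deviations for each candidate split.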