For fast prototyping, a smooth and flexible representation of functions is essential. Traditional approaches using trees or forests for function representation typically result in a piecewise constant output, which is a significant limitation.
To achieve a smoother representation, we propose data augmentation: the input vector of explanatory variables (X_n) is randomly smeared with a user-defined kernel function (W_n), Gaussian by default.
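As a minimal sketch of the smearing step, assuming a Gaussian kernel with a per-feature width, the augmentation could look like the following. The function name `smear_inputs` and the parameters `sigma` and `n_copies` are hypothetical and not part of any existing API.

```python
import numpy as np

def smear_inputs(X, sigma, n_copies=1, rng=None):
    """Return n_copies smeared replicas of X, stacked row-wise.

    X      : (n_samples, n_features) array of explanatory variables
    sigma  : scalar or (n_features,) Gaussian kernel width (assumed parametrisation)
    rng    : seed or numpy Generator, for reproducibility
    """
    rng = np.random.default_rng(rng)
    X = np.asarray(X, dtype=float)
    sigma = np.broadcast_to(np.asarray(sigma, dtype=float), (X.shape[1],))
    # Stack n_copies replicas of the whole array, then add independent noise.
    X_rep = np.tile(X, (n_copies, 1))
    noise = rng.normal(0.0, 1.0, size=X_rep.shape) * sigma
    return X_rep + noise
```

If the replicas are used for training, the targets have to be repeated in the same order, for example with `np.tile(y, n_copies)`. A non-Gaussian kernel would only change how `noise` is drawn.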
Three functionalities should be implemented:
Training Augmentation: Each tree in the forest should be augmented using a random vector (E_n), enhancing the diversity and robustness of the model.
Smoothed Mean: Calculate a weighted mean of the tree outputs in the local neighborhood to produce a smoother result.
Statistical Analysis of Predictions: Provide functionality to calculate various statistics from different tree predictions, including weighted mean, standard deviation, median, and linear fits (possibly enhanced with kernel methods).
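The three functionalities above could be prototyped roughly as follows. This is a sketch under several assumptions, not a proposed final design: the class `SmearedForest` and all of its methods are hypothetical, (E_n) is interpreted as an independent Gaussian shift drawn per tree and per sample, and the smoothed mean is approximated by averaging predictions at randomly smeared query points (a Monte Carlo estimate of the kernel-weighted mean). Linear fits are not covered here.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

class SmearedForest:
    """Hypothetical sketch: each tree sees an independently smeared copy of X."""

    def __init__(self, n_trees=50, sigma=0.1, max_depth=None, seed=0):
        self.n_trees = n_trees
        self.sigma = sigma          # assumed Gaussian kernel width
        self.max_depth = max_depth
        self.rng = np.random.default_rng(seed)
        self.trees = []

    def _shift(self, shape):
        # Random Gaussian displacement, playing the role of E_n.
        return self.rng.normal(0.0, 1.0, size=shape) * self.sigma

    def fit(self, X, y):
        X = np.asarray(X, dtype=float)
        y = np.asarray(y, dtype=float)
        self.trees = []
        for _ in range(self.n_trees):
            # Training augmentation: a fresh random shift for every tree.
            seed = int(self.rng.integers(2**31 - 1))
            tree = DecisionTreeRegressor(max_depth=self.max_depth, random_state=seed)
            tree.fit(X + self._shift(X.shape), y)
            self.trees.append(tree)
        return self

    def tree_predictions(self, X):
        """Per-tree outputs, shape (n_trees, n_samples)."""
        X = np.asarray(X, dtype=float)
        return np.stack([t.predict(X) for t in self.trees])

    def predict_smoothed(self, X, n_smear=20):
        """Smoothed mean: average over trees and over smeared query points."""
        X = np.asarray(X, dtype=float)
        total = np.zeros(X.shape[0])
        for _ in range(n_smear):
            total += self.tree_predictions(X + self._shift(X.shape)).mean(axis=0)
        return total / n_smear

    def predict_stats(self, X, weights=None):
        """Weighted mean, weighted standard deviation and median across trees."""
        preds = self.tree_predictions(X)
        w = np.ones(len(self.trees)) if weights is None else np.asarray(weights, dtype=float)
        mean = np.average(preds, axis=0, weights=w)
        var = np.average((preds - mean) ** 2, axis=0, weights=w)
        return {"mean": mean, "std": np.sqrt(var), "median": np.median(preds, axis=0)}
```

A call such as `SmearedForest(n_trees=10, sigma=0.05).fit(X, y).predict_stats(X_query)` would then return one array per statistic. Plain `DecisionTreeRegressor` instances are used instead of `RandomForestRegressor` so that each tree can receive its own augmented input.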
Note: It is unclear whether scikit-learn trees expose the "box" (the axis-aligned cell in feature space) that defines each leaf. Further investigation is needed to determine whether this aspect is feasible.
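One possible route, not yet verified against the use case here: a fitted scikit-learn tree exposes its split structure through `tree_.feature`, `tree_.threshold`, `tree_.children_left` and `tree_.children_right`, so the box of each leaf can be reconstructed by walking from the root. The helper `leaf_boxes` below is a hypothetical sketch of that idea. As far as I am aware, no single attribute returns the box directly.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

def leaf_boxes(tree, n_features):
    """Map leaf node id -> (lower, upper) bounds, each of shape (n_features,).

    A sample x falls into a leaf when lower < x <= upper holds per feature.
    """
    t = tree.tree_
    boxes = {}
    # Start at the root (node 0) with an unbounded box.
    stack = [(0, np.full(n_features, -np.inf), np.full(n_features, np.inf))]
    while stack:
        node, lo, hi = stack.pop()
        left, right = t.children_left[node], t.children_right[node]
        if left == -1:  # -1 marks a leaf in scikit-learn's tree arrays
            boxes[int(node)] = (lo, hi)
            continue
        f, thr = t.feature[node], t.threshold[node]
        # Samples with x[f] <= thr go left, the others go right.
        hi_left = hi.copy()
        hi_left[f] = min(hi[f], thr)
        lo_right = lo.copy()
        lo_right[f] = max(lo[f], thr)
        stack.append((left, lo, hi_left))
        stack.append((right, lo_right, hi))
    return boxes
```

The leaf index of a sample is available from `tree.apply(X)`, which can be used to look up the box a prediction came from. Boxes at the edge of the training range have infinite bounds in this sketch.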
Augmented Random Forest with Kernel Convolution