We may want to remove scaling entirely, but currently predictions are not rescaled back to the original outcome scale (for any aggregation option) when the forest is trained with scale = True.
A minimal reproducible example follows:
import numpy as np
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.metrics import mean_squared_error
from random_forestry import RandomForest
data = load_iris()
X = pd.DataFrame(data["data"], columns=data["feature_names"])
# Create a RandomForest object
fr = RandomForest(ntree=100, max_depth=5, seed=1, oob_honest=True, scale=True)
# Predict the 1st column (sepal length) from the remaining features
fr.fit(X.iloc[:, 1:], X.iloc[:, 0])
print("Aggregation = average")
print(np.sqrt(mean_squared_error(X.iloc[:, 0], fr.predict(X.iloc[:, 1:], aggregation="average", exact=True))))
# Predictions are not on the scale of the 1st column of iris;
# they should be rescaled (multiplied by the outcome's std and re-centered)
print(fr.predict(X.iloc[:, 1:], aggregation="average", exact=True)[[0, 1, 50, 51, 100, 101]])
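For reference, this is roughly the rescaling step that appears to be missing. The helper below is a hypothetical sketch (not part of the random_forestry API): it maps predictions made on a standardized outcome back to the original scale using the training outcome's mean and standard deviation.

import numpy as np

# Hypothetical helper: undo outcome standardization on a vector of
# predictions, assuming the forest standardized y as (y - mean) / std
# during training. Not part of random_forestry; shown for illustration.
def rescale_predictions(scaled_preds, y_train):
    y_train = np.asarray(y_train, dtype=float)
    center = y_train.mean()
    spread = y_train.std()
    return np.asarray(scaled_preds) * spread + center

# A prediction of 0 on the standardized scale maps back to the mean of y
y_train = np.array([4.0, 5.0, 6.0])
print(rescale_predictions([0.0], y_train))  # [5.]

Applying this transformation to the output of predict (with the training outcome's statistics) should recover predictions on the scale of the original column.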