ProtossidoDiAzoto closed this 6 months ago
Thanks for reporting, but I cannot reproduce the issue. Here is the output of your code in jshell:
jshell> smile.regression.RandomForest model = smile.regression.RandomForest.fit(formula, data, 100, 3, 20, 10, 3, 1.0, Arrays.stream(seeds));
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 92.68%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 92.50%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 95.29%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 45.31%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 72.80%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 96.00%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: -52.84%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 79.68%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 84.25%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 80.84%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.74%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 81.39%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 52.97%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 19.96%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.88%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 77.90%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 80.58%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 65.82%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: -0.68%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 85.60%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.41%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 86.14%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.58%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 69.15%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 47.19%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 64.80%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 50.14%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 91.10%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 66.80%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 82.73%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 71.68%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 71.86%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 82.10%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 37.68%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 90.01%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 44.82%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 89.69%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 89.96%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 86.88%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 68.18%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 80.44%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: -67.39%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 83.29%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 68.86%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: -52.48%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 69.99%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 52.94%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 88.30%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 76.42%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 46.09%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 88.84%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.47%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 90.22%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 44.55%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 73.57%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 71.46%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 56.63%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.19%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 85.09%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 48.94%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 59.37%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 86.04%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 91.90%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 77.22%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 68.18%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 72.97%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 42.43%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 88.18%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 89.88%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 75.93%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 63.56%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 82.64%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 79.55%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: -592.92%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: -40.05%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 92.73%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 72.49%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 80.21%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 75.78%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 81.12%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 81.77%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 84.95%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 84.55%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 77.15%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 44.95%
[ForkJoinPool.commonPool-worker-5] INFO smile.regression.RandomForest - Regression tree OOB R2: 81.46%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 42.47%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 79.70%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 76.79%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 16.79%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 13.83%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 89.02%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 83.28%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: -147.08%
[ForkJoinPool.commonPool-worker-4] INFO smile.regression.RandomForest - Regression tree OOB R2: 50.39%
[ForkJoinPool.commonPool-worker-6] INFO smile.regression.RandomForest - Regression tree OOB R2: 30.56%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 94.31%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 91.06%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 58.76%
[main] INFO smile.regression.RandomForest - Regression tree OOB R2: 55.30%
Hi! I ran into a similar issue when trying to upgrade from 2.6.0 to 3.1.0. `predict` requires a DataFrame that contains the predicted variable. The code below works in 2.6.0 but not in 3.1.0.
`
import org.junit.Assert;
import org.junit.Test;
import smile.data.DataFrame;
import smile.data.formula.Formula;
import smile.data.vector.DoubleVector;
import smile.regression.LinearModel;
import smile.regression.OLS;

public class TestSmileRegression {
    @Test
    public void test_formula_OLS() {
        // Training data: y = x
        double[] x = {1, 2, 3};
        double[] y = {1, 2, 3};
        DataFrame df = DataFrame.of(DoubleVector.of("x", x),
                                    DoubleVector.of("y", y));
        LinearModel regr = OLS.fit(Formula.lhs("y"), df);
        // The prediction frame contains only the predictor "x", not the response "y"
        double[] x_pred = {4, 5, 6};
        double[] y_pred = regr.predict(DataFrame.of(DoubleVector.of("x", x_pred)));
        for (int i = 0; i < x_pred.length; i++) {
            Assert.assertEquals(x_pred[i], y_pred[i], 1e-9);
        }
    }
}
`
Exception: `java.lang.IllegalArgumentException: Field y doesn't exist
at smile.data.type.StructType.indexOf(StructType.java:103)
at smile.data.formula.Variable$1.<init>(Variable.java:80)
at smile.data.formula.Variable.bind(Variable.java:78)
at smile.data.formula.Formula.bind(Formula.java:360)
at smile.data.formula.Formula.x(Formula.java:497)
at smile.data.formula.Formula.matrix(Formula.java:546)
at smile.regression.LinearModel.predict(LinearModel.java:358)
at models.TestSmileRegression.test_formula_OLS(TestSmileRegression.java:22)
`
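The failure mode can be illustrated with a minimal, library-free sketch. It assumes, as the stack trace suggests, that binding the formula looks up every variable it names, including the response. `BindSketch`, `indexOf`, `bindAll`, and `withPlaceholder` are hypothetical names used only for illustration; they are not part of Smile and do not reproduce its implementation.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/** Toy model of formula-to-schema binding. NOT Smile's code; all names are hypothetical. */
public class BindSketch {
    /** Stand-in for a schema lookup: throws if the column is absent. */
    public static int indexOf(List<String> schema, String field) {
        int i = schema.indexOf(field);
        if (i < 0) {
            throw new IllegalArgumentException("Field " + field + " doesn't exist");
        }
        return i;
    }

    /** Binds the response first, then each predictor (the assumed failing behaviour). */
    public static int[] bindAll(List<String> schema, String response, List<String> predictors) {
        int[] idx = new int[predictors.size() + 1];
        idx[0] = indexOf(schema, response);
        for (int i = 0; i < predictors.size(); i++) {
            idx[i + 1] = indexOf(schema, predictors.get(i));
        }
        return idx;
    }

    /** Possible workaround: append a placeholder response column if it is missing. */
    public static List<String> withPlaceholder(List<String> schema, String response) {
        List<String> padded = new ArrayList<>(schema);
        if (!padded.contains(response)) {
            padded.add(response);
        }
        return padded;
    }

    public static void main(String[] args) {
        List<String> predictSchema = List.of("x"); // prediction frame without "y"
        try {
            bindAll(predictSchema, "y", List.of("x"));
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
        // With the placeholder column present, binding succeeds.
        int[] idx = bindAll(withPlaceholder(predictSchema, "y"), "y", List.of("x"));
        System.out.println(Arrays.toString(idx));
    }
}
```

If this assumption holds, one way to keep the 2.6.0-style test working on 3.1.0 might be to add a dummy `y` column to the prediction DataFrame, since the placeholder values should not affect the predictions themselves.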
Yes, exactly: "predict requires DataFrame that contains the predicted variable". I worked around the issue last week with the following solution:
@Test
public void tryOutRandomForestArrayData() {
    MathEx.setSeed(19650218);
    RandomForest model = RandomForest.fit(formula, data, 100, 3, 20, 10, 3, 1.0, Arrays.stream(seeds));
    // Schema passed with each tuple; it includes the response field "deflator"
    List<StructField> fields = Arrays.asList(
        new StructField("GNP", DataTypes.DoubleType),
        new StructField("unemployed", DataTypes.DoubleType),
        new StructField("armed_forces", DataTypes.DoubleType),
        new StructField("population", DataTypes.DoubleType),
        new StructField("year", DataTypes.IntegerType),
        new StructField("employed", DataTypes.DoubleType),
        new StructField("deflator", DataTypes.DoubleType)
    );
    StructType st = new StructType(fields);
    // Predict one row at a time, wrapping each row in a Tuple with the schema above
    for (int i = 0; i < x.length; i++) {
        Tuple param = Tuple.of(x[i], st);
        System.out.println(model.predict(param));
    }
}
Describe the bug
I encountered an issue with the prediction method when attempting regression using the Random Forest and Gradient Boost algorithms. The problem arises specifically in versions higher than 3.0.0. In version 2.6.0, this problem does not occur.
Reproduction Steps
1. Use the provided code snippets for setting up the regression.
2. Attempt to run regression using Random Forest or Gradient Boost with versions > 3.0.0.
3. Observe the error message mentioned below.
Code Snippet
Expected behavior
The regression should execute prediction successfully without any errors, similar to the behavior observed in version 2.6.0.
Actual behavior
An IllegalArgumentException is thrown:
Field deflator doesn't exist
java.lang.IllegalArgumentException: Field deflator doesn't exist
at smile.data.type.StructType.indexOf(StructType.java:103)
at smile.data.formula.Variable$1.<init>(Variable.java:80)
at smile.data.formula.Variable.bind(Variable.java:78)
at smile.data.formula.Formula.bind(Formula.java:360)
at smile.data.formula.Formula.x(Formula.java:433)
at smile.regression.RandomForest.predict(RandomForest.java:455)
Additional context
Request for Assistance
Could someone kindly provide insights into what might be causing this error? I'd greatly appreciate any guidance or suggestions for troubleshooting steps.