haifengl / smile

Statistical Machine Intelligence & Learning Engine
https://haifengl.github.io

IllegalArgumentException: Field coalProduction doesn't exist #781

Closed byTianHai closed 4 months ago

byTianHai commented 4 months ago

Describe the bug

Exception in thread "main" java.lang.IllegalArgumentException: Field coalProduction doesn't exist
    at smile.data.type.StructType.indexOf(StructType.java:103)
    at smile.data.formula.Variable$1.<init>(Variable.java:80)
    at smile.data.formula.Variable.bind(Variable.java:78)
    at smile.data.formula.Formula.lambda$bind$12(Formula.java:348)
    at smile.data.formula.Formula.bind(Formula.java:349)
    at smile.regression.OLS.fit(OLS.java:121)
    at smile.regression.OLS.fit(OLS.java:106)
    at smile.regression.OLS.fit(OLS.java:85)
    at com.smile.LinerR.main(LinerR.java:31)

Expected behavior

I want to run a multi-target regression analysis with Smile by fitting one linear regression per target, binding each target variable to the feature variable in turn. The feature variable is coalProduction and the target variables are the columns listed in targetColumns. My source code is below.

Actual behavior

OLS.fit(formula, dataSplits) throws an exception: IllegalArgumentException: Field coalProduction doesn't exist. I checked that the data file is read without errors and that it contains the feature column coalProduction, so I don't understand this error.

Code snippet

DataFrame df = Read.csv("coalta.csv");
String[] targetColumns = {"drainage", "industrialUsage", "domesticUsage", "waterTreatment", "waterStorage", "discharge"};
StringBuilder result = new StringBuilder();
for (String targetColumn : targetColumns) {
    Formula formula = Formula.of(targetColumn, "coalProduction");
    LinearModel ols = OLS.fit(formula, df);
    result.append("Model for ").append(targetColumn).append(":\n")
          .append("Coefficients: ").append(Arrays.toString(ols.coefficients())).append("\n")
          .append("Intercept: ").append(ols.intercept()).append("\n\n");
}
System.out.println(result.toString());

Input data

coalProduction,drainage,industrialUsage,domesticUsage,waterTreatment,waterStorage,discharge
1000,200,150,80,60,70,90
1010,202,152,82,62,72,92
1020,205,155,85,64,74,94
1030,208,158,88,66,76,96
1040,210,160,90,68,78,98
1050,213,162,92,70,80,100
1060,215,165,94,72,82,102
1070,218,168,96,74,84,104
1080,220,170,98,76,86,106

Additional context

haifengl commented 4 months ago

Set header=true in the format parameter of Read.csv, or use Read.data(path, "header=true").
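
One plausible reading of this answer: unless the reader is told that the file has a header, the first line of coalta.csv is parsed as a data row, so no column is ever named coalProduction and the name lookup in StructType.indexOf fails. The stdlib-only Java sketch below mimics that behaviour. It does not use Smile, and HeaderLookup, columnNames, indexOf and the V1, V2, ... placeholder names are hypothetical stand-ins chosen for illustration, not the library's actual code.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

public class HeaderLookup {
    // Derive column names from the first line of a CSV file.
    // With hasHeader=false the first line is treated as data and the columns get
    // placeholder names V1, V2, ... (an assumed naming scheme, for illustration only).
    static List<String> columnNames(List<String> lines, boolean hasHeader) {
        String[] first = lines.get(0).split(",");
        List<String> names = new ArrayList<>();
        for (int i = 0; i < first.length; i++) {
            names.add(hasHeader ? first[i].trim() : "V" + (i + 1));
        }
        return names;
    }

    // Look a field up by name, failing with the same message as the stack trace.
    static int indexOf(List<String> names, String field) {
        int index = names.indexOf(field);
        if (index < 0) {
            throw new IllegalArgumentException("Field " + field + " doesn't exist");
        }
        return index;
    }

    public static void main(String[] args) {
        List<String> lines = Arrays.asList(
            "coalProduction,drainage,industrialUsage",
            "1000,200,150");

        // Header honoured: the name resolves to the first column.
        System.out.println(indexOf(columnNames(lines, true), "coalProduction"));

        // Header ignored: only V1, V2, V3 exist, so the lookup throws.
        try {
            indexOf(columnNames(lines, false), "coalProduction");
        } catch (IllegalArgumentException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

If this reading is right, the loop in the original snippet can stay unchanged once the DataFrame is loaded with header=true. Printing the loaded column names before fitting is a quick way to confirm that coalProduction is actually present.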