gyrdym / ml_algo

Machine learning algorithms in Dart programming language
https://gyrdym.github.io/ml_algo/
BSD 2-Clause "Simplified" License

Examples of configuration for LinearRegressor? #180

Closed chaschev closed 3 years ago

chaschev commented 3 years ago

Hey,

Thanks a lot for the library. I'm really impressed with how much you can do with Dart!

While trying to run a linear regression for a simple line y(x) = x, I found the following issues, which I suppose are due to the configuration of the regressor. Please help me configure it.

The code below gives my expected result in most cases, with k around 1.00. However, in some cases, e.g.

a=1 n=10 -> k (0.9994153380393982) rows ((9.994153022766113))
a=0 n=10 -> k (0.3038938045501709) rows ((3.038938045501709))
a=-10 n=10 -> k (0.5980027318000793) rows ((5.980027198791504))
a=1 n=100 -> k (NaN) rows ((0.0))  

the result is different. Is this because of the configuration?

Also, is there a way to retrieve b from y(x) = kx + b?

Thank you!

import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';
import 'package:xrange/xrange.dart';

main() {
 var a = 1;
 var n = 100;

 var _data = NumRange.closed(a, n).values().map((it) => [it, it]) ;

 final data = [['x', 'y'], ..._data];

 print(data);

 final samples = DataFrame(data, headerExists: true);
 final regressor = LinearRegressor(samples, 'y');

 // Predict y for x = 10; only the feature column is passed in.
 var prediction = regressor.predict(DataFrame([['x'], [10.0]]));

 print("a=$a n=$n -> k ${regressor.coefficients} rows ${prediction.rows}");
}

gyrdym commented 3 years ago

@chaschev Hi, thank you very much for creating the issue! I'm sorry for such a long delay in answering; apparently something is wrong with my GitHub notifications, and I didn't even notice your question. I'll take a look at the problem you ran into.

Regarding the b term: it has its own coefficient, and since the coefficients field is public, you can easily access it. It is always the very first value (regressor.coefficients[0]). As for the initial value of b, you configure it yourself: see the interceptScale parameter of the LinearRegressor constructor, which is 1 by default. To get the full term, multiply the coefficient mentioned above by the interceptScale value.

gyrdym commented 3 years ago

@chaschev and one more thing: you need to specify the 'fitIntercept' parameter in order to include the 'b' term in the equation.
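
Putting the two replies above together, reading b might be sketched as follows. This assumes the named parameters mentioned in this thread (fitIntercept, interceptScale, initialLearningRate) and that the intercept coefficient comes first, as stated above. The helper name estimateIntercept and the sample data y = 2x + 3 are made up for illustration, and how close the estimate gets to 3 depends on the optimizer settings of your ml_algo version.

```dart
import 'package:ml_algo/ml_algo.dart';
import 'package:ml_dataframe/ml_dataframe.dart';

// Hypothetical helper (not part of ml_algo): fits a line with an intercept
// and returns the estimated b term of y(x) = kx + b.
double estimateIntercept(DataFrame samples, String targetName,
    {double interceptScale = 1.0}) {
  final regressor = LinearRegressor(
    samples,
    targetName,
    fitIntercept: true, // include the b term in the model
    interceptScale: interceptScale,
    initialLearningRate: 1e-4, // the value suggested later in this thread
  );

  // Per the reply above: the first coefficient belongs to the intercept,
  // and the full term is that coefficient times interceptScale.
  return regressor.coefficients[0] * interceptScale;
}

void main() {
  // Illustrative data only: points on the line y = 2x + 3.
  final data = <List<dynamic>>[
    ['x', 'y'],
    for (var x = 1; x <= 10; x++) [x, 2 * x + 3],
  ];
  final samples = DataFrame(data, headerExists: true);

  print('b is approximately ${estimateIntercept(samples, 'y')}');
}
```

The remaining coefficients (here a single one for x) play the role of k.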

chaschev commented 3 years ago

@gyrdym Thank you for replying. It looks like it's only a small issue with the regressor.

gyrdym commented 3 years ago

@chaschev hi, just for your information: I haven't forgotten about the issue, but I'm currently working on null-safety for the ml_algo library. It is quite a big task, and I need some time to complete it.

gyrdym commented 3 years ago

@chaschev the problem was with the initialLearningRate parameter. Its default value is too high for this data; a better value for your example is 1e-4: LinearRegressor(samples, 'y', initialLearningRate: 1e-4);
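
To illustrate why the learning rate matters here, below is a minimal, self-contained sketch of plain batch gradient descent on y = kx for x in 1..100. It is a hand-rolled illustration, not ml_algo's actual optimizer; the function name fitSlope and the rate 1e-3 used for comparison are assumptions. For this sketch, each step multiplies the error (k - 1) by (1 - 2 * learningRate * mean(x^2)), and mean(x^2) is 3383.5 for x in 1..100, so the rate has to stay below roughly 3e-4 for the fit to converge.

```dart
// Plain batch gradient descent for the model y = k * x (no intercept),
// minimizing the mean squared error. Hand-rolled for illustration only.
double fitSlope(List<double> xs, List<double> ys,
    {required double learningRate, int iterations = 100}) {
  var k = 0.0;
  for (var i = 0; i < iterations; i++) {
    // Gradient of mean((k * x - y)^2) with respect to k.
    var gradient = 0.0;
    for (var j = 0; j < xs.length; j++) {
      gradient += 2 * (k * xs[j] - ys[j]) * xs[j];
    }
    gradient /= xs.length;
    k -= learningRate * gradient;
  }
  return k;
}

void main() {
  final xs = [for (var i = 1; i <= 100; i++) i.toDouble()];
  final ys = List<double>.of(xs); // y(x) = x, so the true slope is 1

  // Small enough step: the estimate settles near 1.
  print(fitSlope(xs, ys, learningRate: 1e-4));

  // Step too large for inputs of this magnitude: each update overshoots
  // and the estimate grows without bound instead of converging.
  print(fitSlope(xs, ys, learningRate: 1e-3));
}
```

With n=10 the inputs are much smaller, so a larger rate can still converge, which is consistent with the a=1 n=10 case above working while a=1 n=100 did not.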