cstjean / ScikitLearn.jl

Julia implementation of the scikit-learn API https://cstjean.github.io/ScikitLearn.jl/dev/
Other
546 stars 75 forks source link

Adding native julia implementations #48

Open davidbp opened 5 years ago

davidbp commented 5 years ago

Hi,

Do you have any guidelines for adding new estimators?

I have implemented a multiclass perceptron

perceptron implementation

I would like to know how I could adapt it and leave it in this repo.

This is different from the perceptron in scikitlearn (python) since this implementation natively supports multiclass (it's not using a one vs all). The main benefit is that this is faster than the sklearn version.

I looked at the native linear_regression implementation. It seems you solved directly with this line

results = [ones(size(X, 2), 1) X'] \ y'

so this is not done using SGD. Should we implement abstraction to be reused to do partial_fit ?

cstjean commented 5 years ago

Do you have any guidelines for adding new estimators?

If the code depends on significant external packages, then it should be a separate package that imports ScikitLearnBase, but that's not the case with your code. Besides that, it should comply with the scikit-learn API of course, and it should be tested.

I would like to know how I could adapt it and leave it in this repo.

Sure!

so this is not done using SGD. Should we implement abstraction to be reused to fo partial_fit ?

Yes, that would be nice. But aren't there already libraries for doing SGD in Julia? I wouldn't mind adding such a dependency, if it helps to minimize the code we have to maintain.

cstjean commented 5 years ago

Also FYI, I haven't developed this package recently, and some people have reported issues with testing. If you do a PR, just make sure that the relevant tests pass, but don't worry about other issues.