scikit-learn-contrib / py-earth

A Python implementation of Jerome Friedman's Multivariate Adaptive Regression Splines
http://contrib.scikit-learn.org/py-earth/
BSD 3-Clause "New" or "Revised" License
455 stars 122 forks

init parameter check_every does not work #130

Open yc19920125 opened 7 years ago

yc19920125 commented 7 years ago

Hi, I am trying to use the check_every parameter to control my fitting time, but changing it in the constructor makes no difference. Is check_every still useful?

jcrudy commented 7 years ago

I would say it's not very useful for performance purposes. If you're looking for speed, I would suggest using the use_fast option.

yc19920125 commented 7 years ago

Thanks for your answer! That raises another question: what exactly does check_every do? The fitting results (both prediction accuracy and basis functions) are always the same no matter what value I give it.

jcrudy commented 7 years ago

I think if you set check_every sufficiently high, you should eventually see a worse fit (and maybe a very slight speedup). I think it speaks to the usefulness of check_every that I actually had to look in the code to make sure it's even still implemented. All it does is reduce the number of candidate knot locations during knot search. One reason to do this would be to make sure you don't end up with knots too close together. However, the minspan argument takes care of that in a much nicer way. I think I originally added check_every to duplicate the behavior of the R earth package, which implements minspan the way py-earth implements check_every (or at least that was my understanding at the time). Basically, I'd say there's no good reason to use check_every at this point, unless you want to make your model worse for some reason (testing or something).
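
To make the "fewer candidate knot locations" point concrete, here is a toy sketch in plain Python. It is not py-earth's actual knot search: the helper names candidate_knots and best_hinge_knot are hypothetical, and the assumption that thinning means "keep every check_every-th sorted value" is a simplification for illustration only. It fits a single hinge max(0, x - t) by least squares and shows how a coarser candidate grid can miss the best knot.

```python
def candidate_knots(values, check_every=1):
    """Keep every `check_every`-th sorted unique value as a knot candidate.

    Hypothetical helper: a simplified stand-in for candidate thinning.
    """
    if check_every < 1:
        raise ValueError("check_every must be >= 1")
    ordered = sorted(set(values))
    return ordered[::check_every]


def hinge_sse(x, y, knot):
    """Sum of squared errors of the least-squares fit y ~ a + b * max(0, x - knot)."""
    h = [max(0.0, xi - knot) for xi in x]
    n = len(x)
    h_mean = sum(h) / n
    y_mean = sum(y) / n
    s_hh = sum((hi - h_mean) ** 2 for hi in h)
    s_yy = sum((yi - y_mean) ** 2 for yi in y)
    s_hy = sum((hi - h_mean) * (yi - y_mean) for hi, yi in zip(h, y))
    if s_hh == 0.0:
        # The hinge is constant on this data, so only the intercept is fitted.
        return s_yy
    return s_yy - s_hy ** 2 / s_hh


def best_hinge_knot(x, y, check_every=1):
    """Return (knot, sse) for the best candidate knot on the thinned grid."""
    candidates = candidate_knots(x, check_every)
    return min(((t, hinge_sse(x, y, t)) for t in candidates), key=lambda pair: pair[1])


# Toy data with a true knot at x = 7.
x = [float(i) for i in range(20)]
y = [max(0.0, xi - 7.0) for xi in x]

fine_knot, fine_sse = best_hinge_knot(x, y, check_every=1)      # all 20 candidates
coarse_knot, coarse_sse = best_hinge_knot(x, y, check_every=5)  # candidates 0, 5, 10, 15
print(fine_knot, fine_sse)
print(coarse_knot, coarse_sse)
```

With the full grid the search can land exactly on the true knot. With check_every=5 the true knot is not among the candidates, so the best available fit is worse. On real data with many samples per feature, neighbouring candidates are usually close enough that moderate values change little, which would be consistent with the identical results reported above.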