biolab / orange3-educational

🍊 🎓 Educational widgets for machine learning and data mining in Orange 3.

By default, Polynomial Regression wrongly connects to Test and Score #149

Closed wvdvegte closed 1 year ago

wvdvegte commented 2 years ago
Educational version

0.5.0

Orange version

3.32.0

Expected behavior

Like other regression widgets, Polynomial Regression should by default send its Learner output to the Learner input of the Test and Score widget.

Actual behavior

By default, it sends Coefficients to the Test Data input

Steps to reproduce the behavior

Connect the input of Polynomial Regression to a suitable dataset, and its output to Test and Score

janezd commented 2 years ago

This is trivial to fix, but I encountered a far worse problem: what you get on the output is not a learner with polynomial expansion, but a linear regression and, to make it even worse, without intercept. We never expected somebody would use this output. This widget was intended to be used to show linear regression in a plot, to show the effect of polynomial expansion, overfitting, regularization ...

I think it would be possible to provide a proper learner on the output, though. I'll try.
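To illustrate the difference described above outside Orange, here is a minimal NumPy sketch. It is not the widget's code: the synthetic data and the helper names `polynomial_expansion` and `fit_least_squares` are assumptions made for illustration. It fits the same quadratic data once with a single coefficient on x (linear regression without intercept) and once with a degree-2 expansion that includes the constant column.

```python
import numpy as np

def polynomial_expansion(x, degree):
    """Return the columns x^0, x^1, ..., x^degree for a 1-D array x."""
    return np.vander(x, degree + 1, increasing=True)

def fit_least_squares(design, y):
    """Ordinary least squares; returns the coefficient vector."""
    coefficients, *_ = np.linalg.lstsq(design, y, rcond=None)
    return coefficients

# Synthetic data (an assumption for illustration): y = 2 + 3x + 0.5x^2, no noise.
x = np.linspace(-3.0, 3.0, 25)
y = 2.0 + 3.0 * x + 0.5 * x ** 2

# A plain linear model without intercept: one coefficient on x only.
linear_design = x.reshape(-1, 1)
linear_coef = fit_least_squares(linear_design, y)
linear_rmse = np.sqrt(np.mean((linear_design @ linear_coef - y) ** 2))

# A model fitted on the polynomial expansion, constant column included.
poly_design = polynomial_expansion(x, degree=2)
poly_coef = fit_least_squares(poly_design, y)
poly_rmse = np.sqrt(np.mean((poly_design @ poly_coef - y) ** 2))

print("linear, no intercept: coef", linear_coef, "RMSE", linear_rmse)
print("degree-2 expansion:   coef", poly_coef, "RMSE", poly_rmse)
```

On data like this, the expanded model recovers the generating coefficients, while the intercept-free linear model cannot represent the constant or the curvature, so a cross-validation score for it would look much worse.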

wvdvegte commented 2 years ago

@janezd, in the attached example, polynomial regression of 2nd degree shows the same results in Test & Score as Curve Fit to p1 + p2*weight + p3*weight**2. This is as would be expected. Also, the coefficients output provides the expected number of coefficients, including the intercept. Does that mean that Curve Fit is also broken? Anyway, isn't the Polynomial Regression widget somewhat superfluous now that the same (and more) can be done with Curve Fit? Its only added value lies in the display of the regression curve (which is why you created it as an educational widget, I guess).

polynomial regression test.ows.zip
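The expected agreement between the two approaches can be checked with a small NumPy sketch. This is an illustration only, not what either widget runs internally: the `weight` and `target` arrays are hypothetical stand-ins for the workflow's data. Because the expression p1 + p2*weight + p3*weight**2 is linear in its parameters, fitting it by least squares is the same problem as a degree-2 polynomial fit.

```python
import numpy as np
from numpy.polynomial import polynomial as P

# Hypothetical data standing in for the workflow's "weight" feature and target.
rng = np.random.default_rng(0)
weight = rng.uniform(40.0, 100.0, size=50)
target = 5.0 - 0.2 * weight + 0.01 * weight ** 2 + rng.normal(0.0, 0.5, size=50)

# Curve-fit view: minimise the squared error of p1 + p2*weight + p3*weight**2.
# The expression is linear in p1, p2, p3, so ordinary least squares solves it.
design = np.column_stack([np.ones_like(weight), weight, weight ** 2])
p, *_ = np.linalg.lstsq(design, target, rcond=None)

# Polynomial-regression view: a degree-2 polynomial fit in weight.
poly_coef = P.polyfit(weight, target, deg=2)  # ordered from x^0 upwards

print("p1, p2, p3:      ", p)
print("polynomial coefs:", poly_coef)
print("same predictions:", np.allclose(design @ p, P.polyval(weight, poly_coef)))
```

An iterative fitter may stop at slightly different parameter values, but for a well-posed problem like this one it should converge to the same least-squares solution.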

janezd commented 2 years ago

The bug was introduced two months ago, in #136, so the stable version still works properly. Thanks for your workflow: I'll use it to re-check my fixes.

Yes, Polynomial Regression is an educational tool. It is much simpler than Curve Fit (both in usage and in code) and more visual. The same goes, for instance, for k-means in the Educational add-on.

wvdvegte commented 2 years ago

While you're at it, I just noticed that Polynomial Regression also connects wrongly to Explain Model, Explain Prediction and Explain Predictions.

EKal-aa commented 2 years ago

Hi there, I just looked into it because of a question from Wilfred on Discord. I found that if you activate "Fit intercept" in the linear regression widget, no value for the intercept is shown in the coefficients output, and if you connect the model output to a predictions widget, the predicted values don't match the target very well. But if you deactivate "Fit intercept" in the linear regression widget, a value for the intercept is shown in the coefficients output, and if you connect the model output to a predictions widget, the predicted values match the target very well!

The output is a learner with linear regression (the same as the input learner), but the output data is also changed. The polynomial expansion (of degree 3 in the example flow below) produces 4 features out of the one feature x: the powers x^0, x^1, x^2 and x^3.

With these expanded features, the linear regression learner is correct. The "polynomial degree" setting in the polynomial regression widget doesn't affect the learner, but it does affect the data.

The feature x^0 is always present, regardless of the "Fit intercept" property of the linear regression widget. So, if "Fit intercept" is activated in the linear regression widget, I guess a second column of ones is created for the intercept somewhere in the code. In the polynomial regression widget, there is a "Fit intercept" property, too, but it is always activated.
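What a second column of ones would do to the fit can be sketched with NumPy. This only illustrates the guess above and does not reproduce Orange's code: the data is synthetic, and prepending a ones column is a hypothetical stand-in for a "fit intercept" step applied on top of an expansion that already contains x^0.

```python
import numpy as np

x = np.linspace(0.0, 2.0, 20)
y = 1.0 + 2.0 * x - 0.5 * x ** 3  # synthetic target, an assumption

# Degree-3 expansion of one feature: columns x^0, x^1, x^2, x^3.
expanded = np.vander(x, 4, increasing=True)

# Hypothetical "fit intercept" step: prepend another column of ones.
with_intercept = np.column_stack([np.ones_like(x), expanded])

print("columns:", with_intercept.shape[1])
print("rank:   ", np.linalg.matrix_rank(with_intercept, tol=1e-8))

# With two identical columns the constant term is not uniquely determined.
# lstsq returns the minimum-norm solution, which shares the constant between
# the two columns; the predictions themselves are unaffected.
coef, *_ = np.linalg.lstsq(with_intercept, y, rcond=1e-10)
print("intercept column:", coef[0], " x^0 column:", coef[1])
print("max abs error:", np.max(np.abs(with_intercept @ coef - y)))
```

The design matrix has one more column than its rank, so only the sum of the two constant coefficients is identified. How a particular implementation reports the intercept in that situation depends on its solver and on any regularization, which this sketch does not model.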

polynomial-regression-test.zip