natekupp / ffx

Fast Function Extraction
http://trent.st/ffx

Fails to identify y=x^2 #11

Open jmmcd opened 10 years ago

jmmcd commented 10 years ago

Added a test [5510110672e86ddd5620b432574af6e0c7136a8a] with data y=x^2 for x=[0, 1, 2, 3]. FFX does pretty well, but I expected it to recover the exact relationship. In the 4-base model, there are two x^2 terms. Could this be related to the handling of second-order bases mentioned in #5?

Num bases, Test error (%), Model
0, 62.4453, 3.50
1, 11.4284, 0.640 + 0.817*x^2
2, 1.6635, 0.0846 + 0.972*x^2 + 0.00984*x
4, 0.7507, (0.0973 + 0.523*x^2 + 0.440*x^2) / (1.0 - 0.00214*x - 0.00168*x)

jmmcd commented 10 years ago

Maybe this is not the fault of FFX. The elastic net just doesn't seem to do well at modelling this type of data (no noise, simple input-output relationship). Here is a test:

http://stackoverflow.com/questions/22738879/should-elastic-net-regression-be-able-to-regress-y-x-perfectly
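
For anyone who wants to see the shrinkage without installing anything, here is a minimal sketch using the data from this issue. It assumes a scikit-learn-style elastic net objective and a single centered feature z = x^2, for which the coefficient has a closed form. The function names and the alpha and l1_ratio values are mine, purely for illustration; they are not FFX's actual settings.

```python
# Closed-form elastic net fit for ONE centered feature (illustrative sketch).
# Assumed objective (scikit-learn style):
#   (1/(2n)) * sum((y - w*z)^2) + alpha*l1_ratio*|w| + 0.5*alpha*(1 - l1_ratio)*w^2

def soft_threshold(value, threshold):
    """Shrink value towards zero by threshold (the L1 part of the penalty)."""
    if value > threshold:
        return value - threshold
    if value < -threshold:
        return value + threshold
    return 0.0

def elastic_net_1d(z, y, alpha, l1_ratio=0.5):
    """Return the elastic net coefficient for a single feature z."""
    n = len(z)
    z_mean = sum(z) / n
    y_mean = sum(y) / n
    zc = [v - z_mean for v in z]  # centering stands in for the intercept
    yc = [v - y_mean for v in y]
    cov = sum(a * b for a, b in zip(zc, yc)) / n
    var = sum(a * a for a in zc) / n
    return soft_threshold(cov, alpha * l1_ratio) / (var + alpha * (1 - l1_ratio))

x = [0, 1, 2, 3]
z = [v ** 2 for v in x]  # the base x^2
y = [v ** 2 for v in x]  # target y = x^2, no noise

for alpha in (0.0, 0.1, 1.0):
    print(alpha, elastic_net_1d(z, y, alpha))
```

Under these assumptions the coefficient is exactly 1 only when alpha is 0. Any positive penalty pulls it below 1, so an exact fit is not expected even on noise-free data.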

jmmcd commented 10 years ago

The two x^2 terms seem to come from passing the base x into a model, which allocates it to both the numerator and the denominator. When it comes back, x is collected twice. The call path is:

FFXModelFactory.build -> _basesToModels -> _pathwiseLearn -> _allocateToNumerDenom

jmmcd commented 6 years ago

The failure to find the exact model is expected, but the multiple occurrences of x^2 seem to be a flaw, so I'll leave this issue open but rename it.
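
One possible workaround, sketched here only as an illustration, is to merge like terms after the model comes back. The (coefficient, base) tuple representation and the merge_like_terms helper are hypothetical and are not part of FFX; the numbers are taken from the 4-base model above.

```python
from collections import defaultdict

def merge_like_terms(terms):
    """Sum the coefficients of terms that share the same base string.

    terms is a list of (coefficient, base) pairs; this representation is
    hypothetical and not how FFX stores its models.
    """
    merged = defaultdict(float)
    for coef, base in terms:
        merged[base] += coef
    return dict(merged)

# Duplicated terms from the 4-base model reported in this issue.
numerator = [(0.523, "x^2"), (0.440, "x^2")]
denominator = [(-0.00214, "x"), (-0.00168, "x")]

print(merge_like_terms(numerator))
print(merge_like_terms(denominator))
```

This only tidies the printed model. It does not address why the same base is collected twice in the first place.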