NicMcPhee / XO-bias-study

Results and write-up of our genetic programming crossover bias study
MIT License

Try to generate some simple generalization data #44

Open NicMcPhee opened 9 years ago

NicMcPhee commented 9 years ago

I don't know if we'll have time for this, but it would be really nice to try to get some simple generalization information on, for example, the Pagie-1 problem and maybe something like US Change. Does XO bias affect generalization? It would be very nice to know, and it's entirely possible that it does Very Bad Things on generalization. It's also the kind of thing that a reviewer might legitimately ding us on.

We do have the (LISP-like) trees from the runs, and we could use something like Clojure to evaluate those on some new points fairly easily if someone wants to do a little programming.
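Since the saved trees are LISP-like s-expressions, Clojure can read and evaluate them directly. A minimal sketch of that idea, assuming Koza-style definitions of the protected operators `rlog` and `%` (the exact definitions used in our runs may differ):

```clojure
;; Protected operators, assumed to follow the usual Koza-style
;; definitions: rlog(0) = 0, division by zero = 0.
(defn rlog [x] (if (zero? x) 0.0 (Math/log (Math/abs (double x)))))
(defn % [a b] (if (zero? b) 0.0 (/ (double a) b)))

(defn tree->fn
  "Turn a saved tree string like \"(+ (rlog x1) x2)\" into a
  callable function of x1 and x2."
  [tree-str]
  (eval (read-string (str "(fn [x1 x2] " tree-str ")"))))

;; Example: ((tree->fn "(+ (rlog x1) x2)") 1.0 2.0) => 2.0,
;; since rlog(1) = 0.
```

With something like this in hand, evaluating the saved winners on fresh points is just a map over the new inputs.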

NicMcPhee commented 9 years ago

This came up at today's research meeting at Hampshire, and I did a little work on it afterwards. There's now a Pagie_1_generalization.clj file in the GDrive results folder that has some Clojure code for computing validation error on an (evolved) function.

I then went and pulled up the most fit Pagie-1 run using the koza2 function set, Tarpeian bloat control, full bias, large population, 0.1% elitism, and binary tournaments. I chose this because all those runs were approximations (the best had 568 hits out of a possible 676, and there were no exact matches for the koza2 function set), but had quite low training error. This most fit individual, for example, had a training error (sum of absolute error over 676 test cases) of 3.611926, which is an average error of about 0.0053 per test case, which is pretty darn close on average.

I then generated 676 random points in the range of the test values ([-5, 5]x[-5, 5]) and computed the total absolute error on those points. It varies depending on the specific random numbers chosen, but it tends to be in the 50-ish range. That's quite a bit higher (more than a factor of 10) than the training error, but still a quite low per-test average error of around 0.07. So if we needed an absolutely exact match, we definitely didn't get that. But if we wanted a good approximation that generalizes reasonably to unseen data, I'd say we're good.
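The validation computation above can be sketched as follows, assuming the standard Pagie-1 target f(x, y) = 1/(1 + x^-4) + 1/(1 + y^-4) (the 676 training cases being the usual 26x26 grid over [-5, 5]x[-5, 5] with step 0.4); `validation-error` is a hypothetical helper, not the code in `Pagie_1_generalization.clj`:

```clojure
;; Standard Pagie-1 target function.
(defn pagie-1 [x y]
  (+ (/ 1.0 (+ 1.0 (Math/pow x -4)))
     (/ 1.0 (+ 1.0 (Math/pow y -4)))))

;; Uniform random point in [-5, 5] x [-5, 5].
(defn rand-point []
  [(- (rand 10.0) 5.0) (- (rand 10.0) 5.0)])

(defn validation-error
  "Total absolute error of evolved function f on n random points."
  [f n]
  (reduce + (for [[x y] (repeatedly n rand-point)]
              (Math/abs (- (double (f x y)) (pagie-1 x y))))))

;; e.g. (validation-error out33 676) for the evolved individual below.
```

For the winner discussed above this total tends to land around 50, i.e. roughly 0.07 per point.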

This is just for the winner from one run from one set of parameter choices, but it's consistent with a little poking I did on generalization for this problem earlier, so I'm not overly worried.

Do we want to collect data from some other runs? It wouldn't be all that hard, but it would take a few hours. Or do we just want to mention this in passing (maybe even a footnote)? Or not bother mentioning it at all?

NicMcPhee commented 9 years ago

For the record, the evolved individual in question is

(defn out33 [x1 x2]
  (+ (Math/sin (+ (Math/sin (Math/sin (Math/sin (rlog x1))))
                  (Math/cos (Math/cos (% (rlog x1)
                                         (+ (Math/sin (Math/sin (rlog x1)))
                                            (Math/cos (Math/cos (rlog x1)))))))))
     (Math/sin (+ (Math/sin (Math/sin (rlog x2)))
                  (Math/cos (Math/cos (Math/sin (% (rlog x2)
                                                   (rlog (+ x2 x2))))))))))

Given that the target is 1/(1 + x1^-4) + 1/(1 + x2^-4), it's pretty crazy that this pile of sines and cosines and logs gets anywhere close!
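For anyone who wants to run `out33` directly, it only needs the two protected operators from the koza2 function set. These are Koza-style definitions, which is an assumption; the exact definitions in our runs may differ:

```clojure
;; Assumed Koza-style protected operators: rlog(0) = 0 and
;; rlog(x) = log|x| otherwise; % is division that returns 0
;; when the denominator is 0.
(defn rlog [x] (if (zero? x) 0.0 (Math/log (Math/abs (double x)))))
(defn % [a b] (if (zero? b) 0.0 (/ (double a) b)))

;; With these in scope, out33 as printed above evaluates directly;
;; (out33 1.0 1.0) comes out around 1.03, close to the Pagie-1
;; target value of 1.0 at that point.
```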