lspector / Clojush

The Push programming language and the PushGP genetic programming system implemented in Clojure.
http://hampshire.edu/lspector/push.html
Eclipse Public License 1.0

Interpreter-crucial settings can't be set except through `pushgp` #126

Open Vaguery opened 9 years ago

Vaguery commented 9 years ago

I have a suspicion this is deeply related to the problem I was complaining about in #125:

I'm trying to do some exploratory work with some scripts I've come across (and also some testing), and I realize they're not running very long. As in the number of steps they need to "work" is more than the global setting of @global-evalpush-limit, which is 150 steps (?!).

Now as I understand this, it's a global variable set in pushgp/globals, yes? So to affect the number of steps a program running in an interpreter takes, I need to change the global value beforehand and (assuming I don't want it permanently changed for subsequent runs) set it back manually afterwards?
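To make the "change it, run, set it back" workaround concrete, here is a minimal sketch. It assumes the limit is held in an atom (as the `@` dereference above suggests); the atom is redefined locally as a stand-in, and `call-with-evalpush-limit` is a hypothetical helper, not something in Clojush.

```clojure
;; Stand-in for the real global; assumed to be an atom holding 150,
;; the value mentioned in this thread.
(def global-evalpush-limit (atom 150))

(defn call-with-evalpush-limit
  "Hypothetical helper: calls the zero-argument function f with the
  global limit temporarily set to `limit`, then restores the previous
  value, even if f throws."
  [limit f]
  (let [old @global-evalpush-limit]
    (reset! global-evalpush-limit limit)
    (try
      (f)
      (finally
        (reset! global-evalpush-limit old)))))

;; Usage: the function sees the temporary limit while it runs.
(call-with-evalpush-limit 1000 (fn [] @global-evalpush-limit))
```

Note that this is not safe if several threads evaluate programs at once, since they all share the one atom while the temporary value is in place.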

I'm looking at the list of globals in pushgp/globals.clj (right under the admonition "These definitions are used by Push instructions and therefore must be global") and have done some experiments:

As far as I can tell, the Push interpreter is only slightly intermingled with concerns about GP search, except for the global arguments above. This makes it more difficult to explore different kinds of search without dragging ontological junk over from earlier algorithms. It also makes testing much more difficult, since a lot of dependencies (pushgp as such) come along for the ride when one should only be concerned about instructions, stacks, input/output, tagging and so forth.

More importantly, the ability to parallelize the evaluation of Push programs (the slowest part of genetic programming, after all) is hindered by this leaky barrier between running and searching.

A refactoring path

I'd say this actually trumps the difficulties I noticed with reporting in pushgp. I'd like to use the interpreter on its own for several research-related tasks, and I can't imagine teaching people about GP without better separating the concerns here.

I'd like to try making a push interpreter that is a more self-contained construct, with its own persistent attributes that can be written and read by a mindful creator. I'd want to refactor this apart, not break anything. So it would involve keeping the same arguments there are now, but inserting into the interpreter new functionality that lets one speak directly to it as needed.

In other words, pushgp should see the same interface when getting and setting variables the interpreter should use, but there should be (at least for the interim) a parallel way of talking to a "bare" interpreter directly, without the intermediation of the clojush/globals path.
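Roughly what I have in mind, as a sketch only: the interpreter's settings live in a plain map with defaults, and a caller can override any of them per interpreter without touching global state. Every name here (`default-interpreter-settings`, `make-interpreter`, `run-steps`) is hypothetical, and `run-steps` is a toy stand-in for actual program execution.

```clojure
;; Hypothetical defaults. 150 is the step limit mentioned above;
;; the :pop-when-tagging value is a placeholder, not taken from Clojush.
(def default-interpreter-settings
  {:evalpush-limit   150
   :pop-when-tagging true})

(defn make-interpreter
  "Returns a settings map for a 'bare' interpreter: the defaults,
  with any caller-supplied overrides merged on top."
  ([] (make-interpreter {}))
  ([overrides] (merge default-interpreter-settings overrides)))

(defn run-steps
  "Toy stand-in for execution: reports how many steps of `program`
  (any sequence) would run before the interpreter's own limit stops it."
  [interpreter program]
  (min (count program) (:evalpush-limit interpreter)))

;; Usage: a 500-step program is cut off at the default limit,
;; but runs to completion when the caller raises the limit.
(run-steps (make-interpreter) (range 500))
(run-steps (make-interpreter {:evalpush-limit 1000}) (range 500))
```

Because each interpreter carries its own settings, two of them with different limits can coexist, and pushgp could keep building its interpreter from the existing globals in the interim.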

Does that make sense?

lspector commented 9 years ago

Yes, makes sense. I definitely agree with the goal of separation, and that refactoring is the right step.

global-pop-when-tagging is indeed about the interpreter -- it changes the effect of executing a tagging instruction (whether or not the act of tagging also pops the data item off of the stack on which it was found).

global-tag-limit is an interesting and confusing case. It affects only what tag-related instructions can be generated -- once generated, they will work without accessing the limit. It looks like the global limit is almost never used except in ERCs during code generation... and the functions there can instead take an argument for the limit (using the global only if it's not supplied), and it looks like much (all?) of the code base supplies an explicit argument. Something could definitely be cleaned up here.
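The "take an argument, fall back to the global only if it's not supplied" pattern would look something like the sketch below. The function name `random-tag` and the limit value are illustrative assumptions, not the actual Clojush definitions.

```clojure
;; Stand-in for the real global; the value is a placeholder.
(def global-tag-limit (atom 10000))

(defn random-tag
  "Hypothetical generator: returns a random tag in [0, limit).
  With no argument, falls back to the global limit."
  ([] (random-tag @global-tag-limit))
  ([limit] (rand-int limit)))

;; Usage: callers that pass a limit never touch the global.
(random-tag 5)
(random-tag)
```

If every call site already supplies the limit explicitly, the zero-argument arity (and with it the global) could eventually be dropped.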

Note also, though, that this issue, along with that of #125, shows that the distinction between parameters "about" the interpreter vs parameters "about" search is messy. The messy parts arise only for code that's manipulating code. They arise because we write code that generates and manipulates code when we write a GP system, and because Push programs can also generate and manipulate code. And sometimes (for "autoconstructive evolution") we want the Push programs to be manipulating code also in order to define, essentially, their own genetic programming system. In these cases the code generation/manipulation parameters affect both the interpreter and "search."

Vaguery commented 9 years ago

I think maybe the desirable behavior, insofar as it's "shared", would be to give control of the arguments/parameters to the interpreter (the closest thing to the work at hand), and permit requests from a calling entity to override the defaults temporarily.

I note the same thing will probably have to happen to the instruction list, eventually.

thelmuth commented 9 years ago

FWIW, I think global-parent-selection isn't used anywhere anymore, an artifact of an old problem file that has since been much improved. So, delete it.

thelmuth commented 9 years ago

Anyway, I'm all for a refactoring of the interpreter and globals. I can't tell you the number of times I've gone to run a program without resetting global-evalpush-limit (or even worse and more confusing, global-max-points) and not gotten the results I expected. It's not straightforward, so tread carefully.

Vaguery commented 9 years ago

On that note, it will probably be important to have some midje tests in place before I do anything drastic. Will it be OK for me to add it as a dependency? I assume dependencies are downloaded but only used (in memory) when called?

thelmuth commented 9 years ago

I think it would be fine to add a midje dependency. I'm not sure how they work, but we haven't run into problems with adding dependencies in the past.