Open Vaguery opened 9 years ago
Yes, makes sense. I definitely agree with the goal of separation, and that refactoring is the right step.
global-pop-when-tagging is indeed about the interpreter -- it changes the effect of executing a tagging instruction (whether or not the act of tagging also pops the data item off of the stack on which it was found).
global-tag-limit is an interesting and confusing case. It affects only what tag-related instructions can be generated -- once generated, they will work without accessing the limit. It looks like the global limit is almost never used except in ERCs during code generation... and the functions there can instead take an argument for the limit (using the global only if it's not supplied), and it looks like much (all?) of the code base supplies an explicit argument. Something could definitely be cleaned up here.
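The "take an argument, use the global only if it's not supplied" pattern could be sketched with a multi-arity function like the one below. This is an illustration only: `random-tag` is a hypothetical name, and the atom here is a local stand-in for the real global, with an assumed value.

```clojure
;; Local stand-in for the global in clojush/globals; the value is an assumption.
(def global-tag-limit (atom 10000))

(defn random-tag
  "Hypothetical generator: returns a tag in [0, limit).
   Falls back to the global only when no limit is passed."
  ([] (random-tag @global-tag-limit))
  ([limit] (rand-int limit)))

;; Explicit argument: the global is never consulted.
(random-tag 100)

;; No argument: the global supplies the limit.
(random-tag)
```

If every caller already passes the limit explicitly, the zero-argument arity (and with it the global) could eventually be dropped.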
Note also, though, that this issue, along with that of #125, shows that the distinction between parameters "about" the interpreter vs parameters "about" search is messy. The messy parts arise only for code that's manipulating code. They arise because we write code that generates and manipulates code when we write a GP system, and because Push programs can also generate and manipulate code. And sometimes (for "autoconstructive evolution") we want the Push programs to be manipulating code also in order to define, essentially, their own genetic programming system. In these cases the code generation/manipulation parameters affect both the interpreter and "search."
I think maybe the desirable behavior, insofar as it's "shared", would be to give control of the arguments/parameters to the interpreter (the closest thing to the work at hand), and permit requests from a calling entity to override the defaults temporarily.
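One way to read "the interpreter owns the defaults, callers override them temporarily" is a plain merge of per-call overrides over a defaults map. The sketch below assumes that reading; the keys, the values other than 150, and the toy `run-steps` function are all hypothetical and are not Clojush's API.

```clojure
;; Hypothetical defaults owned by the interpreter; names and values are illustrative.
(def interpreter-defaults
  {:evalpush-limit 150
   :max-points 100
   :pop-when-tagging true})

(defn effective-settings
  "Merge a caller's overrides over the interpreter's defaults.
   The defaults themselves are never mutated."
  [overrides]
  (merge interpreter-defaults overrides))

(defn run-steps
  "Toy stand-in for running a program: counts steps up to the limit."
  [program overrides]
  (let [{:keys [evalpush-limit]} (effective-settings overrides)]
    (min (count program) evalpush-limit)))

;; Default limit applies.
(run-steps (range 1000) {})

;; The override lasts for this call only.
(run-steps (range 1000) {:evalpush-limit 500})
```

Because the override is just an argument, nothing needs to be reset afterwards.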
I note the same thing will probably have to happen to the instruction list, eventually.
FWIW, I think global-parent-selection isn't used anywhere anymore, an artifact of an old problem file that has since been much improved. So, delete it.
Anyway, I'm all for a refactoring of the interpreter and globals. I can't tell you the number of times I've gone to run a program without resetting global-evalpush-limit (or even worse and more confusing, global-max-points) and not gotten the results I expected. It's not straightforward, so tread carefully.
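Until a refactoring lands, the "forgot to reset the global" failure could be reduced with a small wrapper that restores the previous value automatically. This is a sketch, not existing Clojush code: `call-with-evalpush-limit` is a hypothetical helper, and the atom is a local stand-in using the 150-step value mentioned in this thread.

```clojure
;; Local stand-in for the atom in clojush/globals.
(def global-evalpush-limit (atom 150))

(defn call-with-evalpush-limit
  "Hypothetical helper: sets the global limit, calls f, then restores
   the previous value even if f throws."
  [limit f]
  (let [previous @global-evalpush-limit]
    (reset! global-evalpush-limit limit)
    (try
      (f)
      (finally
        (reset! global-evalpush-limit previous)))))

;; Inside the call the limit is 1000; afterwards it is back to its old value.
(call-with-evalpush-limit 1000 (fn [] @global-evalpush-limit))
```

This only papers over the problem: the atom is still shared, so two threads using the helper at the same time would see each other's limits.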
On that note, it will probably be important to have some midje tests in place before I do anything drastic. Will it be OK for me to add it as a dependency? I assume dependencies are downloaded but only used (in memory) when called?
I think it would be fine to add a midje dependency. I'm not sure how they work, but we haven't run into problems with adding dependencies in the past.
I have a suspicion this is deeply related to the problem I was complaining about in #125:
I'm trying to do some exploratory work with some scripts I've come across (and also some testing), and I realize they're not running very long. As in the number of steps they need to "work" is more than the global setting of @global-evalpush-limit, which is 150 steps (?!).
Now as I understand this, it's a global variable set in pushgp/globals, yes? So to affect the number of steps a program running in an interpreter takes, I need to change the global value beforehand and (assuming I don't want it permanently changed for subsequent runs) set it back manually afterwards?
I'm looking at the list of globals in pushgp/globals.clj (right under the admonition "These definitions are used by Push instructions and therefore must be global") and have done some experiments:
- global-atom-generators is used by code_rand only
- global-max-points says it controls the size of things pushed to stacks, but it is only referenced by :code instructions that return newly constructed values (e.g., code_cons), so it doesn't affect the initial script at all (just checked by hand)
- global-tag-limit: this seems to be about the interpreter, not evolutionary search
- global-top-level-push-code does actually affect the interpreter
- global-top-level-pop-code: ditto
- global-evalpush-limit cannot be set or changed, except as a pushgp global
- global-evalpush-time-limit: ditto
- global-pop-when-tagging: ??? I think this is about the interpreter, not the search?
- global-parent-selection actually does refer to genetic programming
- global-print-behavioral-diversity: ditto
The problem
As far as I can tell, the Push interpreter is only slightly intermingled with concerns about GP search, except for the global arguments above. This makes it more difficult to explore different kinds of search without dragging ontological junk over from earlier algorithms. It also makes testing much more difficult, since a lot of dependencies (pushgp as such) come along for the ride when one should only be concerned about instructions, stacks, input/output, tagging and so forth.
More importantly, the ability to parallelize the evaluation of Push programs (the slowest part of genetic programming, after all) is hindered by this leaky barrier between running and searching.
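To illustrate why the barrier matters for parallel evaluation: if an evaluation depends only on its arguments, it can be mapped over programs with `pmap` without any coordination around shared atoms. The sketch below assumes a toy `run-program` that stands in for a real interpreter call; both function names are hypothetical.

```clojure
(defn run-program
  "Toy stand-in for an interpreter: 'executes' at most :evalpush-limit
   items of the program and returns how many steps it took."
  [config program]
  (min (count program) (:evalpush-limit config)))

(defn evaluate-all
  "Evaluate programs in parallel; each call sees only the config it was given."
  [config programs]
  (doall (pmap (partial run-program config) programs)))

;; Three programs of length 3, 5 and 20, each capped at 5 steps.
(evaluate-all {:evalpush-limit 5} [[1 2 3] (range 5) (range 20)])
```

With settings passed in explicitly, two batches could even run concurrently under different limits, which a single global atom cannot express.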
A refactoring path
I'd say this actually trumps the difficulties I noticed with reporting in pushup. I'd like to use the interpreter on its own for several research-related tasks, and I can't imagine teaching people about GP without better separating the concerns here.
I'd like to try making a push interpreter that is a more self-contained construct, with its own persistent attributes that can be written and read by a mindful creator. I'd want to refactor this apart, not break anything. So it would involve keeping the same arguments there are now, but inserting into the interpreter new functionality that lets one speak directly to it as needed.
In other words, pushgp should see the same interface when getting and setting variables the interpreter should use, but there should be (at least for the interim) a parallel way of talking to a "bare" interpreter directly, without the intermediation of the clojush/globals path.
Does that make sense?
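As a rough sketch of what that parallel path might look like: a "bare" interpreter could be a plain map that carries its own settings, falling back to the existing globals only for values the creator does not supply, so the current pushgp path keeps working unchanged. Everything below is hypothetical (`make-interpreter`, `set-limit`, the keys), and the atoms are local stand-ins for the ones in the globals file with assumed values.

```clojure
;; Local stand-ins for two of the atoms in the globals file.
(def global-evalpush-limit (atom 150))
(def global-pop-when-tagging (atom true))

(defn make-interpreter
  "Hypothetical constructor for a 'bare' interpreter: a plain map that
   carries its own settings. Unspecified settings are snapshotted from
   the globals at construction time."
  ([] (make-interpreter {}))
  ([settings]
   (merge {:evalpush-limit @global-evalpush-limit
           :pop-when-tagging @global-pop-when-tagging}
          settings)))

(defn set-limit
  "Returns a new interpreter with a different step limit."
  [interpreter limit]
  (assoc interpreter :evalpush-limit limit))

;; Legacy behaviour: settings come from the globals.
(make-interpreter)

;; Direct path: the creator speaks to the interpreter, globals untouched.
(set-limit (make-interpreter) 2000)
```

Since the map is immutable, reading and writing its attributes never affects other interpreters or the globals.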