sleyzerzon / soar

Automatically exported from code.google.com/p/soar

thoughts on refining support calculations #38

Open GoogleCodeExporter opened 8 years ago

GoogleCodeExporter commented 8 years ago
Description from Bob Marinier, 2008-08-24 14:21:27
In o-support mode 4 (the default), support for a wme is determined like 
this:

If a rule does not test a selected operator on the LHS, then all wmes it
creates get i-support.  If a rule does test a selected operator on the LHS,
then all wmes it creates get o-support, unless the wmes it creates are on
the operator itself, in which case they get i-support (because they are
elaborating the operator).
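
For illustration (the rule below is invented for this summary, not taken
from any agent), a typical operator application falls under the o-support
case:

sp {apply-move
(state <s> ^operator <o>)
(<o> ^name move ^destination <d>)
-->
(<s> ^location <d>)  # created on the state, so o-support
}

The rule tests the selected operator and creates a wme on the state, so the
wme gets o-support; if the action were on <o> instead, it would get
i-support as an operator elaboration.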

There are a couple problems with this:

1) An operator elaboration only gets i-support if it is directly on the
operator.  Substructure further down does not get i-support.  This often
leads to rules that have mixed support (and which should then default to
i-support), like this:

sp {mixed-support
(state <s> ^operator <o>)
-->
(<o> ^i-supported.o-supported mixed)
}

The attributes of the wmes on the RHS above say what support they should
get according to the o-support mode 4 rules.  But the way support
calculations work, all wmes created by a production must have the same
support, so we get a warning message every time this rule fires saying
that it's getting i-support.  In this case, that may be what we want.  But
consider this:

sp {mixed-support2
(state <s> ^operator <o>)
-->
(<o> ^i-supported true)
(<s> ^o-supported true)
}

In this rule, these will both be i-supported again.  Arguably this is a 
"bad"
rule, but that doesn't mean the behavior should be bizarre.

2) A rule that modifies a superstate can create wmes that get two different
supports because the rule and the justification are different.  For 
example:

sp {operator-no-change
(state <s> ^name whatever
           ^superstate <ss>)
-->
(<ss> ^support both)
}

This rule has i-support (there is no operator tested), but if it's firing 
in an
ONC impasse, then the justification will test an operator (which is 
presumably
responsible for the name of the state), and thus the justification will get
o-support.

The reverse can also occur:

sp {operator-no-change2
(state <s> ^operator <o>
           ^superoperator <so>)
-->
(<so> ^support both)}

This rule has o-support, but it is elaborating an operator in the 
superstate,
so the justification will have i-support.  This is a more serious issue,
because if chunking is on, then the wme will have o-support when the 
substate
exists (since o-support "wins"), but i-support when chunks prevent the 
substate
from existing.

In both of these examples, o-support "wins" so the wmes will be
o-supported.

**********************
Addressing the issues:

I won't claim to be able to completely address these issues, but we should 
be
able to make some headway.

When we consider changing support calculations, we need to be cognizant of 
a
few constraints:

* support must be learnable by chunking
* computing support should be cheap
* support of a result should be the same when it is created from a subgoal 
or
from a chunk that replaces that part of the subgoal (note this is violated 
by
operator-no-change2 above).

The existing support calculations almost achieve this, so what I will 
describe
is merely a modification of the existing mechanism, not a wholesale
replacement.

First, consider the deep operator elaborations (which currently get
o-support).  Conceptually, as operator elaborations, they should get
i-support.
Unfortunately, determining whether something is attached to an operator in
general is expensive (it requires computing the transitive closure), and if 
a
state is on an operator, then the implication would be that everything on 
the
state is i-supported.  And this still doesn't address the issue of a rule 
that
elaborates an operator and changes the state.
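
To make the first problem concrete (an invented example), under mode 4 only
the wme directly on the operator gets i-support:

sp {deep-operator-elaboration
(state <s> ^operator <o>)
-->
(<o> ^evaluation <e>)  # directly on the operator: i-support
(<e> ^score 5)         # one level deeper: o-support
}

Conceptually both wmes elaborate the operator, but the rule ends up with
mixed support and the usual warning.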

Instead, I argue for two principles: per-wme support calculations instead 
of
per-rule, and support determination based on the structures described in 
the
rule, not the entire structure of the state (perhaps call this "local 
support
determination").  In other words, if the LHS of a rule tests a selected
operator, wmes on the RHS of a rule that are connected to an operator *via
structure in the rule* should get i-support, and wmes that aren't get
o-support.  For the rule "mixed-support" above, both would get i-support 
(and
there'd be no warning).  For the rule "mixed-support2", the wme on the 
operator
would get i-support and the wme on the state would get o-support.  In 
general
this may still require a transitive closure calculation, but only over a 
rule,
which is typically small, so this should be cheap.

How would this situation be handled?

sp {mixed-support3
(state <s> ^operator <o>)
-->
(<o> ^i-support <x>)
(<s> ^o-support <x>)
(<x> ^support unknown)
}

There are a couple possibilities.  We could give precedence to either 
operator
elaboration or state changes, or we could give both supports (which is
equivalent to defaulting to o-support).  In o-support mode 4, everything in
this rule would get i-support (which definitely doesn't seem right).

Second, let's consider the superstate support issue.  We get multiple
conflicting supports, which means o-support wins.  An alternative would be 
to
use a "waterfall" model of support calculation: rules in superstates take
precedence over rules in substates when determining support.  Thus, the wme
created by the rule "operator-no-change2" above would be i-supported (since 
the
justification would take precedence).  In that case, it seems like what was
intended, and is what would happen with chunking (e.g., if the substate was
never created).

Finally, I think a lot of the confusion about what support a wme has and 
why
can be alleviated with better tools.  Currently you can print the support a 
wme
has and what rule it came from, but mixed support is not shown and the 
substate
in which a rule fired is not known (which is especially important for 
floating
rules).  Additionally, justifications are often no longer around, so they 
can't
be printed (this accounts for more frustration than anything, I think).
------- Comment #1 from Bob Marinier, 2008-08-30 13:08:58 -------
I'm just logging the comments people emailed on this topic (shown in
chronological order):

From: Randy Jones on 8/24/08

I agree that all the past attempts at defining support for rule actions 
have been confusing, which I assume is re-emphasized by the fact that 
we're now apparently talking about O-support-mode *FIVE*.  My own 
opinion is that the entire model for computing O-support has been 
fundamentally wrong from the start, implying that the new proposal will 
just lead to yet another confusing set of rules to choose from.

My opinion is that a Soar programmer *knows* when they write their Soar 
rules what kind of support they would like to be getting out of that 
rule, but they are then currently forced to translate that desire 
through the obtuse set of rules of their choice (mode 1, 2, 3, or 4), 
often also having to change their representations, in order to get their 
code to behave the way they want (and this task is even more important 
and difficult if the program is using subgoals).  And if they neglect to 
do this step, or do it incorrectly, or switch O-support modes, then they 
get bugs...often bugs that are very difficult to diagnose (especially 
when justifications are involved).

I continue to believe that Soar programmers should simply be able to 
specify explicitly in the rule which type of support they want each 
structure to get (ala :i-support or :o-support flags), and then the 
justifications/chunks would simply use the same explicit marking.  The 
"traditional" approach to i-support and o-support, to my mind, adds 
*nothing* in terms of functionality, and only makes Soar programming 
more difficult than it has to be.  The main argument I've heard in the 
past against this approach is the desire to be able to control the 
learning of elaboration rules and application rules.  But I have never 
found those arguments compelling.  In the approach I propose, it's 
*easy* to control the learning of elaboration rules and application 
rules, because you make it explicit in the rules that return the results 
(instead of being forced to predict what your justifications/chunks are 
going to look like and whether they're going to meet the support rules 
you've chosen in the way you want).

From: Bob Marinier on 8/24/08

I think what you're suggesting is interesting, but it's also very
de-stabilizing to the code (both kernel and agent) without knowing if it 
will
really work.  If you/SoarTech/someone were willing to implement this (we 
can
call it o-support-mode -1 or something) then we can do direct comparisons 
on
real code.  What I'm suggesting is a much more incremental change to the
current code (well, probably -- it's hard to say for sure until someone 
dives
into the code), and thus will probably be less disruptive (not that this 
change
is actually on the table.  I was just thinking about it and wanted to write 
my
thoughts down).

From: Doug Pearson on 8/25/08

Another way to come at this would be to observe that a lot of these problem
rules are bad rules and so maybe the right approach is to only allow 
properly
defined PSCM operations in rules.  If a rule doesn't fall into an 
appropriate
PSCM category then it would be invalid.  Then each PSCM operation would be
associated with some well defined support (based on the current theory).

That would sort of be what Randy is proposing because you'd be picking the
support by picking the PSCM operation.  You could even make that explicit 
in
the syntax :operator-elaboration, :operator-application etc so the parser 
could
check that your rule did what you expected.
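
A minimal sketch of what that hypothetical syntax might look like (the
:operator-elaboration flag is Doug's suggestion, not existing Soar syntax):

sp {elaborate-operator*priority
:operator-elaboration
(state <s> ^operator <o>)
-->
(<o> ^priority high)
}

The parser could then reject the rule if its structure didn't match the
declared PSCM operation -- say, if it never tested a selected operator.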

OK this is way too radical to actually be that serious of a suggestion, but 
one
thing I wish I'd learned quicker from John is that he only ever writes 
small,
clean little productions that do one thing (right John!).  And if you write
your Soar code like that in the end you get a lot better results.  And 
looking
at each rule as a single PSCM operation would help ensure that.  It would 
also
push the theory hard, because if limiting a rule to a single PSCM operation
proves too limiting or incorrect in some way, then it would raise the 
question of
why is that hard and what should change to make it easier.

P.S. Bob that's what you get for stirring the o-support pot :)

From: Bob Marinier on 8/25/08

I like the notion of thinking about this in terms of the PSCM.  Arguably, 
this
is exactly what the current approach does, with the caveat that it tries to
detect the PSCM function from the structure of the rule.  As you say, this
often results in ambiguous cases because people can write "bad" rules. 
Furthermore, support can't reliably be determined until runtime because we
don't know what variables will resolve to.  And this seems to be the major
issue -- as it currently stands, support can't be determined until runtime,
whereas these alternative approaches determine support at write-time (for 
lack
of a better word).

The problem here is that there are certain kinds of rules whose function 
cannot
be determined until runtime, and they are not necessarily bad rules.  For
example, learning from instruction, as I understand it, requires variable
attributes.  If you have variable attributes, that variable could resolve 
to an
operator, and thus affect the support of the rule.  I suppose we could 
avoid
this by saying that all variable attributes have an implied "<> operator"
test on them.  In general, though, it's still impossible to determine at
write-time
(from the syntax of a rule) what support a subgoal result should get unless 
we
eliminate elaborations that copy substructure down.  For example, instead 
of
referencing "superoperator", you would have to directly reference
"superstate.operator" so the rule knows it's an operator.  This would 
result in
more verbose rules (but it may still be the right thing to do).
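
To make the variable-attribute case concrete (an invented example):

sp {copy-attribute
(state <s> ^<attr> <x>)  # <attr> could bind to operator at runtime
(<x> ^name <n>)
-->
(<s> ^found <n>)
}

If <attr> binds to operator, the rule is testing the selected operator and
its action would get o-support; otherwise it would get i-support.  The
support cannot be read off the rule's syntax.  Likewise, the
"superoperator" rewrite would replace a condition like (state <s>
^superoperator <so>) with (state <s> ^superstate <ss>) (<ss> ^operator
<so>), so the rule's own structure shows that <so> is an operator.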

At this point you're probably thinking that automatically detecting support
from rule structure isn't worth it, and we should just let the programmer
prescribe the support.  The thing that rubs me the wrong way about that is 
it
implies extra information associated with each rule.  This isn't 
necessarily
bad, but I'm a bit worried about how that might work out neurologically
(perhaps I'm alone in still caring about how Soar maps onto humans).  
Also, in
Doug's PSCM case, there's no way to ensure that the operation claimed by 
the
rule is actually accurate until runtime -- for example, a rule that is 
labeled
as operator elaboration may not even test an operator.  So then we would 
have
runtime failures, which is bad (I suppose we could just warn instead of 
fail).
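
Continuing the hypothetical flag syntax sketched above, this is the kind of
rule that could only be validated at runtime (again, invented syntax and
names):

sp {mislabeled-elaboration
:operator-elaboration
(state <s> ^<attr> <o>)  # <o> may or may not be an operator
-->
(<o> ^priority high)
}

Whether <o> is actually a selected operator depends on what <attr> binds to
when the rule matches, so the declared operation can't be checked at parse
time.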

Original issue reported on code.google.com by voigtjr@gmail.com on 23 Jul 2009 at 5:02

GoogleCodeExporter commented 8 years ago
From: Doug Pearson on 8/26/08

I agree with you, Bob, that I'm not really happy with the idea that the
programmer defines the support, although I can certainly see the practical
benefits of what Randy is suggesting.  But I do think it's reasonable that 
the
programmer defines the PSCM operation (either implicitly -- have Soar 
figure it
out -- or explicitly -- you define it).  Or, put another way, it would be
interesting to see which rules are (a) hard to determine the operation
[variable attributes are an obvious example] and (b) which rules don't 
actually
map to a PSCM operation at all.  If those sets are reasonably small then
explicit definition might not be such a big deal as it would be uncommon.

We've generally taken the stance that any syntactically valid rule should
be accepted and processed, and that gives a lot of flexibility in the
language, but perhaps it's a case of enough rope to hang yourself?  Of
course, if you can
write PSCM operations in a subgoal that produce a non-PSCM chunk then 
there's
really no way around allowing it.  But I'd have to think harder about this 
than
I'm inclined to do right now to see if that's possible.

From: Karen Coulter on 8/26/08

I didn't read through all the details of these emails, but I'm pretty sure 
that
if you check the Soar rule parsing, Randy's desire to specify support on a
per-rule basis is already available.  It used to be that users could add
:i-support or :o-support right before the LHS to force Soar to categorize a
rule at load time.  I'm pretty sure that those flags override any o-support
mode setting.  It was considered a hack, I guess, but it's been there since
Soar 6.  At that time, all matched rules fired in each elaboration cycle --
there was not the difference between elab and apply, so not sure what weird
behavior would happen now that elab and apply happen at different points.
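
For reference, the flag Karen describes goes between the rule name and the
first condition (a minimal sketch, assuming the flags still parse the way
they did in Soar 6):

sp {force-o-support
:o-support
(state <s> ^trigger <t>)
-->
(<s> ^result <t>)
}

Without the flag this rule would get i-support under mode 4 (it tests no
operator); the flag forces its actions to be o-supported regardless of the
mode setting.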

The run-time assignment of support is incredibly complex and fairly
error-prone, especially for objects with lots of structure.  Randy
illustrates that
point
quite well.  Variables and how they bind make it even more confusing.

From: Bob Marinier on 8/26/08

:i-support and :o-support still work, but Soar does not currently support
assigning support to learned chunks the way Randy wants to.
------- Comment #2 from Bob Marinier, 2008-09-29 10:42:53 -------
Updating comments on o-support discussion from ~1 month ago:

From: Randy Jones on 8/30/08

Part of the question, though, is what are the PSCM operations and who 
defines
them?  Were "operator elaborations" a valid PSCM operation in the versions 
of
Soar in which it was impossible to create operator elaborations?  So if we
allow ourselves to say that the PSCM includes "operator elaborations",
"operator applications", "state elaborations", and "state applications", 
then
my proposed approach magically becomes okay, right?  (Although I'd be 
inclined
to divide that group of 4 into a group of two instead: "elaborations" and
"applications".)   Then I suppose you would also throw in "operator 
proposals"
and "operator comparisons", and then require that those be i-supported 
(which,
if I remember correctly, has not traditionally been the requirement in the 
Soar
implementations, although I believe that's been the convention).  But I'll 
also
note that if you did it this way (that is, by making my proposed changes to 
the
PSCM, including forbidding o-supported operator proposals and comparisons), 
I
believe it would become impossible to write or learn rules that are outside 
the
PSCM. 

From: Doug Pearson on 8/30/08

I'm not sure who would define the PSCM operations but if as you said there 
was
a set which was closed over chunking and where support was clearly defined 
for
each operation (and rules that tried to be two operations at once were 
invalid
in some manner) then I think that would be great, whatever syntax/method 
was
used to mark them.  I'm not really arguing for what those operations should 
be,
but I think defining the problem in those terms is possibly a better way to 
go
than focusing on the mechanism itself.

It's a bit like chunking.  If you try to think about it as the operation
(backtracking from a result through rule firings to a set of wmes in the
superstate) your head just spins.  But if you think about which elements of 
the
super state should be referenced at all in the subgoal's problem solving it
becomes tractable.

In the same way if we spent less time thinking about rules and support and 
more
about PSCM operations (including arguing for what the correct set of 
operations
is) it might lead to better system design and ultimately less confusing 
bugs.

From: John Laird on 8/31/08

Along the lines of what Doug said, it seems to be a bit arbitrary to
consider the PSCM role of a new structure in a superstate to be based on
the *last*
inference that is made in a substate. It also seems to be very challenging 
to
get closure over chunking for a scheme based on the last inference. For
example, it is hard to see how a chunk could be learned that has the PSCM 
role
of operator elaboration (because the result would have to be created by an
operator elaboration in the substate). What the current scheme tries to do 
is
capture the role of the complete processing in the substate – what is 
tested
and what is modified. I admit we haven’t gotten it right yet, but that 
doesn’t
mean there isn’t a solution along that path.

From: Randy Jones on 8/31/08

I'm a little confused by your example. Under the current scheme, a 
learned chunk would be an operator elaboration only if the result being 
returned is being attached directly to a super-operator, right? And that 
result can *only* be returned by the last inference in the substate, 
because that's when the chunk is created. So the PSCM role of an 
operator elaboration chunk is *already* based on the last inference that 
is made in the substate. I haven't yet seen an example where it would be 
challenging to get closure over chunking using a scheme based on the 
last inference. So what am I missing? My basic point is that it seems 
like the current approach wants to allow the possibility that the PSCM 
role of a chunk is somehow a "side effect", but in practice nobody does 
it that way (not to mention the evils of side effects from an 
engineering perspective)...at least in my experience, everybody knows 
what kinds of chunks they want to get when they write their code, and 
then they have to figure out a way within the current implementation to 
accomplish that. In my proposed scheme that's easy, because you just 
tell Soar what you want it to be when you return the result (and you 
also get no conflict between the support of the original result and the 
support of the chunk)...in all of the past O-support schemes there have 
been at least certain kinds of chunks that you have to do twisted things 
to try to get what you want out of the system.

From: John Laird on 8/31/08

My example about operator elaboration was to make a point about closure 
over
chunking, not about the last inference. I still am curious how you would
achieve closure for operator elaboration.

I do think there are issues with last inference and here are two examples.
Consider using an operator in the selection space to compare two
superoperators and then generate a preference. If you make the operator
application i-supported in the substate so the preference is i-supported,
that risks having a flickering elaboration (going in and out) if the result
doesn't terminate the substate and if the operator application tests for 
the
absence of the preference (which applications often do). I have used
operators for this in extensions to the selection space and it works in the
current system. Alternatively, consider using a state elaboration in a
substate to do the final step in operator application in a superstate. I
guess you could label the state elaboration as an operator application, but
there is no guarantee in Soar that the result is always part of an operator
application. It could sometimes be part of what has traditionally been 
state
elaboration (depending on what is tested by prior processing in the
substate) where the correct behavior is for the result to retract when the
reasons for its creation are removed from working memory. One of the
strengths of Soar is it can reuse a problem space in different ways
depending on the context.

I guess a lot of this discussion arises because you and I have two
fundamentally different views of the use of Soar. In my view, as we 
progress
more and more toward general human-level behavior, I think we need to get
the programmer out of the way and have most of the creation of rules happen
through chunking (and possibly other learning mechanisms if chunking is
insufficient). Thus, there shouldn't be a programmer in the background that
"knows" what kind of chunks they want. Maybe this means that the direction 
I
want Soar to go in doesn't make it the best tool for knowledge engineering
applications. 

I also think part of this has to do with styles of writing Soar code. I
think approaches where you have to twist your programming means that a
different approach should be investigated (as opposed to finding easier 
ways
to do the twisting). That is my experience when talking to students who 
have
had issues with the current support mechanisms using chunking (modulo the
issues that Bob brought up in his original email).

From: John Laird on 9/1/08

After reading Randy's email again and thinking about this some more, I
believe it does come down to a difference in philosophy about language
design and the purpose of Soar. The key point probably is about substates
and results. One of the distinguishing characteristics of Soar is that
results are a side-effect of processing in a subgoal. I've always thought
that once a subgoal is set up, it just goes off and does its thing. There 
is
no explicit return of results - although sometimes some of the structures
are obviously connected to the superstate (such as operator preferences),
but not always. In terms of processing in the substate, structures that
become results aren't "special" and sometimes a given structure might be a
result, but other times, depending on the structure of the substate, it
might not be a result. This makes it possible to do some pretty interesting
things in substates that would not be possible if results had to be tagged. 

This contrasts with most programming languages that try to avoid global 
data
and side effects. In those cases, the programmer does "control" when a
result is created, and knows what the result structure should be. Thus, I
would think that under that model, controlling the support would make 
sense.

So, if Soar is being used as an advanced programming language, I can see
Randy's point. And in the design of Zoom - if they include
impasses/substates/results, it is probably appropriate to have explicit
results with explicit support. But the goal of Soar is not to be an 
advanced
programming language, and so it is appropriate for some of these design
decisions to be different.

From: Randy Jones on 9/1/08

Thanks for your further thoughts on this, John.  I guess my opinion 
might boil down to this: The ability to create the kinds of "possibly 
interesting side effects" that you describe is *exactly* what causes the 
kinds of mixed-support problems that were identified in the email that 
started this thread.  So I don't believe it will be possible to maintain 
the ability to keep these kinds of side effects but also get rid of the 
potential for these kinds of problems.  Put another way, I believe that 
if you ever do find a way to solve the problems, you will discover that 
you have effectively eliminated the possibility of having the side 
effects.  The reason I fall on the side of the fence I do is that I've 
never seen any compelling examples of the usefulness of these particular 
kinds of side effects, but I've seen lots of cases where the 
support-mixing problems get in the way of a clean design.

Original comment by voigtjr@gmail.com on 23 Jul 2009 at 5:02
